Using httr (in R) to PUT an empty body to WebHDFS

Posted by ua4mk5z4 on 2021-05-27 in Hadoop

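When I try to PUT to WebHDFS to create a file and write to it (following https://hadoop.apache.org/docs/r1.0.4/webhdfs.html#create), I run into a problem with httr.
Using RCurl or RWebHDFS is not an option, because the target Hadoop cluster is secured.
Here is the code I am trying to use: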

library(httr)
r <- PUT("https://hadoopmgr1p.global.ad:14000/webhdfs/v1/user/testuser/temp/loadfile_testuser_2019-11-28_15_28_41411?op=CREATE&permission=755&user.name=testuser", 
          authenticate(":", "", type = "gssnegotiate"),
          verbose())
`testuser` is a superuser with r/w permissions. I get the following error:

<- HTTP/1.1 400 Data upload requests must have content-type set to 'application/octet-stream'
<- Date: Fri, 29 Nov 2019 15:42:30 GMT
<- Date: Fri, 29 Nov 2019 15:42:30 GMT
<- Pragma: no-cache
<- X-Content-Type-Options: nosniff
<- X-XSS-Protection: 1; mode=block
<- Content-Length: 0

The error is self-explanatory, so I tried setting a content type:

r <- PUT("https://hadoopmgr1p.global.ad:14000/webhdfs/v1/user/testuser/temp/loadfile_testuser_2019-11-28_15_28_41411?op=CREATE&permission=755&user.name=testuser",
authenticate(":", "", type = "gssnegotiate"),
content_type("application/octet-stream"),
verbose())

The request now succeeds, but it is not a real success:

<- Date: Fri, 29 Nov 2019 16:04:52 GMT
<- Cache-Control: no-cache
<- Expires: Fri, 29 Nov 2019 16:04:52 GMT
<- Date: Fri, 29 Nov 2019 16:04:52 GMT
<- Pragma: no-cache
<- Content-Type: application/json;charset=utf-8
<- X-Content-Type-Options: nosniff
<- X-XSS-Protection: 1; mode=block
<- Content-Length: 0

No file is uploaded. When I upload a file with the first request, I get a different error:

<- HTTP/1.1 307 Temporary Redirect
<- Date: Fri, 29 Nov 2019 16:07:24 GMT
<- Cache-Control: no-cache
<- Expires: Fri, 29 Nov 2019 16:07:24 GMT
<- Date: Fri, 29 Nov 2019 16:07:24 GMT
<- Pragma: no-cache
<- Content-Type: application/json;charset=utf-8
<- X-Content-Type-Options: nosniff
<- X-XSS-Protection: 1; mode=block
Error in curl::curl_fetch_memory(url, handle = handle) :
necessary data rewind wasn't possible

The code in question:

library(httr)
temp_file <- httr::upload_file(lfs_temp_file, type = "text/plain")
r <- PUT("https://hadoopmgr1p.global.ad:14000/webhdfs/v1/user/testuser/temp/loadfile_testuser_2019-11-28_15_28_41411?op=CREATE&permission=755&user.name=testuser",
authenticate(":", "", type = "gssnegotiate"),
body=temp_file,
content_type("application/octet-stream"),
verbose())

Running the same request with curl causes no problem: `curl -i -k -X PUT --negotiate -u : "https://hadoopmgr1p.global.ad:14000/webhdfs/v1/user/testuser/temp/loadfile_testuser_2019-11-28_15_28_4141?op=CREATE&permission=755&user.name=testuser"` The result is:

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
HTTP/1.1 307 Temporary Redirect
Date: Thu, 28 Nov 2019 23:27:16 GMT
Cache-Control: no-cache
Expires: Thu, 28 Nov 2019 23:27:16 GMT
Date: Thu, 28 Nov 2019 23:27:16 GMT
Pragma: no-cache
Content-Type: application/json;charset=utf-8
X-Content-Type-Options: nosniff
X-XSS-Protection: 1; mode=block
WWW-Authenticate: Negotiate /
Set-Cookie: hadoop.auth=""; Path=/; Secure; HttpOnly
Location: https://hadoopmgr1p.global.ad:14000/webhdfs/v1/user/testuser/temp/loadfile_testuser_2019-11-28_15_28_4141?op=CREATE&data=true&user.name=testuser&permission=755
Content-Length: 0

Following the `Location` header lets us create the file successfully.
What am I doing wrong?
Thanks

#1 yptwkmov

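httr tries to follow the redirect and fails. To work around this, tell httr not to follow the Location header by passing `config(followlocation = 0L)`.
The PUT call then looks like this: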

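library(httr)
# The URL must be one unbroken string: a line break inside the quotes would become part of the URL.
# body = NULL sends an empty body; followlocation = 0L stops libcurl from following the 307 redirect.
r <- PUT("https://hadoopmgr1p.global.ad:14000/webhdfs/v1/user/testuser/temp/loadfile_testuser_2019-11-28_15_28_41411?op=CREATE&permission=755&user.name=testuser",
          authenticate(":", "", type = "gssnegotiate"),
          body = NULL,
          config(followlocation = 0L),
          verbose())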

This returns a valid response that contains the Location header.
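
The two-step flow suggested by this answer and by the curl output in the question could be sketched roughly as follows. This is an untested sketch, not a verified solution: the helper names `webhdfs_create_location` and `webhdfs_upload` are hypothetical. It assumes the gateway answers the first PUT with a 307 and a `Location` header, as in the curl output. It also assumes the second request needs the `application/octet-stream` content type named in the 400 error.

```r
# Requires the httr package; calls are namespaced so nothing is attached globally.

# Step 1 (hypothetical helper): send an empty PUT without following the
# redirect, and return the URL from the Location header of the 307 response.
webhdfs_create_location <- function(url) {
  resp <- httr::PUT(url,
                    httr::authenticate(":", "", type = "gssnegotiate"),
                    body = NULL,
                    httr::config(followlocation = 0L))
  location <- httr::headers(resp)$location
  if (httr::status_code(resp) != 307 || is.null(location)) {
    stop("Expected a 307 redirect with a Location header, got HTTP ",
         httr::status_code(resp))
  }
  location
}

# Step 2 (hypothetical helper): PUT the file contents to the redirect target.
# Assumption: the server wants application/octet-stream, per the 400 error
# quoted in the question.
webhdfs_upload <- function(location, path) {
  resp <- httr::PUT(location,
                    httr::authenticate(":", "", type = "gssnegotiate"),
                    body = httr::upload_file(path, type = "application/octet-stream"),
                    httr::content_type("application/octet-stream"),
                    httr::config(followlocation = 0L))
  httr::stop_for_status(resp)  # raise an R error on any 4xx/5xx status
  invisible(resp)
}
```

Usage would be along the lines of `location <- webhdfs_create_location(url)` followed by `webhdfs_upload(location, lfs_temp_file)`, where `url` is the `op=CREATE` URL and `lfs_temp_file` the local file path from the question. Splitting the request this way keeps the file body out of the redirected request, which is where the "necessary data rewind wasn't possible" error appeared.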

#2 oxiaedzo

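Nice work including the curl output. I believe this is the answer.
Your curl command uses PUT, while your httr command uses POST. Try httr's PUT instead: https://www.rdocumentation.org/packages/httr/versions/1.4.1/topics/put
Tip for future reference: POST is generally not used when you want to specify the exact location. That is what PUT is for.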
