Writing to Kerberized HDFS with Python | Max retries exceeded with URL

Posted by wvmv3b1j on 2022-12-09 in HDFS

I am trying to write to a secured (Kerberized) HDFS from Python using the following library (link).
Authentication part:

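from subprocess import Popen, PIPE

def init_kinit():
    # Obtain a Kerberos ticket from the keytab before creating the client
    kinit_args = ['/usr/bin/kinit', '-kt', '/tmp/xx.keytab',
                  'kerberos_principle']
    subp = Popen(kinit_args, stdin=PIPE, stdout=PIPE, stderr=PIPE)
    subp.wait()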
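
The function above discards the exit status of kinit, so a wrong keytab or principal would go unnoticed until the HDFS call fails. A minimal sketch of a variant that surfaces the failure, assuming the same kinit path and keytab as in the question; `build_kinit_args` and `run_kinit` are hypothetical helper names, not part of any library:

```python
import subprocess


def build_kinit_args(keytab, principal, kinit_path='/usr/bin/kinit'):
    """Return the argv list for a keytab-based kinit call."""
    return [kinit_path, '-kt', keytab, principal]


def run_kinit(keytab, principal, kinit_path='/usr/bin/kinit'):
    """Run kinit and raise if it exits with a non-zero status."""
    result = subprocess.run(
        build_kinit_args(keytab, principal, kinit_path),
        stdin=subprocess.DEVNULL,
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        # stderr from kinit usually names the failing keytab or principal
        raise RuntimeError(f'kinit failed: {result.stderr.strip()}')
```

Calling `run_kinit('/tmp/xx.keytab', 'kerberos_principle')` before creating the client would then fail early if no ticket could be obtained.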

Client/upload part:

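from hdfs.ext.kerberos import KerberosClient

# `session` is a requests.Session created elsewhere (not shown in the question)
client = KerberosClient(url='http://xx.com:port', session=session,
                        mutual_auth="REQUIRED")
# upload(hdfs_path, local_path)
client.upload(
    '/hdfspath/file.parquet',
    '/localpath/file.parquet')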

Here is the error:

requests.exceptions.ConnectionError: HTTPConnectionPool(host='xxx', port=xxx): 
Max retries exceeded with url: /webhdfs/v1/user/xxx/xxx.parquet?op=LISTSTATUS (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f499c104d30>: Failed to establish a new connection: [Errno 111] Connection refused'))
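
`[Errno 111] Connection refused` is raised when the TCP connection itself is rejected, that is, before any HTTP request or Kerberos negotiation takes place. So a reasonable first check is whether anything is listening on the host and port used in the client URL. A minimal sketch of such a check using only the standard library; `port_accepts_connections` is a hypothetical helper name:

```python
import socket


def port_accepts_connections(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution errors
        return False
```

If this returns False for the port in the client URL, the problem is likely the address or port (or a firewall) rather than the Kerberos ticket.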

I tried the suggestions in the following link and made sure that dfs.webhdfs.enabled is turned on.

Answer #1, by ffdz8vbo

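It turned out that our cluster uses an HTTPS-only policy. After changing the port and the protocol in the client URL, it works fine.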

<property>
<name>dfs.http.policy</name>
<value>HTTPS_ONLY</value>
</property>

Code:

client = KerberosClient(url='https://xx.com:port', session=session,
                            mutual_auth="REQUIRED")
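
To avoid hard-coding the wrong scheme, the client URL could be derived from the cluster configuration itself. The following is only a sketch under assumptions: it expects the content of an hdfs-site.xml with a single `<configuration>` root, reads the standard `dfs.http.policy`, `dfs.namenode.http-address` and `dfs.namenode.https-address` properties, and does not handle HA setups, where the address keys carry nameservice and NameNode ID suffixes. `read_site_properties` and `webhdfs_base_url` are hypothetical helper names:

```python
import xml.etree.ElementTree as ET


def read_site_properties(xml_text):
    """Parse Hadoop *-site.xml content into a {name: value} dict."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter('property'):
        name = prop.findtext('name')
        if name is not None:
            props[name.strip()] = (prop.findtext('value') or '').strip()
    return props


def webhdfs_base_url(props):
    """Pick the scheme and address matching dfs.http.policy.

    Returns None when the matching address property is not set,
    rather than guessing a default port.
    """
    # Hadoop falls back to HTTP_ONLY when the policy is not configured
    policy = props.get('dfs.http.policy', 'HTTP_ONLY')
    if policy == 'HTTPS_ONLY':
        scheme, key = 'https', 'dfs.namenode.https-address'
    else:
        # HTTP_ONLY and HTTP_AND_HTTPS both serve plain HTTP
        scheme, key = 'http', 'dfs.namenode.http-address'
    address = props.get(key)
    return f'{scheme}://{address}' if address else None
```

The returned string could then be passed as the `url` argument of `KerberosClient`.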
