I am trying to write to a secure (Kerberos-enabled) HDFS from Python using the following library (link).
Authentication part:
from subprocess import Popen, PIPE

def init_kinit():
    # Obtain a Kerberos ticket from the keytab via the kinit CLI
    kinit_args = ['/usr/bin/kinit', '-kt', '/tmp/xx.keytab',
                  'kerberos_principle']
    subp = Popen(kinit_args, stdin=PIPE, stdout=PIPE, stderr=PIPE)
    subp.wait()
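The function above ignores kinit's exit status, so a failed login would go unnoticed until the first HDFS call. A minimal sketch of a stricter variant follows, assuming Python 3.7+; the helper name run_checked and the keytab and principal parameters are hypothetical and not part of the original code.

```python
import subprocess


def run_checked(cmd):
    """Run a command and raise RuntimeError if it exits with a non-zero status."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        raise RuntimeError(
            f'{cmd[0]} failed ({result.returncode}): {result.stderr.strip()}')
    return result


def init_kinit(keytab, principal, kinit_bin='/usr/bin/kinit'):
    """Obtain a Kerberos ticket from a keytab; fail loudly if kinit fails."""
    return run_checked([kinit_bin, '-kt', keytab, principal])


# Hypothetical usage with the placeholder values from the question:
# init_kinit('/tmp/xx.keytab', 'kerberos_principle')
```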
Client/upload part:
from hdfs.ext.kerberos import KerberosClient

# session is created elsewhere (not shown); presumably a requests.Session
client = KerberosClient(url='http://xx.com:port', session=session,
                        mutual_auth="REQUIRED")
client.upload(
    '/hdfspath/file.parquet',
    '/localpath/file.parquet')
Here is the error:
requests.exceptions.ConnectionError: HTTPConnectionPool(host='xxx', port=xxx):
Max retries exceeded with url: /webhdfs/v1/user/xxx/xxx.parquet?op=LISTSTATUS (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f499c104d30>: Failed to establish a new connection: [Errno 111] Connection refused'))
I tried the suggestions in the following link and made sure that dfs.webhdfs.enabled is enabled.
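[Errno 111] Connection refused means that nothing accepted a TCP connection on the given host and port, which happens before any Kerberos or WebHDFS logic runs. A quick way to narrow this down is to probe the port directly. The sketch below uses only the standard library; the function name port_is_open is hypothetical.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and name resolution failures.
        return False
```

A True result only shows that something is listening; it does not tell whether that port speaks HTTP or HTTPS.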
1 Answer
It turned out that our cluster uses the HTTPS policy. After changing the port and the protocol in the client URL, it works fine.
Code:
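A minimal sketch of that change, assuming the hdfs and requests packages are installed. The helper names webhdfs_url and make_client and the ca_bundle parameter are hypothetical. The port must be the NameNode's HTTPS port as configured on the cluster (dfs.namenode.https-address); 9871 is used in the usage comment only because it is the Hadoop 3 default, and it may differ on a given cluster.

```python
def webhdfs_url(host, port, use_https=True):
    """Build the base URL for the WebHDFS endpoint."""
    scheme = 'https' if use_https else 'http'
    return f'{scheme}://{host}:{port}'


def make_client(host, port, ca_bundle=None):
    """Create a Kerberos-authenticated client that talks to WebHDFS over HTTPS."""
    # Imported lazily so the URL helper above works without these packages.
    import requests
    from hdfs.ext.kerberos import KerberosClient

    session = requests.Session()
    if ca_bundle is not None:
        # Path to the CA certificate that signed the cluster's TLS certificate.
        session.verify = ca_bundle
    return KerberosClient(webhdfs_url(host, port), session=session,
                          mutual_auth='REQUIRED')


# Hypothetical usage; host, port and paths are placeholders:
# client = make_client('xx.com', 9871)
# client.upload('/hdfspath/file.parquet', '/localpath/file.parquet')
```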