Connection problem when reading a Hive table in HDInsight using Python

qyyhg6bp  posted on 2021-06-24 in Hive

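Hi all. I want to connect to the Hive database in HDInsight using Python. I have followed several blogs and a few Stack Overflow posts, but had no luck. Below are my attempts using the pyhive and jaydebeapi libraries.
Using jaydebeapi
I have added the hive-jdbc-1.2.1, httpclient-4.4, and httpcore-4.4.4 jars to the current working directory, and I have installed thrift with pip install thrift. The code snippet is: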

import jaydebeapi

conn = jaydebeapi.connect("org.apache.hive.jdbc.HiveDriver",
       "jdbc:hive2://shaktiman.database.windows.net:443/;ssl=true;transportMode=http;httpPath=/hive2",
       ['admin', 'Abcdeertyoiu@1234'],
       "hive-jdbc-1.2.1.jar")

cursor = conn.cursor()
cursor.execute("select * from default.hivesampletable limit 50")
print(cursor.description)  # prints the result set's schema
results = cursor.fetchall()

But I get the following error:

Traceback (most recent call last):
  File "ClassLoader.java", line 357, in java.lang.ClassLoader.loadClass
  File "Launcher.java", line 349, in sun.misc.Launcher$AppClassLoader.loadClass
  File "ClassLoader.java", line 424, in java.lang.ClassLoader.loadClass
  File "URLClassLoader.java", line 382, in java.net.URLClassLoader.findClass
java.lang.ClassNotFoundException: java.lang.ClassNotFoundException: org.apache.hive.service.cli.thrift.TCLIService$Iface

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "org.jpype.JPypeContext.java", line 330, in org.jpype.JPypeContext.callMethod
  File "Method.java", line 498, in java.lang.reflect.Method.invoke
  File "DelegatingMethodAccessorImpl.java", line 43, in sun.reflect.DelegatingMethodAccessorImpl.invoke
  File "NativeMethodAccessorImpl.java", line 62, in sun.reflect.NativeMethodAccessorImpl.invoke
  File "NativeMethodAccessorImpl.java", line -2, in sun.reflect.NativeMethodAccessorImpl.invoke0
  File "DriverManager.java", line 247, in java.sql.DriverManager.getConnection
  File "DriverManager.java", line 664, in java.sql.DriverManager.getConnection
  File "HiveDriver.java", line 105, in org.apache.hive.jdbc.HiveDriver.connect
Exception: Java Exception

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "D:/Learning Dir/PycharmProjects/Python/HdInsight-Hive/test.py", line 39, in <module>
    "hive-jdbc-1.2.1.jar")
  File "D:\Learning Dir\PycharmProjects\Python\venv\lib\site-packages\jaydebeapi\__init__.py", line 412, in connect
    jconn = _jdbc_connect(jclassname, url, driver_args, jars, libs)
  File "D:\Learning Dir\PycharmProjects\Python\venv\lib\site-packages\jaydebeapi\__init__.py", line 230, in _jdbc_connect_jpype
    return jpype.java.sql.DriverManager.getConnection(url, *dargs)
java.lang.NoClassDefFoundError: java.lang.NoClassDefFoundError: org/apache/hive/service/cli/thrift/TCLIService$Iface

I am not sure what the problem is.
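The final NoClassDefFoundError in this traceback means the JVM could not find org.apache.hive.service.cli.thrift.TCLIService$Iface on its classpath, and only hive-jdbc-1.2.1.jar is passed to jaydebeapi.connect. Because jar files are zip archives, a small diagnostic like the sketch below can report which jars in a directory, if any, contain that class. The helper name jars_containing is hypothetical and not part of jaydebeapi or Hive.

```python
import zipfile
from pathlib import Path


def jars_containing(class_name, directory="."):
    """Return the names of the .jar files in `directory` that contain `class_name`.

    Hypothetical diagnostic helper; it only inspects the archives on disk.
    """
    # A class a.b.C$D is stored inside a jar as the entry a/b/C$D.class.
    entry = class_name.replace(".", "/") + ".class"
    matches = []
    for jar in sorted(Path(directory).glob("*.jar")):
        with zipfile.ZipFile(jar) as archive:
            if entry in archive.namelist():
                matches.append(jar.name)
    return matches
```

Calling jars_containing("org.apache.hive.service.cli.thrift.TCLIService$Iface") in the working directory and getting an empty list would suggest that none of the three jars provides the class. One assumption worth verifying with the same helper is that a hive-service jar or the standalone hive-jdbc jar for the same Hive version supplies it. The jars argument of jaydebeapi.connect also accepts a list of jar paths, so every required jar can be passed rather than a single file.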
I have also tried pyhive, as shown below:

from pyhive import hive
conn = hive.connect('hn0-shaktiman-po.ttl4q3khoz5uvb1d4jopix3kbg.cx.internal.cloudapp.net', port=10000,auth='NOSASL')
cursor = conn.cursor()
cursor.execute('SHOW DATABASES')
cursor.fetchall()

But again I get a strange error:

"D:\Learning Dir\PycharmProjects\Python\venv\Scripts\python.exe" "D:/Learning Dir/PycharmProjects/Python/HdInsight-Hive/test2.py"
failed to resolve sockaddr for hn0-shaktiman-po.ttl4q3khoz5uvb1d4jopix3kbg.cx.internal.cloudapp.net:10000
Traceback (most recent call last):
  File "D:\Learning Dir\PycharmProjects\Python\venv\lib\site-packages\thrift\transport\TSocket.py", line 99, in open
    addrs = self._resolveAddr()
  File "D:\Learning Dir\PycharmProjects\Python\venv\lib\site-packages\thrift\transport\TSocket.py", line 42, in _resolveAddr
    socket.AI_PASSIVE | socket.AI_ADDRCONFIG)
  File "D:\Installation\Python\Python38-32\lib\socket.py", line 752, in getaddrinfo
    for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno 11001] getaddrinfo failed
Traceback (most recent call last):
  File "D:\Learning Dir\PycharmProjects\Python\venv\lib\site-packages\thrift\transport\TSocket.py", line 99, in open
    addrs = self._resolveAddr()
  File "D:\Learning Dir\PycharmProjects\Python\venv\lib\site-packages\thrift\transport\TSocket.py", line 42, in _resolveAddr
    socket.AI_PASSIVE | socket.AI_ADDRCONFIG)
  File "D:\Installation\Python\Python38-32\lib\socket.py", line 752, in getaddrinfo
    for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno 11001] getaddrinfo failed

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "D:/Learning Dir/PycharmProjects/Python/HdInsight-Hive/test2.py", line 2, in <module>
    conn = hive.connect('hn0-shaktiman-po.ttl4q3khoz5uvb1d4jopix3kbg.cx.internal.cloudapp.net', port=10000,auth='NOSASL')
  File "D:\Learning Dir\PycharmProjects\Python\venv\lib\site-packages\pyhive\hive.py", line 94, in connect
    return Connection(*args,**kwargs)
  File "D:\Learning Dir\PycharmProjects\Python\venv\lib\site-packages\pyhive\hive.py", line 192, in __init__
    self._transport.open()
  File "D:\Learning Dir\PycharmProjects\Python\venv\lib\site-packages\thrift\transport\TTransport.py", line 155, in open
    return self.__trans.open()
  File "D:\Learning Dir\PycharmProjects\Python\venv\lib\site-packages\thrift\transport\TSocket.py", line 103, in open
    raise TTransportException(type=TTransportException.NOT_OPEN, message=msg, inner=gai)
thrift.transport.TTransport.TTransportException: failed to resolve sockaddr for hn0-shaktiman-po.ttl4q3khoz5uvb1d4jopix3kbg.cx.internal.cloudapp.net:10000

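In addition, a few blogs suggest changing the HiveServer2 transport mode from "http" to "binary". I tried that, but it did not help either.
If anyone can suggest some working code or a solution, I would be very grateful. Thanks in advance.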

Answer #1 (wlp8pajw)

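This looks like a configuration/network issue to me.
You can verify connectivity from the host (the machine you submit the application from) to the HDI cluster. If you submit from a head node inside HDI, you can skip this step. Try using the IP address here instead of hn0-shaktiman-po.ttl4q3khoz5uvb1d4jopix3kbg.cx.internal.cloudapp.net. You can get the IP by running curl ifconfig.me inside the HDI cluster.
Also use telnet to check that the port is not already in use anywhere. Try using 10001.
Try switching the value of hive.server2.transport.mode between http and binary in Ambari.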
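The connectivity checks above can also be run from Python on the machine where the script fails. This is a minimal sketch using only the standard library. The function name check_endpoint is hypothetical. It repeats the getaddrinfo lookup that failed in the pyhive traceback and then attempts a plain TCP connection, roughly what telnet host port does.

```python
import socket


def check_endpoint(host, port, timeout=5.0):
    """Return (resolved, reachable, detail) for a TCP endpoint.

    Hypothetical helper: it checks name resolution and TCP reachability only,
    not whether HiveServer2 is actually answering on that port.
    """
    try:
        # The same kind of lookup that raised "getaddrinfo failed" in the traceback.
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror as exc:
        return False, False, f"DNS lookup failed: {exc}"

    address = infos[0][4][0]
    try:
        # Open a TCP connection and close it again immediately.
        with socket.create_connection((host, port), timeout=timeout):
            return True, True, f"{address}:{port} accepts connections"
    except OSError as exc:
        return True, False, f"resolved to {address}, but connect failed: {exc}"
```

For example, check_endpoint("hn0-shaktiman-po.ttl4q3khoz5uvb1d4jopix3kbg.cx.internal.cloudapp.net", 10000) returning resolved=False would confirm that the internal hostname cannot be resolved from that machine. In that case changing the transport mode would not help until the script can reach the cluster by name or by IP.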
