Connecting Python to Hive

Problem description

I want to connect to Hive from Python. For testing purposes, I created the script below in PyCharm and tried to connect to Hive:

from pyhive import  hive
import sys
import pandas as pd
import ssl
import thrift_sasl
con=hive.Connection(host="ip_addrs",port=10000,username="hiveuser_test", auth='NOSASL')
cursor = con.cursor()
print(cursor.fetchall())
print(con)

Running the code produces the following error:

C:\Users\username_dim\AppData\Local\Programs\Python\Python36-32\python.exe C:/Users/username_dim/PycharmProjects/untitled1/Test
Traceback (most recent call last):
  File "C:/Users/username_dim/PycharmProjects/untitled1/Test", line 11, in <module>
    con=hive.Connection(host="ip_addres",port=10000,username="hiveuser_test", auth='NOSASL')
  File "C:\Users\username_dim\AppData\Local\Programs\Python\Python36-32\lib\site-packages\pyhive\hive.py", line 198, in __init__
    response = self._client.OpenSession(open_session_req)
  File "C:\Users\username_dim\AppData\Local\Programs\Python\Python36-32\lib\site-packages\TCLIService\TCLIService.py", line 187, in OpenSession
    return self.recv_OpenSession()
  File "C:\Users\username_dim\AppData\Local\Programs\Python\Python36-32\lib\site-packages\TCLIService\TCLIService.py", line 199, in recv_OpenSession
    (fname, mtype, rseqid) = iprot.readMessageBegin()
  File "C:\Users\username_dim\AppData\Local\Programs\Python\Python36-32\lib\site-packages\thrift\protocol\TBinaryProtocol.py", line 148, in readMessageBegin
    name = self.trans.readAll(sz)
  File "C:\Users\username_dim\AppData\Local\Programs\Python\Python36-32\lib\site-packages\thrift\transport\TTransport.py", line 60, in readAll
    chunk = self.read(sz - have)
  File "C:\Users\username_dim\AppData\Local\Programs\Python\Python36-32\lib\site-packages\thrift\transport\TTransport.py", line 162, in read
    self.__rbuf = BufferIO(self.__trans.read(max(sz, self.__rbuf_size)))
  File "C:\Users\username_dim\AppData\Local\Programs\Python\Python36-32\lib\site-packages\thrift\transport\TSocket.py", line 132, in read
    message='TSocket read 0 bytes')
thrift.transport.TTransport.TTransportException: TSocket read 0 bytes
Process finished with exit code 1
core-site.xml

I used the following to configure HiveServer2:

<property>
  <name>hadoop.proxyuser.sqoop2.hosts</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.sqoop2.groups</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.hiveuser_test.hosts</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.hiveuser_test.groups</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.server.hosts</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.server.groups</name>
  <value>*</value>
</property>


Can you help me fix this error?

Tags: python, hadoop, hive

Solution


The call to cursor.execute("SELECT ...") is missing, so the cursor has no executed statement to fetch results from. That is why your Thrift connection cannot read any data and fails with thrift.transport.TTransport.TTransportException: TSocket read 0 bytes.

The modified code is below.

from pyhive import hive

# Open a connection to HiveServer2 (replace host and username with your own)
con = hive.Connection(host="ip_addrs", port=10000, username="hiveuser_test", auth='NOSASL')
cursor = con.cursor()

# Execute a query first; fetchall() only returns rows from an executed statement
select_stmt = 'SELECT * FROM t1 LIMIT 10'
cursor.execute(select_stmt)
print(cursor.fetchall())
print(con)

Note: Replace this line with the query you need: select_stmt = 'SELECT * FROM t1 LIMIT 10'
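
The execute-then-fetch order from the answer can be wrapped in a small helper. What follows is only a minimal sketch: the helper name fetch_rows is hypothetical, and it assumes nothing beyond a DB-API 2.0 style connection, which PyHive's hive.Connection provides. An in-memory sqlite3 database stands in for Hive so the example runs without a HiveServer2 instance.

```python
import sqlite3
from contextlib import closing


def fetch_rows(connection, query):
    """Run `query` on a DB-API 2.0 connection and return all rows.

    Hypothetical helper: it only relies on cursor(), execute(),
    fetchall() and close(), so it is not specific to Hive.
    """
    with closing(connection.cursor()) as cursor:
        cursor.execute(query)  # must run before any fetch call
        return cursor.fetchall()


# Stand-in for a Hive connection (assumption: used here only so the
# sketch runs locally; the table and rows are made-up demo data).
demo_con = sqlite3.connect(":memory:")
demo_con.execute("CREATE TABLE t1 (id INTEGER, name TEXT)")
demo_con.executemany("INSERT INTO t1 VALUES (?, ?)", [(1, "a"), (2, "b")])

rows = fetch_rows(demo_con, "SELECT * FROM t1 ORDER BY id LIMIT 10")
print(rows)
```

With PyHive, the con object from the answer would be passed in place of demo_con. Note also that the traceback in the question is raised inside hive.Connection(...) itself, during OpenSession, before any cursor call is reached. If the error persists after adding execute, it may be worth checking that the auth argument matches the server's hive.server2.authentication setting; this is a guess based on the traceback, not something the thread confirms.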

