impyla hangs when connecting to HiveServer2

2024/9/22 6:53:19

I'm writing some ETL flows in Python that, for part of the process, use Hive. Cloudera's impyla client, according to the documentation, works with both Impala and Hive.

In my experience, the client worked for Impala, but hung when I tried to connect to Hive:

from impala.dbapi import connect
conn = connect(host='host_running_hs2_service', port=10000, user='awoolford', password='Bzzzzz')
cursor = conn.cursor()  # <- hangs here
cursor.execute('show tables')
results = cursor.fetchall()
print results

If I step into the code, it hangs while trying to open a session (line 873 of hiveserver2.py).

At first, I suspected that a firewall port might be blocking the connection, and so I tried to connect using Java. To my surprise, this worked:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class Main {
    private static String driverName = "org.apache.hive.jdbc.HiveDriver";

    public static void main(String[] args) throws SQLException {
        // Load the Hive JDBC driver; exit if it is not on the classpath.
        try {
            Class.forName(driverName);
        } catch (ClassNotFoundException e) {
            e.printStackTrace();
            System.exit(1);
        }
        Connection connection = DriverManager.getConnection("jdbc:hive2://host_running_hs2_service:10000/default", "awoolford", "Bzzzzz");
        Statement statement = connection.createStatement();
        ResultSet resultSet = statement.executeQuery("SHOW TABLES");
        while (resultSet.next()) {
            System.out.println(resultSet.getString(1));
        }
    }
}
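
A blocked port can also be ruled out from Python itself with a plain TCP check. The following is a minimal sketch using only the standard library; the helper name port_is_reachable and the 5-second default timeout are illustrative choices, not part of impyla. A successful check only shows that the port accepts TCP connections. It says nothing about the authentication handshake that follows.

```python
import socket


def port_is_reachable(host, port, timeout=5.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout.

    Hypothetical helper for illustration; it does not speak the HiveServer2
    protocol, it only checks that something is listening on the port.
    """
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except (socket.error, socket.timeout):
        # Refused, unreachable, unresolved host name, or timed out.
        return False
    sock.close()
    return True
```

For example, port_is_reachable('host_running_hs2_service', 10000) returning True while conn.cursor() still hangs would point away from the network and toward the session handshake.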

Since Hive and Python are such commonly used technologies, I'm curious to know if anyone else has experienced this problem and, if so, what did you do to fix it?

Versions:

  • Hive 1.1.0-cdh5.5.1
  • Python 2.7.11 | Anaconda 2.3.0
  • Redhat 6.7

Answer

Start HiveServer2 with SASL disabled:

/path/to/bin/hive --service hiveserver2 --hiveconf hive.server2.authentication=NOSASL

Then connect with the matching auth_mechanism:

from impala.dbapi import connect
conn = connect(host='host_running_hs2_service', port=10000, user='awoolford', password='Bzzzzz', auth_mechanism='NOSASL')
cursor = conn.cursor()
cursor.execute('show tables')
results = cursor.fetchall()
print results
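
The key point is that the client's auth_mechanism has to agree with the server's hive.server2.authentication setting. The sketch below builds the keyword arguments for connect() from the server-side value. The mapping table reflects my reading of the impyla documentation (NONE on the server corresponds to PLAIN on the client, KERBEROS to GSSAPI) and should be checked against the installed version. The names HS2_AUTH_TO_IMPYLA and impyla_connect_kwargs are hypothetical, introduced only for this example.

```python
# Assumed mapping from the server's hive.server2.authentication value
# to the auth_mechanism argument of impala.dbapi.connect().
HS2_AUTH_TO_IMPYLA = {
    'NOSASL': 'NOSASL',
    'NONE': 'PLAIN',
    'LDAP': 'LDAP',
    'KERBEROS': 'GSSAPI',
}


def impyla_connect_kwargs(host, hs2_authentication, user=None, password=None, port=10000):
    """Build keyword arguments for connect() that match the server's auth mode."""
    key = hs2_authentication.strip().upper()
    if key not in HS2_AUTH_TO_IMPYLA:
        raise ValueError('no known auth_mechanism for %r' % hs2_authentication)
    kwargs = {
        'host': host,
        'port': port,
        'auth_mechanism': HS2_AUTH_TO_IMPYLA[key],
    }
    # Only pass credentials that were actually supplied.
    if user is not None:
        kwargs['user'] = user
    if password is not None:
        kwargs['password'] = password
    return kwargs
```

With the server started as above, connect(**impyla_connect_kwargs('host_running_hs2_service', 'NOSASL', 'awoolford', 'Bzzzzz')) is equivalent to the connect() call in the answer.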