Creating Impala Tables from Pandas Dataframes

Wes McKinney’s Ibis, a Pythonic interface to Impala, can create Impala tables directly from pandas DataFrames.

import pandas as pd
import ibis

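# webhdfs_host, webhdfs_port, impala_host and impala_port are placeholders
# for your cluster's connection settings.
hdfs = ibis.hdfs_connect(host=webhdfs_host, port=webhdfs_port)
client = ibis.impala.connect(host=impala_host, port=impala_port,
                             hdfs_client=hdfs)
db = client.database('ibis_testing')
data = pd.DataFrame({'foo': [1, 2, 3, 4], 'bar': ['a', 'b', 'c', 'd']})
# Upload the DataFrame and create the Impala table from it.
db.create_table('pandas_table', data)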

This functionality, added in Ibis 0.6.0, is much easier than manually moving data to HDFS and loading it into Impala.
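
For comparison, here is a rough sketch of what the manual route involves: dump the DataFrame to a delimited file, copy that file into an HDFS directory, and issue a CREATE EXTERNAL TABLE statement pointing at it. The helper name manual_load_plan, the IMPALA_TYPES mapping and the /tmp/pandas_table path are made up for illustration and are not part of Ibis. The type mapping is an assumption that only covers a few common dtypes.

```python
import io

import pandas as pd

# Assumed, non-exhaustive mapping from pandas dtypes to Impala column types.
# Anything not listed here falls back to STRING.
IMPALA_TYPES = {
    'int64': 'BIGINT',
    'float64': 'DOUBLE',
    'bool': 'BOOLEAN',
    'object': 'STRING',
}


def manual_load_plan(df, table_name, hdfs_dir):
    """Return (csv_text, ddl) for loading df into Impala by hand.

    Hypothetical helper: it only builds the file contents and the SQL text.
    It does not talk to HDFS or Impala.
    """
    # Step 1: serialize the rows without header or index, as a text table expects.
    buf = io.StringIO()
    df.to_csv(buf, index=False, header=False)

    # Step 2: derive a column list from the DataFrame's dtypes.
    columns = ', '.join(
        '`{}` {}'.format(name, IMPALA_TYPES.get(str(dtype), 'STRING'))
        for name, dtype in df.dtypes.items()
    )

    # Step 3: DDL that points Impala at the HDFS directory holding the file.
    ddl = (
        "CREATE EXTERNAL TABLE {} ({}) "
        "ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' "
        "STORED AS TEXTFILE LOCATION '{}'"
    ).format(table_name, columns, hdfs_dir)
    return buf.getvalue(), ddl


data = pd.DataFrame({'foo': [1, 2, 3, 4], 'bar': ['a', 'b', 'c', 'd']})
csv_text, ddl = manual_load_plan(data, 'pandas_table', '/tmp/pandas_table')
print(csv_text)
print(ddl)
```

You would still have to save csv_text to a file, copy it into the HDFS directory (for example with hdfs dfs -put) and run the DDL in impala-shell yourself. Those are the steps that the create_table call above takes care of.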
