PyPresto is a Python implementation of the client protocol for PrestoDB. Presto is a distributed SQL query engine for big data.
This client supports asynchronous queries and provides basic handling of result sets.
Sorry for the sparse documentation; I'll try to write a more comprehensive version later.
- Python 2.7 or higher
pip install pypresto (not yet published to PyPI)
or, execute from the source directory
python setup.py install
hostnames A list of optional hostnames to connect to; currently, a random hostname from the list is used for each query
port Port to connect to (default: 8080)
user User name (default: 'nobody')
max_workers Maximum number of workers to spawn when running queries asynchronously (default: 6)
catalog Catalog name (default: 'default')
schema Schema name (default: 'hive')
result_mode How results are returned; the two options are 'dict' and 'list' (default: 'dict')
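To make two of the options above concrete, here is a small self-contained sketch of what per-query random host selection and the two result_mode shapes could look like. It is not PyPresto's actual code: the helper names pick_host and shape_row, and the sample column names, are hypothetical and exist only for illustration.

```python
import random


def pick_host(hostnames, port=8080):
    """Pick one host at random, as the hostnames option is described to do per query."""
    return '%s:%d' % (random.choice(hostnames), port)


def shape_row(columns, values, result_mode='dict'):
    """Return a row as a dict keyed by column name, or as a plain list of values."""
    if result_mode == 'dict':
        return dict(zip(columns, values))
    if result_mode == 'list':
        return list(values)
    raise ValueError("result_mode must be 'dict' or 'list'")


# Hypothetical sample data, not taken from a real Presto table.
columns = ['id', 'name']
values = (1, 'alice')

print(pick_host(['10.0.0.1', '10.0.0.2']))
print(shape_row(columns, values, 'dict'))
print(shape_row(columns, values, 'list'))
```

With 'dict', each row can be read by column name; with 'list', values are read by position, which is lighter when the column order is already known.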
Simple querying:
from pypresto import Client
client = Client(['127.0.0.1'])
with client.connect(catalog='cassandra', schema="myschema") as session:
    q = session.query('SELECT * FROM mytable')
    for row in q.iter_results():
        print('%r' % row)
Using futures:
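from pypresto import Client
client = Client(['127.0.0.1'])
with client.connect(catalog='cassandra', schema="myschema") as session:
    # Start all queries first so they can run concurrently
    futures = []
    for i in range(10):
        futures.append(session.query_async('SELECT * FROM mytable WHERE my_int=%d', [i]))
    # Then collect the results; result() blocks until that query has finished
    for future in futures:
        for row in future.result().iter_results():
            print('%r' % row)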
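The futures pattern above resembles the standard library's concurrent.futures model. The following sketch shows one way such an interface can be built and is an assumption, not PyPresto's implementation: a thread pool bounded by max_workers runs a stand-in function, fake_query (hypothetical; it never contacts Presto), and the rows are collected with future.result(). On Python 2.7, concurrent.futures requires the "futures" backport package.

```python
from concurrent.futures import ThreadPoolExecutor


def fake_query(sql, params):
    """Hypothetical stand-in for a blocking query; returns one row echoing the final SQL."""
    return [{'sql': sql % tuple(params)}]


def run_all(values, max_workers=6):
    """Submit one query per value, then gather the rows in submission order."""
    rows = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [
            pool.submit(fake_query, 'SELECT * FROM mytable WHERE my_int=%d', [v])
            for v in values
        ]
        for future in futures:
            rows.extend(future.result())  # blocks until that query has finished
    return rows


for row in run_all(range(3)):
    print('%r' % row)
```

Submitting every query before calling result() is what lets them overlap; calling result() straight after each submit would serialize them again.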