Python API Reference¶
The Strake Python client provides a high-performance, zero-copy interface to the Strake engine, powered by Rust and Apache Arrow.
class StrakeConnection¶
The primary interface for interacting with Strake. It can operate in Remote mode (connecting to a running server) or Embedded mode (running the engine locally).
__init__¶
__init__(dsn_or_config, sources_config=None, api_key=None)
constructor
Initializes a new connection to the Strake engine.
Parameters:
dsn_or_config: str - Remote mode: a gRPC URL (e.g., grpc://localhost:50051). Embedded mode: a path to a strake.yaml or sources.yaml file.
sources_config: str, optional - Embedded mode only: a specific path to a sources.yaml file. If omitted, the engine looks in the same directory as dsn_or_config.
api_key: str, optional - Your Strake API key. Required for remote Enterprise servers.
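For example, a minimal sketch of a Remote mode connection; the API key below is a placeholder value for illustration, not a real credential.
import strake
# Remote mode: connect to a running Strake server over gRPC.
# The api_key value is a placeholder.
conn = strake.StrakeConnection(
    "grpc://localhost:50051",
    api_key="YOUR_API_KEY",  # required only for remote Enterprise servers
)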
sql¶
sql(query, params=None)
method
Executes a SQL query and returns the results.
Parameters:
query: str - The SQL query string to execute.
params: dict, optional - Query parameters for prepared statements (Experimental).
Returns:
pyarrow.Table - A PyArrow Table containing the query results. Use .to_pandas() or .to_polars() for downstream analysis.
Example:
import strake
# 1. Connect (Embedded Mode)
conn = strake.StrakeConnection("./config/strake.yaml")
# 2. Query
table = conn.sql("SELECT * FROM demo_pg.public.users LIMIT 10")
# 3. Analyze
df = table.to_pandas()
print(df.head())
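Parameterized queries via params are marked experimental above; the sketch below is illustrative only, and the :id placeholder style is an assumption rather than documented syntax.
# Experimental: pass query parameters as a dict.
# The ":id" placeholder style is assumed for illustration.
table = conn.sql(
    "SELECT * FROM demo_pg.public.users WHERE id = :id",
    params={"id": 42},
)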
describe¶
describe(table_name=None)
method
Introspects the available sources or a specific table schema.
Parameters:
table_name: str, optional - The full name of the table to describe (e.g., source.schema.table). If omitted, returns a list of all available tables across all sources.
Returns:
str - A pretty-printed table of metadata.
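For example, listing all tables and then describing one of them (reusing the table from the sql example above):
# List every table across all configured sources.
print(conn.describe())
# Show the schema of a single table.
print(conn.describe("demo_pg.public.users"))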
trace¶
trace(query)
method
Returns the logical execution plan for a query without actually executing it.
Parameters:
query: str - The SQL query string to trace.
Returns:
str - The pretty-printed execution plan showing optimizer transformations and source pushdowns.
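For example, inspecting the plan for a query before running it:
# Print the logical plan without executing the query.
plan = conn.trace("SELECT count(*) FROM demo_pg.public.users WHERE id > 100")
print(plan)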
Data Interchange¶
Strake is built on Apache Arrow. When you call conn.sql(), data is streamed into Python as Arrow record batches. This ensures:
- Zero-Copy: Near-zero overhead when converting to Pandas or Polars.
- Type Safety: Typed data remains typed from source to dataframe.
- Memory Efficiency: Large datasets can be processed in batches without duplicating data in memory.
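As a sketch of the typical interchange path, assuming pandas and polars are installed alongside pyarrow:
import polars as pl
table = conn.sql("SELECT * FROM demo_pg.public.users LIMIT 1000")
# Arrow -> pandas: conversion reuses Arrow buffers where dtypes allow.
df = table.to_pandas()
# Arrow -> Polars: pl.from_arrow wraps the same Arrow memory (zero-copy).
pl_df = pl.from_arrow(table)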