Native Asyncio¶
This section covers the native asyncio connection, cursors, and base classes.
Connection¶
- class pyathena.DBAPITypeObject[source]¶
Type Objects and Constructors
https://www.python.org/dev/peps/pep-0249/#type-objects-and-constructors
- pyathena.connect(*args, cursor_class: None = ..., **kwargs) Connection[Cursor][source]¶
- pyathena.connect(*args, cursor_class: type[ConnectionCursor], **kwargs) Connection[ConnectionCursor]
Create a new database connection to Amazon Athena.
This function provides the main entry point for establishing connections to Amazon Athena. It follows the DB API 2.0 specification and returns a Connection object that can be used to create cursors for executing SQL queries.
- Parameters:
s3_staging_dir – S3 location to store query results. Required if not using workgroups or if the workgroup doesn’t have a result location. Pass an empty string to explicitly disable S3 staging and skip the AWS_ATHENA_S3_STAGING_DIR environment variable fallback (required for workgroups with managed query result storage).
region_name – AWS region name. If not specified, uses the default region from your AWS configuration.
schema_name – Athena database/schema name. Defaults to “default”.
catalog_name – Athena data catalog name. Defaults to “awsdatacatalog”.
work_group – Athena workgroup name. Can be used instead of s3_staging_dir if the workgroup has a result location configured.
poll_interval – Time in seconds between polling for query completion. Defaults to 1.0.
encryption_option – S3 encryption option for query results. Can be “SSE_S3”, “SSE_KMS”, or “CSE_KMS”.
kms_key – KMS key ID for encryption when using SSE_KMS or CSE_KMS.
profile_name – AWS profile name to use for authentication.
role_arn – ARN of IAM role to assume for authentication.
role_session_name – Session name when assuming a role.
cursor_class – Custom cursor class to use. If not specified, uses the default Cursor class.
kill_on_interrupt – Whether to cancel running queries when interrupted. Defaults to True.
**kwargs – Additional keyword arguments passed to the Connection constructor.
- Returns:
A Connection object that can be used to create cursors and execute queries.
- Raises:
ProgrammingError – If neither s3_staging_dir nor work_group is provided.
Example
>>> import pyathena
>>> conn = pyathena.connect(
...     s3_staging_dir='s3://my-bucket/staging/',
...     region_name='us-east-1',
...     schema_name='mydatabase'
... )
>>> cursor = conn.cursor()
>>> cursor.execute("SELECT * FROM mytable LIMIT 10")
>>> results = cursor.fetchall()
- async pyathena.aio_connect(*args, **kwargs) AioConnection[source]¶
Create a new async database connection to Amazon Athena.
This is the async counterpart of connect(). It returns an AioConnection whose cursors use native asyncio for polling and API calls, keeping the event loop free.
- Parameters:
**kwargs – Arguments forwarded to AioConnection.create(). See connect() for the full list of supported arguments.
- Returns:
An AioConnection that produces AioCursor instances by default.
Example
>>> import pyathena
>>> conn = await pyathena.aio_connect(
...     s3_staging_dir='s3://my-bucket/staging/',
...     region_name='us-east-1',
... )
>>> async with conn.cursor() as cursor:
...     await cursor.execute("SELECT 1")
...     print(await cursor.fetchone())
- class pyathena.aio.connection.AioConnection(**kwargs: Any)[source]¶
Async-aware connection to Amazon Athena.
Wraps the synchronous Connection with async context manager support and provides create() for non-blocking initialization.
Example
>>> async with await AioConnection.create(
...     s3_staging_dir="s3://bucket/path/",
...     region_name="us-east-1",
... ) as conn:
...     async with conn.cursor() as cursor:
...         await cursor.execute("SELECT 1")
...         print(await cursor.fetchone())
- __init__(**kwargs: Any) None[source]¶
Initialize a new Athena database connection.
- Parameters:
s3_staging_dir – S3 location to store query results. Required if not using workgroups or if the workgroup doesn’t have a result location. Pass an empty string to explicitly disable S3 staging and skip the AWS_ATHENA_S3_STAGING_DIR environment variable fallback. This is required when connecting to a workgroup with managed query result storage enabled.
region_name – AWS region name. Uses default region if not specified.
schema_name – Default database/schema name. Defaults to “default”.
catalog_name – Data catalog name. Defaults to “awsdatacatalog”.
work_group – Athena workgroup name. Can substitute for s3_staging_dir if workgroup has result location configured.
poll_interval – Seconds between query status polls. Defaults to 1.0.
encryption_option – S3 encryption for results (“SSE_S3”, “SSE_KMS”, “CSE_KMS”).
kms_key – KMS key ID when using SSE_KMS or CSE_KMS encryption.
profile_name – AWS profile name for authentication.
role_arn – IAM role ARN to assume for authentication.
role_session_name – Session name when assuming IAM role.
external_id – External ID for role assumption (if required by role).
serial_number – MFA device serial number for role assumption.
duration_seconds – Role session duration in seconds. Defaults to 3600.
converter – Custom type converter. Uses DefaultTypeConverter if None.
formatter – Custom parameter formatter. Uses DefaultParameterFormatter if None.
retry_config – Retry configuration for API calls. Uses default if None.
cursor_class – Default cursor class for this connection.
cursor_kwargs – Default keyword arguments for cursor creation.
kill_on_interrupt – Cancel running queries on interrupt. Defaults to True.
session – Pre-configured boto3 Session. Creates new session if None.
config – Boto3 Config object for client configuration.
result_reuse_enable – Enable Athena query result reuse. Defaults to False.
result_reuse_minutes – Minutes to reuse cached results.
on_start_query_execution – Callback function called when query starts.
**kwargs – Additional arguments passed to boto3 Session and client.
- Raises:
ProgrammingError – If neither s3_staging_dir nor work_group is provided.
Note
Either s3_staging_dir or work_group must be specified. Environment variables AWS_ATHENA_S3_STAGING_DIR and AWS_ATHENA_WORK_GROUP are checked if parameters are not provided.
When using a workgroup with managed query result storage, pass s3_staging_dir="" to prevent the environment variable fallback from sending a ResultConfiguration that conflicts with ManagedQueryResultsConfiguration.
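The managed-results note above can be sketched as follows. The workgroup name is hypothetical, and running the coroutine requires AWS credentials, so the entry point is defined but left uninvoked:

```python
import asyncio


async def main():
    import pyathena  # deferred so the sketch imports cleanly without AWS set up

    # "managed-results-wg" is a hypothetical workgroup with managed query
    # result storage; s3_staging_dir="" suppresses the
    # AWS_ATHENA_S3_STAGING_DIR fallback so no ResultConfiguration is sent
    # alongside the managed configuration.
    conn = await pyathena.aio_connect(
        work_group="managed-results-wg",
        s3_staging_dir="",
        region_name="us-east-1",
    )
    async with conn:
        async with conn.cursor() as cursor:
            await cursor.execute("SELECT 1")
            print(await cursor.fetchone())


# asyncio.run(main())  # uncomment to run with valid AWS credentials
```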
- async classmethod create(**kwargs: Any) AioConnection[source]¶
Async factory for creating an AioConnection.
Runs the (potentially blocking) __init__ in a thread so that STS calls (role_arn / serial_number) do not block the event loop.
- Parameters:
**kwargs – Arguments forwarded to AioConnection.__init__.
- Returns:
A fully initialized AioConnection.
- __enter__()¶
Enter the runtime context for the connection.
- Returns:
Self for use in context manager protocol.
- __exit__(exc_type, exc_val, exc_tb)¶
Exit the runtime context and close the connection.
- Parameters:
exc_type – Exception type if an exception occurred.
exc_val – Exception value if an exception occurred.
exc_tb – Exception traceback if an exception occurred.
- property client: BaseClient¶
Get the boto3 Athena client used for query operations.
- Returns:
The configured boto3 Athena client.
- close() None¶
Close the connection.
Closes the database connection. This method is provided for DB API 2.0 compatibility. Since Athena connections are stateless, this method currently does not perform any actual cleanup operations.
Note
This method is called automatically when using the connection as a context manager (with statement).
- commit() None¶
Commit any pending transaction.
This method is provided for DB API 2.0 compatibility. Since Athena does not support transactions, this method does nothing.
Note
Athena queries are auto-committed and cannot be rolled back.
- cursor(cursor: type[FunctionalCursor] | None = None, **kwargs) FunctionalCursor | ConnectionCursor¶
Create a new cursor object for executing queries.
Creates and returns a cursor object that can be used to execute SQL queries against Amazon Athena. The cursor inherits connection settings but can be customized with additional parameters.
- Parameters:
cursor – Custom cursor class to use. If not provided, uses the connection’s default cursor class.
**kwargs – Additional keyword arguments to pass to the cursor constructor. These override connection defaults.
- Returns:
A cursor object that can execute SQL queries.
Example
>>> cursor = connection.cursor()
>>> cursor.execute("SELECT * FROM my_table LIMIT 10")
>>> results = cursor.fetchall()

# Using a custom cursor type
>>> from pyathena.pandas.cursor import PandasCursor
>>> pandas_cursor = connection.cursor(PandasCursor)
>>> df = pandas_cursor.execute("SELECT * FROM my_table").as_pandas()
- property retry_config: RetryConfig¶
Get the retry configuration for AWS API calls.
- Returns:
The RetryConfig object that controls retry behavior for failed requests.
- rollback() None¶
Rollback any pending transaction.
This method is required by DB API 2.0 but is not supported by Athena since Athena does not support transactions.
- Raises:
NotSupportedError – Always raised since transactions are not supported.
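Because rollback() always raises while commit() is a no-op, code written against generic DB API connections may need to tolerate this. A minimal sketch, assuming pyathena.error exports NotSupportedError (as in pyathena's DB API error hierarchy):

```python
def safe_rollback(conn):
    # Athena has no transactions: commit() does nothing and rollback()
    # always raises NotSupportedError, so treat that as "nothing to roll
    # back" rather than a hard failure.
    from pyathena.error import NotSupportedError  # deferred import

    try:
        conn.rollback()
        return True
    except NotSupportedError:
        return False
```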
- property session: Session¶
Get the boto3 session used for AWS API calls.
- Returns:
The configured boto3 Session object.
Aio Cursors¶
- class pyathena.aio.cursor.AioCursor(s3_staging_dir: str | None = None, schema_name: str | None = None, catalog_name: str | None = None, work_group: str | None = None, poll_interval: float = 1, encryption_option: str | None = None, kms_key: str | None = None, kill_on_interrupt: bool = True, result_reuse_enable: bool = False, result_reuse_minutes: int = 60, **kwargs)[source]¶
Native asyncio cursor for Amazon Athena.
Unlike AsyncCursor (which uses ThreadPoolExecutor), this cursor uses asyncio.sleep for polling and asyncio.to_thread for boto3 calls, keeping the event loop free.
Example
>>> async with await AioConnection.create(...) as conn:
...     async with conn.cursor() as cursor:
...         await cursor.execute("SELECT * FROM my_table")
...         rows = await cursor.fetchall()
- __init__(s3_staging_dir: str | None = None, schema_name: str | None = None, catalog_name: str | None = None, work_group: str | None = None, poll_interval: float = 1, encryption_option: str | None = None, kms_key: str | None = None, kill_on_interrupt: bool = True, result_reuse_enable: bool = False, result_reuse_minutes: int = 60, **kwargs) None[source]¶
- async execute(operation: str, parameters: dict[str, Any] | list[str] | None = None, work_group: str | None = None, s3_staging_dir: str | None = None, cache_size: int = 0, cache_expiration_time: int = 0, result_reuse_enable: bool | None = None, result_reuse_minutes: int | None = None, paramstyle: str | None = None, result_set_type_hints: dict[str | int, str] | None = None, **kwargs) AioCursor[source]¶
Execute a SQL query asynchronously.
- Parameters:
operation – SQL query string to execute.
parameters – Query parameters (optional).
work_group – Athena workgroup to use (optional).
s3_staging_dir – S3 location for query results (optional).
cache_size – Query result cache size (optional).
cache_expiration_time – Cache expiration time in seconds (optional).
result_reuse_enable – Enable result reuse (optional).
result_reuse_minutes – Result reuse duration in minutes (optional).
paramstyle – Parameter style to use (optional).
result_set_type_hints – Optional dictionary mapping column names to Athena DDL type signatures for precise type conversion within complex types.
**kwargs – Additional execution parameters.
- Returns:
Self reference for method chaining.
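Because execute() returns the cursor itself, a query can be issued and awaited in one chained expression. A sketch assuming a hypothetical events table whose tags column benefits from a result_set_type_hints entry:

```python
import asyncio


async def fetch_events(conn):
    # execute() returns self, so awaiting the chained call yields the
    # cursor ready for fetching. The "events" table and "tags" column are
    # hypothetical; the hint gives the converter the exact DDL type.
    cursor = await conn.cursor().execute(
        "SELECT id, tags FROM events LIMIT 10",
        result_set_type_hints={"tags": "array(varchar)"},
    )
    return await cursor.fetchall()
```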
- async fetchone() Any | dict[Any, Any | None] | None[source]¶
Fetch the next row of a query result set.
- Returns:
A tuple representing the next row, or None if no more rows.
- Raises:
ProgrammingError – If called before executing a query that returns results.
- async fetchmany(size: int | None = None) list[Any | dict[Any, Any | None]][source]¶
Fetch multiple rows from a query result set.
- Parameters:
size – Maximum number of rows to fetch. If None, uses arraysize.
- Returns:
List of tuples representing the fetched rows.
- Raises:
ProgrammingError – If called before executing a query that returns results.
- async fetchall() list[Any | dict[Any, Any | None]][source]¶
Fetch all remaining rows from a query result set.
- Returns:
List of tuples representing all remaining rows in the result set.
- Raises:
ProgrammingError – If called before executing a query that returns results.
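For large result sets, fetchmany() can be looped to keep memory bounded instead of calling fetchall(). A sketch of an async generator over an already-executed cursor, relying on fetchmany() returning an empty list at exhaustion:

```python
async def stream_rows(cursor, batch_size=1000):
    # Pull rows in bounded batches rather than one fetchall(); an empty
    # batch signals that the result set is exhausted.
    while True:
        rows = await cursor.fetchmany(batch_size)
        if not rows:
            return
        for row in rows:
            yield row
```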
- DEFAULT_RESULT_REUSE_MINUTES = 60¶
- LIST_DATABASES_MAX_RESULTS = 50¶
- LIST_QUERY_EXECUTIONS_MAX_RESULTS = 50¶
- LIST_TABLE_METADATA_MAX_RESULTS = 50¶
- async cancel() None¶
Cancel the currently executing query.
- Raises:
ProgrammingError – If no query is currently executing.
- property connection: Connection[Any]¶
- async executemany(operation: str, seq_of_parameters: list[dict[str, Any] | list[str] | None], **kwargs) None¶
Execute a SQL query multiple times with different parameters.
- Parameters:
operation – SQL query string to execute.
seq_of_parameters – Sequence of parameter sets, one per execution.
**kwargs – Additional keyword arguments passed to each execute().
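A sketch of executemany() with pyformat-style parameters; the metrics table is hypothetical, and each dictionary in the sequence produces one separate query execution:

```python
import asyncio


async def record_metrics(cursor):
    # One Athena query execution per parameter set; there is no
    # server-side batching, so prefer a single multi-row INSERT when
    # latency matters.
    await cursor.executemany(
        "INSERT INTO metrics (name, value) VALUES (%(name)s, %(value)d)",
        [
            {"name": "latency_ms", "value": 12},
            {"name": "error_rate", "value": 0},
        ],
    )
```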
- static get_default_converter(unload: bool = False) DefaultTypeConverter | Any¶
Get the default type converter for this cursor class.
- Parameters:
unload – Whether the converter is for UNLOAD operations. Some cursor types may return different converters for UNLOAD operations.
- Returns:
The default type converter instance for this cursor type.
- async get_table_metadata(table_name: str, catalog_name: str | None = None, schema_name: str | None = None, logging_: bool = True) AthenaTableMetadata¶
- async list_databases(catalog_name: str | None, max_results: int | None = None) list[AthenaDatabase]¶
- async list_table_metadata(catalog_name: str | None = None, schema_name: str | None = None, expression: str | None = None, max_results: int | None = None) list[AthenaTableMetadata]¶
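The metadata helpers above compose naturally. A sketch that maps table names to their column names, assuming the returned objects expose name and columns attributes as in pyathena's model classes:

```python
import asyncio


async def describe_schema(cursor, schema_name="default"):
    # Uses the default "awsdatacatalog" catalog. Assumes
    # AthenaTableMetadata exposes .name and .columns (each column with a
    # .name), matching pyathena's model classes.
    result = {}
    for table in await cursor.list_table_metadata(schema_name=schema_name):
        meta = await cursor.get_table_metadata(
            table.name, schema_name=schema_name
        )
        result[table.name] = [column.name for column in meta.columns]
    return result
```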
- property result_set: AthenaResultSet | None¶
- property rowcount: int¶
Get the number of rows affected by the last operation.
For SELECT statements, this returns -1 as per DB API 2.0 specification. For DML operations (INSERT, UPDATE, DELETE) and CTAS, this returns the number of affected rows.
- Returns:
The number of rows, or -1 if not applicable or unknown.
- setinputsizes(sizes)¶
Does nothing by default
- setoutputsize(size, column=None)¶
Does nothing by default
- class pyathena.aio.cursor.AioDictCursor(**kwargs)[source]¶
Native asyncio cursor that returns rows as dictionaries.
Example
>>> async with await AioConnection.create(...) as conn:
...     cursor = conn.cursor(AioDictCursor)
...     await cursor.execute("SELECT id, name FROM users")
...     row = await cursor.fetchone()
...     print(row["name"])
- DEFAULT_RESULT_REUSE_MINUTES = 60¶
- LIST_DATABASES_MAX_RESULTS = 50¶
- LIST_QUERY_EXECUTIONS_MAX_RESULTS = 50¶
- LIST_TABLE_METADATA_MAX_RESULTS = 50¶
- async cancel() None¶
Cancel the currently executing query.
- Raises:
ProgrammingError – If no query is currently executing.
- property connection: Connection[Any]¶
- async execute(operation: str, parameters: dict[str, Any] | list[str] | None = None, work_group: str | None = None, s3_staging_dir: str | None = None, cache_size: int = 0, cache_expiration_time: int = 0, result_reuse_enable: bool | None = None, result_reuse_minutes: int | None = None, paramstyle: str | None = None, result_set_type_hints: dict[str | int, str] | None = None, **kwargs) AioCursor¶
Execute a SQL query asynchronously.
- Parameters:
operation – SQL query string to execute.
parameters – Query parameters (optional).
work_group – Athena workgroup to use (optional).
s3_staging_dir – S3 location for query results (optional).
cache_size – Query result cache size (optional).
cache_expiration_time – Cache expiration time in seconds (optional).
result_reuse_enable – Enable result reuse (optional).
result_reuse_minutes – Result reuse duration in minutes (optional).
paramstyle – Parameter style to use (optional).
result_set_type_hints – Optional dictionary mapping column names to Athena DDL type signatures for precise type conversion within complex types.
**kwargs – Additional execution parameters.
- Returns:
Self reference for method chaining.
- async executemany(operation: str, seq_of_parameters: list[dict[str, Any] | list[str] | None], **kwargs) None¶
Execute a SQL query multiple times with different parameters.
- Parameters:
operation – SQL query string to execute.
seq_of_parameters – Sequence of parameter sets, one per execution.
**kwargs – Additional keyword arguments passed to each execute().
- async fetchall() list[Any | dict[Any, Any | None]]¶
Fetch all remaining rows from a query result set.
- Returns:
List of tuples representing all remaining rows in the result set.
- Raises:
ProgrammingError – If called before executing a query that returns results.
- async fetchmany(size: int | None = None) list[Any | dict[Any, Any | None]]¶
Fetch multiple rows from a query result set.
- Parameters:
size – Maximum number of rows to fetch. If None, uses arraysize.
- Returns:
List of tuples representing the fetched rows.
- Raises:
ProgrammingError – If called before executing a query that returns results.
- async fetchone() Any | dict[Any, Any | None] | None¶
Fetch the next row of a query result set.
- Returns:
A tuple representing the next row, or None if no more rows.
- Raises:
ProgrammingError – If called before executing a query that returns results.
- static get_default_converter(unload: bool = False) DefaultTypeConverter | Any¶
Get the default type converter for this cursor class.
- Parameters:
unload – Whether the converter is for UNLOAD operations. Some cursor types may return different converters for UNLOAD operations.
- Returns:
The default type converter instance for this cursor type.
- async get_table_metadata(table_name: str, catalog_name: str | None = None, schema_name: str | None = None, logging_: bool = True) AthenaTableMetadata¶
- async list_databases(catalog_name: str | None, max_results: int | None = None) list[AthenaDatabase]¶
- async list_table_metadata(catalog_name: str | None = None, schema_name: str | None = None, expression: str | None = None, max_results: int | None = None) list[AthenaTableMetadata]¶
- property result_set: AthenaResultSet | None¶
- property rowcount: int¶
Get the number of rows affected by the last operation.
For SELECT statements, this returns -1 as per DB API 2.0 specification. For DML operations (INSERT, UPDATE, DELETE) and CTAS, this returns the number of affected rows.
- Returns:
The number of rows, or -1 if not applicable or unknown.
- setinputsizes(sizes)¶
Does nothing by default
- setoutputsize(size, column=None)¶
Does nothing by default
Aio Result Set¶
- class pyathena.aio.result_set.AthenaAioResultSet(connection: Connection[Any], converter: Converter, query_execution: AthenaQueryExecution, arraysize: int, retry_config: RetryConfig, result_set_type_hints: dict[str | int, str] | None = None)[source]¶
Async result set that provides async fetch methods.
Skips the synchronous _pre_fetch by passing _pre_fetch=False to the parent __init__ and provides an async create() classmethod factory instead.
- __init__(connection: Connection[Any], converter: Converter, query_execution: AthenaQueryExecution, arraysize: int, retry_config: RetryConfig, result_set_type_hints: dict[str | int, str] | None = None) None[source]¶
- async classmethod create(connection: Connection[Any], converter: Converter, query_execution: AthenaQueryExecution, arraysize: int, retry_config: RetryConfig, result_set_type_hints: dict[str | int, str] | None = None) AthenaAioResultSet[source]¶
Async factory method.
Creates an AthenaAioResultSet and awaits the initial data fetch.
- Parameters:
connection – The database connection.
converter – Type converter for result values.
query_execution – Query execution metadata.
arraysize – Number of rows to fetch per request.
retry_config – Retry configuration for API calls.
result_set_type_hints – Optional dictionary mapping column names to Athena DDL type signatures for precise type conversion.
- Returns:
A fully initialized AthenaAioResultSet.
- async fetchone() tuple[Any | None, ...] | dict[Any, Any | None] | None[source]¶
Fetch the next row of the result set.
Automatically fetches the next page from Athena when the current page is exhausted and more pages are available.
- Returns:
A tuple representing the next row, or None if no more rows.
- async fetchmany(size: int | None = None) list[tuple[Any | None, ...] | dict[Any, Any | None]][source]¶
Fetch multiple rows from the result set.
- Parameters:
size – Maximum number of rows to fetch. If None, uses arraysize.
- Returns:
List of row tuples. May contain fewer rows than requested if fewer are available.
- async fetchall() list[tuple[Any | None, ...] | dict[Any, Any | None]][source]¶
Fetch all remaining rows from the result set.
- Returns:
List of all remaining row tuples.
- DEFAULT_RESULT_REUSE_MINUTES = 60¶
- property connection: Connection[Any]¶
- class pyathena.aio.result_set.AthenaAioDictResultSet(connection: Connection[Any], converter: Converter, query_execution: AthenaQueryExecution, arraysize: int, retry_config: RetryConfig, result_set_type_hints: dict[str | int, str] | None = None)[source]¶
Async result set that returns rows as dictionaries.
Inherits _get_rows from AthenaDictResultSet and async fetch methods from AthenaAioResultSet via multiple inheritance.
- DEFAULT_RESULT_REUSE_MINUTES = 60¶
- __init__(connection: Connection[Any], converter: Converter, query_execution: AthenaQueryExecution, arraysize: int, retry_config: RetryConfig, result_set_type_hints: dict[str | int, str] | None = None) None¶
- property connection: Connection[Any]¶
- async classmethod create(connection: Connection[Any], converter: Converter, query_execution: AthenaQueryExecution, arraysize: int, retry_config: RetryConfig, result_set_type_hints: dict[str | int, str] | None = None) AthenaAioResultSet¶
Async factory method.
Creates an AthenaAioResultSet and awaits the initial data fetch.
- Parameters:
connection – The database connection.
converter – Type converter for result values.
query_execution – Query execution metadata.
arraysize – Number of rows to fetch per request.
retry_config – Retry configuration for API calls.
result_set_type_hints – Optional dictionary mapping column names to Athena DDL type signatures for precise type conversion.
- Returns:
A fully initialized AthenaAioResultSet.
- async fetchall() list[tuple[Any | None, ...] | dict[Any, Any | None]]¶
Fetch all remaining rows from the result set.
- Returns:
List of all remaining row tuples.
- async fetchmany(size: int | None = None) list[tuple[Any | None, ...] | dict[Any, Any | None]]¶
Fetch multiple rows from the result set.
- Parameters:
size – Maximum number of rows to fetch. If None, uses arraysize.
- Returns:
List of row tuples. May contain fewer rows than requested if fewer are available.
- async fetchone() tuple[Any | None, ...] | dict[Any, Any | None] | None¶
Fetch the next row of the result set.
Automatically fetches the next page from Athena when the current page is exhausted and more pages are available.
- Returns:
A tuple representing the next row, or None if no more rows.
Aio Base Classes¶
- class pyathena.aio.common.AioBaseCursor(connection: Connection[Any], converter: Converter, formatter: Formatter, retry_config: RetryConfig, s3_staging_dir: str | None, schema_name: str | None, catalog_name: str | None, work_group: str | None, poll_interval: float, encryption_option: str | None, kms_key: str | None, kill_on_interrupt: bool, result_reuse_enable: bool, result_reuse_minutes: int, **kwargs)[source]¶
Async base cursor that overrides I/O methods with async equivalents.
Reuses BaseCursor.__init__, all _build_* methods, and constants. Only the methods that perform network I/O or blocking sleep are overridden to use asyncio.to_thread / asyncio.sleep.
- async list_databases(catalog_name: str | None, max_results: int | None = None) list[AthenaDatabase][source]¶
- async get_table_metadata(table_name: str, catalog_name: str | None = None, schema_name: str | None = None, logging_: bool = True) AthenaTableMetadata[source]¶
- async list_table_metadata(catalog_name: str | None = None, schema_name: str | None = None, expression: str | None = None, max_results: int | None = None) list[AthenaTableMetadata][source]¶
- LIST_DATABASES_MAX_RESULTS = 50¶
- LIST_QUERY_EXECUTIONS_MAX_RESULTS = 50¶
- LIST_TABLE_METADATA_MAX_RESULTS = 50¶
- __init__(connection: Connection[Any], converter: Converter, formatter: Formatter, retry_config: RetryConfig, s3_staging_dir: str | None, schema_name: str | None, catalog_name: str | None, work_group: str | None, poll_interval: float, encryption_option: str | None, kms_key: str | None, kill_on_interrupt: bool, result_reuse_enable: bool, result_reuse_minutes: int, **kwargs) None¶
- property connection: Connection[Any]¶
- abstractmethod execute(operation: str, parameters: dict[str, Any] | list[str] | None = None, **kwargs)¶
- abstractmethod executemany(operation: str, seq_of_parameters: list[dict[str, Any] | list[str] | None], **kwargs) None¶
- static get_default_converter(unload: bool = False) DefaultTypeConverter | Any¶
Get the default type converter for this cursor class.
- Parameters:
unload – Whether the converter is for UNLOAD operations. Some cursor types may return different converters for UNLOAD operations.
- Returns:
The default type converter instance for this cursor type.
- setinputsizes(sizes)¶
Does nothing by default
- setoutputsize(size, column=None)¶
Does nothing by default
- class pyathena.aio.common.WithAsyncFetch(**kwargs)[source]¶
Mixin providing shared fetch, lifecycle, and async protocol for SQL cursors.
Provides properties (arraysize, result_set, query_id, rownumber, rowcount), lifecycle methods (close, executemany, cancel), default sync fetch (for cursors whose result sets load all data eagerly in __init__), and the async iteration protocol.
Subclasses override execute() and optionally __init__ and format-specific helpers.
- property result_set: AthenaResultSet | None¶
- property rowcount: int¶
Get the number of rows affected by the last operation.
For SELECT statements, this returns -1 as per DB API 2.0 specification. For DML operations (INSERT, UPDATE, DELETE) and CTAS, this returns the number of affected rows.
- Returns:
The number of rows, or -1 if not applicable or unknown.
- async executemany(operation: str, seq_of_parameters: list[dict[str, Any] | list[str] | None], **kwargs) None[source]¶
Execute a SQL query multiple times with different parameters.
- Parameters:
operation – SQL query string to execute.
seq_of_parameters – Sequence of parameter sets, one per execution.
**kwargs – Additional keyword arguments passed to each execute().
- async cancel() None[source]¶
Cancel the currently executing query.
- Raises:
ProgrammingError – If no query is currently executing.
- fetchone() tuple[Any | None, ...] | dict[Any, Any | None] | None[source]¶
Fetch the next row of the result set.
- Returns:
A tuple representing the next row, or None if no more rows.
- Raises:
ProgrammingError – If no result set is available.
- fetchmany(size: int | None = None) list[tuple[Any | None, ...] | dict[Any, Any | None]][source]¶
Fetch multiple rows from the result set.
- Parameters:
size – Maximum number of rows to fetch. Defaults to arraysize.
- Returns:
List of tuples representing the fetched rows.
- Raises:
ProgrammingError – If no result set is available.
- fetchall() list[tuple[Any | None, ...] | dict[Any, Any | None]][source]¶
Fetch all remaining rows from the result set.
- Returns:
List of tuples representing all remaining rows.
- Raises:
ProgrammingError – If no result set is available.
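The fetch methods above pair with the async iteration protocol this mixin provides, so an executed cursor can also be consumed directly with async for. A minimal sketch:

```python
async def print_rows(cursor):
    # WithAsyncFetch implements the async iteration protocol, so rows
    # stream one at a time without materializing fetchall().
    async for row in cursor:
        print(row)
```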
- DEFAULT_RESULT_REUSE_MINUTES = 60¶
- LIST_DATABASES_MAX_RESULTS = 50¶
- LIST_QUERY_EXECUTIONS_MAX_RESULTS = 50¶
- LIST_TABLE_METADATA_MAX_RESULTS = 50¶
- property connection: Connection[Any]¶
- abstractmethod execute(operation: str, parameters: dict[str, Any] | list[str] | None = None, **kwargs)¶
- static get_default_converter(unload: bool = False) DefaultTypeConverter | Any¶
Get the default type converter for this cursor class.
- Parameters:
unload – Whether the converter is for UNLOAD operations. Some cursor types may return different converters for UNLOAD operations.
- Returns:
The default type converter instance for this cursor type.
- async get_table_metadata(table_name: str, catalog_name: str | None = None, schema_name: str | None = None, logging_: bool = True) AthenaTableMetadata¶
- async list_databases(catalog_name: str | None, max_results: int | None = None) list[AthenaDatabase]¶
- async list_table_metadata(catalog_name: str | None = None, schema_name: str | None = None, expression: str | None = None, max_results: int | None = None) list[AthenaTableMetadata]¶
- setinputsizes(sizes)¶
Does nothing by default
- setoutputsize(size, column=None)¶
Does nothing by default
Aio Pandas Cursor¶
- class pyathena.aio.pandas.cursor.AioPandasCursor(s3_staging_dir: str | None = None, schema_name: str | None = None, catalog_name: str | None = None, work_group: str | None = None, poll_interval: float = 1, encryption_option: str | None = None, kms_key: str | None = None, kill_on_interrupt: bool = True, unload: bool = False, engine: str = 'auto', chunksize: int | None = None, block_size: int | None = None, cache_type: str | None = None, max_workers: int = 20, result_reuse_enable: bool = False, result_reuse_minutes: int = 60, auto_optimize_chunksize: bool = False, **kwargs)[source]¶
Native asyncio cursor that returns results as pandas DataFrames.
Uses asyncio.to_thread() for both result set creation and fetch operations, keeping the event loop free. This is especially important when chunksize is set, as fetch calls trigger lazy S3 reads.
Example
>>> async with await pyathena.aio_connect(...) as conn:
...     cursor = conn.cursor(AioPandasCursor)
...     await cursor.execute("SELECT * FROM my_table")
...     df = cursor.as_pandas()
- __init__(s3_staging_dir: str | None = None, schema_name: str | None = None, catalog_name: str | None = None, work_group: str | None = None, poll_interval: float = 1, encryption_option: str | None = None, kms_key: str | None = None, kill_on_interrupt: bool = True, unload: bool = False, engine: str = 'auto', chunksize: int | None = None, block_size: int | None = None, cache_type: str | None = None, max_workers: int = 20, result_reuse_enable: bool = False, result_reuse_minutes: int = 60, auto_optimize_chunksize: bool = False, **kwargs) None[source]¶
- static get_default_converter(unload: bool = False) DefaultPandasTypeConverter | Any[source]¶
Get the default type converter for this cursor class.
- Parameters:
unload – Whether the converter is for UNLOAD operations. Some cursor types may return different converters for UNLOAD operations.
- Returns:
The default type converter instance for this cursor type.
- async execute(operation: str, parameters: dict[str, Any] | list[str] | None = None, work_group: str | None = None, s3_staging_dir: str | None = None, cache_size: int | None = 0, cache_expiration_time: int | None = 0, result_reuse_enable: bool | None = None, result_reuse_minutes: int | None = None, paramstyle: str | None = None, keep_default_na: bool = False, na_values: Iterable[str] | None = ('',), quoting: int = 1, **kwargs) AioPandasCursor[source]¶
Execute a SQL query asynchronously and return results as pandas DataFrames.
- Parameters:
operation – SQL query string to execute.
parameters – Query parameters for parameterized queries.
work_group – Athena workgroup to use for this query.
s3_staging_dir – S3 location for query results.
cache_size – Number of queries to check for result caching.
cache_expiration_time – Cache expiration time in seconds.
result_reuse_enable – Enable Athena result reuse for this query.
result_reuse_minutes – Minutes to reuse cached results.
paramstyle – Parameter style (‘qmark’ or ‘pyformat’).
keep_default_na – Whether to keep default pandas NA values.
na_values – Additional values to treat as NA.
quoting – CSV quoting behavior (pandas csv.QUOTE_* constants).
**kwargs – Additional pandas read_csv/read_parquet parameters.
- Returns:
Self reference for method chaining.
- async fetchone() tuple[Any | None, ...] | dict[Any, Any | None] | None[source]¶
Fetch the next row of the result set.
Wraps the synchronous fetch in asyncio.to_thread to avoid blocking the event loop when chunksize triggers lazy S3 reads.
- Returns:
A tuple representing the next row, or None if no more rows.
- Raises:
ProgrammingError – If no result set is available.
- async fetchmany(size: int | None = None) list[tuple[Any | None, ...] | dict[Any, Any | None]][source]¶
Fetch multiple rows from the result set.
Wraps the synchronous fetch in asyncio.to_thread to avoid blocking the event loop when chunksize triggers lazy S3 reads.
- Parameters:
size – Maximum number of rows to fetch. Defaults to arraysize.
- Returns:
List of tuples representing the fetched rows.
- Raises:
ProgrammingError – If no result set is available.
- async fetchall() list[tuple[Any | None, ...] | dict[Any, Any | None]][source]¶
Fetch all remaining rows from the result set.
Wraps the synchronous fetch in asyncio.to_thread to avoid blocking the event loop when chunksize triggers lazy S3 reads.
- Returns:
List of tuples representing all remaining rows.
- Raises:
ProgrammingError – If no result set is available.
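The fetch methods above all follow the same pattern: a blocking fetch handed off to a worker thread via asyncio.to_thread so the event loop stays responsive. A minimal sketch of that pattern, using a hypothetical _StubResultSet in place of the real pandas result set (which may read lazily from S3):

```python
import asyncio

# Hypothetical stand-in for a synchronous result set; the real cursor wraps
# a pandas-backed result set whose fetches may trigger blocking S3 reads.
class _StubResultSet:
    def __init__(self, rows):
        self._rows = iter(rows)

    def fetchmany(self, size):
        # Blocking call in the real implementation.
        out = []
        for _ in range(size):
            row = next(self._rows, None)
            if row is None:
                break
            out.append(row)
        return out

async def fetch_in_batches(result_set, size):
    """Drain a blocking result set without stalling the event loop."""
    rows = []
    while True:
        # asyncio.to_thread runs the blocking fetch in a worker thread,
        # so other coroutines keep running while it waits on I/O.
        batch = await asyncio.to_thread(result_set.fetchmany, size)
        if not batch:
            break
        rows.extend(batch)
    return rows

rows = asyncio.run(fetch_in_batches(_StubResultSet([(1,), (2,), (3,)]), size=2))
print(rows)  # [(1,), (2,), (3,)]
```

With the real cursor, the loop body is simply `batch = await cursor.fetchmany(size)`; the to_thread hop happens inside the cursor.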
- as_pandas() DataFrame | PandasDataFrameIterator[source]¶
Return DataFrame or PandasDataFrameIterator based on chunksize setting.
- Returns:
DataFrame when chunksize is None, PandasDataFrameIterator when chunksize is set.
- DEFAULT_RESULT_REUSE_MINUTES = 60¶
- LIST_DATABASES_MAX_RESULTS = 50¶
- LIST_QUERY_EXECUTIONS_MAX_RESULTS = 50¶
- LIST_TABLE_METADATA_MAX_RESULTS = 50¶
- async cancel() None¶
Cancel the currently executing query.
- Raises:
ProgrammingError – If no query is currently executing.
- property connection: Connection[Any]¶
- async executemany(operation: str, seq_of_parameters: list[dict[str, Any] | list[str] | None], **kwargs) None¶
Execute a SQL query multiple times with different parameters.
- Parameters:
operation – SQL query string to execute.
seq_of_parameters – Sequence of parameter sets, one per execution.
**kwargs – Additional keyword arguments passed to each
execute().
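executemany is equivalent to awaiting execute() once per parameter set. A sketch of that behavior with pyformat-style parameters, using a hypothetical in-memory execute that only records the rendered statement (the real library handles quoting and submits each query to Athena):

```python
import asyncio

executed = []

async def execute(operation, parameters):
    # The real cursor submits the query to Athena; this stub just records
    # a naively rendered statement for illustration.
    executed.append(operation % {k: repr(v) for k, v in parameters.items()})

async def executemany(operation, seq_of_parameters):
    # One sequential execution per parameter set, mirroring the API above.
    for parameters in seq_of_parameters:
        await execute(operation, parameters)

asyncio.run(executemany(
    "INSERT INTO t VALUES (%(id)s, %(name)s)",
    [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}],
))
print(executed)
```

Note that executemany returns None: intermediate result sets are not retained, so it is suited to DML statements rather than SELECTs.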
- async get_table_metadata(table_name: str, catalog_name: str | None = None, schema_name: str | None = None, logging_: bool = True) AthenaTableMetadata¶
- async list_databases(catalog_name: str | None, max_results: int | None = None) list[AthenaDatabase]¶
- async list_table_metadata(catalog_name: str | None = None, schema_name: str | None = None, expression: str | None = None, max_results: int | None = None) list[AthenaTableMetadata]¶
- property result_set: AthenaResultSet | None¶
- property rowcount: int¶
Get the number of rows affected by the last operation.
For SELECT statements, this returns -1 as per DB API 2.0 specification. For DML operations (INSERT, UPDATE, DELETE) and CTAS, this returns the number of affected rows.
- Returns:
The number of rows, or -1 if not applicable or unknown.
- setinputsizes(sizes)¶
Does nothing by default
- setoutputsize(size, column=None)¶
Does nothing by default
Aio Arrow Cursor¶
- class pyathena.aio.arrow.cursor.AioArrowCursor(s3_staging_dir: str | None = None, schema_name: str | None = None, catalog_name: str | None = None, work_group: str | None = None, poll_interval: float = 1, encryption_option: str | None = None, kms_key: str | None = None, kill_on_interrupt: bool = True, unload: bool = False, result_reuse_enable: bool = False, result_reuse_minutes: int = 60, connect_timeout: float | None = None, request_timeout: float | None = None, **kwargs)[source]¶
Native asyncio cursor that returns results as Apache Arrow Tables.
Uses asyncio.to_thread() for both result set creation and fetch operations, keeping the event loop free.
Example
>>> async with await pyathena.aio_connect(...) as conn:
...     cursor = conn.cursor(AioArrowCursor)
...     await cursor.execute("SELECT * FROM my_table")
...     table = cursor.as_arrow()
- __init__(s3_staging_dir: str | None = None, schema_name: str | None = None, catalog_name: str | None = None, work_group: str | None = None, poll_interval: float = 1, encryption_option: str | None = None, kms_key: str | None = None, kill_on_interrupt: bool = True, unload: bool = False, result_reuse_enable: bool = False, result_reuse_minutes: int = 60, connect_timeout: float | None = None, request_timeout: float | None = None, **kwargs) None[source]¶
- static get_default_converter(unload: bool = False) DefaultArrowTypeConverter | DefaultArrowUnloadTypeConverter | Any[source]¶
Get the default type converter for this cursor class.
- Parameters:
unload – Whether the converter is for UNLOAD operations. Some cursor types may return different converters for UNLOAD operations.
- Returns:
The default type converter instance for this cursor type.
- async execute(operation: str, parameters: dict[str, Any] | list[str] | None = None, work_group: str | None = None, s3_staging_dir: str | None = None, cache_size: int | None = 0, cache_expiration_time: int | None = 0, result_reuse_enable: bool | None = None, result_reuse_minutes: int | None = None, paramstyle: str | None = None, **kwargs) AioArrowCursor[source]¶
Execute a SQL query asynchronously and return results as Arrow Tables.
- Parameters:
operation – SQL query string to execute.
parameters – Query parameters for parameterized queries.
work_group – Athena workgroup to use for this query.
s3_staging_dir – S3 location for query results.
cache_size – Number of queries to check for result caching.
cache_expiration_time – Cache expiration time in seconds.
result_reuse_enable – Enable Athena result reuse for this query.
result_reuse_minutes – Minutes to reuse cached results.
paramstyle – Parameter style (‘qmark’ or ‘pyformat’).
**kwargs – Additional execution parameters.
- Returns:
Self reference for method chaining.
- async fetchone() tuple[Any | None, ...] | dict[Any, Any | None] | None[source]¶
Fetch the next row of the result set.
Wraps the synchronous fetch in asyncio.to_thread to avoid blocking the event loop.
- Returns:
A tuple representing the next row, or None if no more rows.
- Raises:
ProgrammingError – If no result set is available.
- async fetchmany(size: int | None = None) list[tuple[Any | None, ...] | dict[Any, Any | None]][source]¶
Fetch multiple rows from the result set.
Wraps the synchronous fetch in asyncio.to_thread to avoid blocking the event loop.
- Parameters:
size – Maximum number of rows to fetch. Defaults to arraysize.
- Returns:
List of tuples representing the fetched rows.
- Raises:
ProgrammingError – If no result set is available.
- async fetchall() list[tuple[Any | None, ...] | dict[Any, Any | None]][source]¶
Fetch all remaining rows from the result set.
Wraps the synchronous fetch in asyncio.to_thread to avoid blocking the event loop.
- Returns:
List of tuples representing all remaining rows.
- Raises:
ProgrammingError – If no result set is available.
- as_arrow() Table[source]¶
Return query results as an Apache Arrow Table.
- Returns:
Apache Arrow Table containing all query results.
- as_polars() pl.DataFrame[source]¶
Return query results as a Polars DataFrame.
- Returns:
Polars DataFrame containing all query results.
- DEFAULT_RESULT_REUSE_MINUTES = 60¶
- LIST_DATABASES_MAX_RESULTS = 50¶
- LIST_QUERY_EXECUTIONS_MAX_RESULTS = 50¶
- LIST_TABLE_METADATA_MAX_RESULTS = 50¶
- async cancel() None¶
Cancel the currently executing query.
- Raises:
ProgrammingError – If no query is currently executing.
- property connection: Connection[Any]¶
- async executemany(operation: str, seq_of_parameters: list[dict[str, Any] | list[str] | None], **kwargs) None¶
Execute a SQL query multiple times with different parameters.
- Parameters:
operation – SQL query string to execute.
seq_of_parameters – Sequence of parameter sets, one per execution.
**kwargs – Additional keyword arguments passed to each
execute().
- async get_table_metadata(table_name: str, catalog_name: str | None = None, schema_name: str | None = None, logging_: bool = True) AthenaTableMetadata¶
- async list_databases(catalog_name: str | None, max_results: int | None = None) list[AthenaDatabase]¶
- async list_table_metadata(catalog_name: str | None = None, schema_name: str | None = None, expression: str | None = None, max_results: int | None = None) list[AthenaTableMetadata]¶
- property result_set: AthenaResultSet | None¶
- property rowcount: int¶
Get the number of rows affected by the last operation.
For SELECT statements, this returns -1 as per DB API 2.0 specification. For DML operations (INSERT, UPDATE, DELETE) and CTAS, this returns the number of affected rows.
- Returns:
The number of rows, or -1 if not applicable or unknown.
- setinputsizes(sizes)¶
Does nothing by default
- setoutputsize(size, column=None)¶
Does nothing by default
Aio Polars Cursor¶
- class pyathena.aio.polars.cursor.AioPolarsCursor(s3_staging_dir: str | None = None, schema_name: str | None = None, catalog_name: str | None = None, work_group: str | None = None, poll_interval: float = 1, encryption_option: str | None = None, kms_key: str | None = None, kill_on_interrupt: bool = True, unload: bool = False, result_reuse_enable: bool = False, result_reuse_minutes: int = 60, block_size: int | None = None, cache_type: str | None = None, max_workers: int = 20, chunksize: int | None = None, **kwargs)[source]¶
Native asyncio cursor that returns results as Polars DataFrames.
Uses asyncio.to_thread() for both result set creation and fetch operations, keeping the event loop free. This is especially important when chunksize is set, as fetch calls trigger lazy S3 reads.
Example
>>> async with await pyathena.aio_connect(...) as conn:
...     cursor = conn.cursor(AioPolarsCursor)
...     await cursor.execute("SELECT * FROM my_table")
...     df = cursor.as_polars()
- __init__(s3_staging_dir: str | None = None, schema_name: str | None = None, catalog_name: str | None = None, work_group: str | None = None, poll_interval: float = 1, encryption_option: str | None = None, kms_key: str | None = None, kill_on_interrupt: bool = True, unload: bool = False, result_reuse_enable: bool = False, result_reuse_minutes: int = 60, block_size: int | None = None, cache_type: str | None = None, max_workers: int = 20, chunksize: int | None = None, **kwargs) None[source]¶
- static get_default_converter(unload: bool = False) DefaultPolarsTypeConverter | DefaultPolarsUnloadTypeConverter | Any[source]¶
Get the default type converter for this cursor class.
- Parameters:
unload – Whether the converter is for UNLOAD operations. Some cursor types may return different converters for UNLOAD operations.
- Returns:
The default type converter instance for this cursor type.
- async execute(operation: str, parameters: dict[str, Any] | list[str] | None = None, work_group: str | None = None, s3_staging_dir: str | None = None, cache_size: int | None = 0, cache_expiration_time: int | None = 0, result_reuse_enable: bool | None = None, result_reuse_minutes: int | None = None, paramstyle: str | None = None, **kwargs) AioPolarsCursor[source]¶
Execute a SQL query asynchronously and return results as Polars DataFrames.
- Parameters:
operation – SQL query string to execute.
parameters – Query parameters for parameterized queries.
work_group – Athena workgroup to use for this query.
s3_staging_dir – S3 location for query results.
cache_size – Number of queries to check for result caching.
cache_expiration_time – Cache expiration time in seconds.
result_reuse_enable – Enable Athena result reuse for this query.
result_reuse_minutes – Minutes to reuse cached results.
paramstyle – Parameter style (‘qmark’ or ‘pyformat’).
**kwargs – Additional execution parameters passed to Polars read functions.
- Returns:
Self reference for method chaining.
- async fetchone() tuple[Any | None, ...] | dict[Any, Any | None] | None[source]¶
Fetch the next row of the result set.
Wraps the synchronous fetch in asyncio.to_thread to avoid blocking the event loop when chunksize triggers lazy S3 reads.
- Returns:
A tuple representing the next row, or None if no more rows.
- Raises:
ProgrammingError – If no result set is available.
- async fetchmany(size: int | None = None) list[tuple[Any | None, ...] | dict[Any, Any | None]][source]¶
Fetch multiple rows from the result set.
Wraps the synchronous fetch in asyncio.to_thread to avoid blocking the event loop when chunksize triggers lazy S3 reads.
- Parameters:
size – Maximum number of rows to fetch. Defaults to arraysize.
- Returns:
List of tuples representing the fetched rows.
- Raises:
ProgrammingError – If no result set is available.
- async fetchall() list[tuple[Any | None, ...] | dict[Any, Any | None]][source]¶
Fetch all remaining rows from the result set.
Wraps the synchronous fetch in asyncio.to_thread to avoid blocking the event loop when chunksize triggers lazy S3 reads.
- Returns:
List of tuples representing all remaining rows.
- Raises:
ProgrammingError – If no result set is available.
- as_polars() pl.DataFrame[source]¶
Return query results as a Polars DataFrame.
- Returns:
Polars DataFrame containing all query results.
- as_arrow() Table[source]¶
Return query results as an Apache Arrow Table.
- Returns:
Apache Arrow Table containing all query results.
- DEFAULT_RESULT_REUSE_MINUTES = 60¶
- LIST_DATABASES_MAX_RESULTS = 50¶
- LIST_QUERY_EXECUTIONS_MAX_RESULTS = 50¶
- LIST_TABLE_METADATA_MAX_RESULTS = 50¶
- async cancel() None¶
Cancel the currently executing query.
- Raises:
ProgrammingError – If no query is currently executing.
- property connection: Connection[Any]¶
- async executemany(operation: str, seq_of_parameters: list[dict[str, Any] | list[str] | None], **kwargs) None¶
Execute a SQL query multiple times with different parameters.
- Parameters:
operation – SQL query string to execute.
seq_of_parameters – Sequence of parameter sets, one per execution.
**kwargs – Additional keyword arguments passed to each
execute().
- async get_table_metadata(table_name: str, catalog_name: str | None = None, schema_name: str | None = None, logging_: bool = True) AthenaTableMetadata¶
- async list_databases(catalog_name: str | None, max_results: int | None = None) list[AthenaDatabase]¶
- async list_table_metadata(catalog_name: str | None = None, schema_name: str | None = None, expression: str | None = None, max_results: int | None = None) list[AthenaTableMetadata]¶
- property result_set: AthenaResultSet | None¶
- property rowcount: int¶
Get the number of rows affected by the last operation.
For SELECT statements, this returns -1 as per DB API 2.0 specification. For DML operations (INSERT, UPDATE, DELETE) and CTAS, this returns the number of affected rows.
- Returns:
The number of rows, or -1 if not applicable or unknown.
- setinputsizes(sizes)¶
Does nothing by default
- setoutputsize(size, column=None)¶
Does nothing by default
Aio S3FS Cursor¶
- class pyathena.aio.s3fs.cursor.AioS3FSCursor(s3_staging_dir: str | None = None, schema_name: str | None = None, catalog_name: str | None = None, work_group: str | None = None, poll_interval: float = 1, encryption_option: str | None = None, kms_key: str | None = None, kill_on_interrupt: bool = True, result_reuse_enable: bool = False, result_reuse_minutes: int = 60, csv_reader: type[DefaultCSVReader] | type[AthenaCSVReader] | None = None, **kwargs)[source]¶
Native asyncio cursor that reads CSV results via AioS3FileSystem.
Uses AioS3FileSystem for S3 operations, which replaces ThreadPoolExecutor parallelism with asyncio.gather + asyncio.to_thread. Fetch operations are wrapped in asyncio.to_thread() because CSV reading is blocking I/O.
Example
>>> async with await pyathena.aio_connect(...) as conn:
...     cursor = conn.cursor(AioS3FSCursor)
...     await cursor.execute("SELECT * FROM my_table")
...     row = await cursor.fetchone()
- __init__(s3_staging_dir: str | None = None, schema_name: str | None = None, catalog_name: str | None = None, work_group: str | None = None, poll_interval: float = 1, encryption_option: str | None = None, kms_key: str | None = None, kill_on_interrupt: bool = True, result_reuse_enable: bool = False, result_reuse_minutes: int = 60, csv_reader: type[DefaultCSVReader] | type[AthenaCSVReader] | None = None, **kwargs) None[source]¶
- static get_default_converter(unload: bool = False) DefaultS3FSTypeConverter[source]¶
Get the default type converter for S3FS cursor.
- Parameters:
unload – Unused. S3FS cursor does not support UNLOAD operations.
- Returns:
DefaultS3FSTypeConverter instance.
- async execute(operation: str, parameters: dict[str, Any] | list[str] | None = None, work_group: str | None = None, s3_staging_dir: str | None = None, cache_size: int | None = 0, cache_expiration_time: int | None = 0, result_reuse_enable: bool | None = None, result_reuse_minutes: int | None = None, paramstyle: str | None = None, **kwargs) AioS3FSCursor[source]¶
Execute a SQL query asynchronously via S3FileSystem CSV reader.
- Parameters:
operation – SQL query string to execute.
parameters – Query parameters for parameterized queries.
work_group – Athena workgroup to use for this query.
s3_staging_dir – S3 location for query results.
cache_size – Number of queries to check for result caching.
cache_expiration_time – Cache expiration time in seconds.
result_reuse_enable – Enable Athena result reuse for this query.
result_reuse_minutes – Minutes to reuse cached results.
paramstyle – Parameter style (‘qmark’ or ‘pyformat’).
**kwargs – Additional execution parameters.
- Returns:
Self reference for method chaining.
- async fetchone() tuple[Any | None, ...] | dict[Any, Any | None] | None[source]¶
Fetch the next row of the result set.
Wraps the synchronous fetch in asyncio.to_thread because AthenaS3FSResultSet reads rows lazily from S3.
- Returns:
A tuple representing the next row, or None if no more rows.
- Raises:
ProgrammingError – If no result set is available.
- async fetchmany(size: int | None = None) list[tuple[Any | None, ...] | dict[Any, Any | None]][source]¶
Fetch multiple rows from the result set.
Wraps the synchronous fetch in asyncio.to_thread because AthenaS3FSResultSet reads rows lazily from S3.
- Parameters:
size – Maximum number of rows to fetch. Defaults to arraysize.
- Returns:
List of tuples representing the fetched rows.
- Raises:
ProgrammingError – If no result set is available.
- async fetchall() list[tuple[Any | None, ...] | dict[Any, Any | None]][source]¶
Fetch all remaining rows from the result set.
Wraps the synchronous fetch in asyncio.to_thread because AthenaS3FSResultSet reads rows lazily from S3.
- Returns:
List of tuples representing all remaining rows.
- Raises:
ProgrammingError – If no result set is available.
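The asyncio.gather + asyncio.to_thread combination mentioned above — the replacement for ThreadPoolExecutor parallelism — can be sketched in isolation. This uses a hypothetical read_block function standing in for a blocking S3 ranged read:

```python
import asyncio

# Hypothetical stand-in for a blocking S3 ranged GET; the real code reads
# result-file blocks through AioS3FileSystem.
def read_block(key, offset):
    return f"{key}@{offset}"

async def read_all(key, offsets):
    # Each blocking read runs in a worker thread via asyncio.to_thread;
    # asyncio.gather awaits them concurrently and preserves input order.
    return await asyncio.gather(
        *(asyncio.to_thread(read_block, key, off) for off in offsets)
    )

blocks = asyncio.run(read_all("results.csv", [0, 1024, 2048]))
print(blocks)  # ['results.csv@0', 'results.csv@1024', 'results.csv@2048']
```

Compared with a thread pool, this keeps concurrency control on the event loop: the reads run in parallel threads, but composition, cancellation, and error propagation stay in asyncio.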
- DEFAULT_RESULT_REUSE_MINUTES = 60¶
- LIST_DATABASES_MAX_RESULTS = 50¶
- LIST_QUERY_EXECUTIONS_MAX_RESULTS = 50¶
- LIST_TABLE_METADATA_MAX_RESULTS = 50¶
- async cancel() None¶
Cancel the currently executing query.
- Raises:
ProgrammingError – If no query is currently executing.
- property connection: Connection[Any]¶
- async executemany(operation: str, seq_of_parameters: list[dict[str, Any] | list[str] | None], **kwargs) None¶
Execute a SQL query multiple times with different parameters.
- Parameters:
operation – SQL query string to execute.
seq_of_parameters – Sequence of parameter sets, one per execution.
**kwargs – Additional keyword arguments passed to each
execute().
- async get_table_metadata(table_name: str, catalog_name: str | None = None, schema_name: str | None = None, logging_: bool = True) AthenaTableMetadata¶
- async list_databases(catalog_name: str | None, max_results: int | None = None) list[AthenaDatabase]¶
- async list_table_metadata(catalog_name: str | None = None, schema_name: str | None = None, expression: str | None = None, max_results: int | None = None) list[AthenaTableMetadata]¶
- property result_set: AthenaResultSet | None¶
- property rowcount: int¶
Get the number of rows affected by the last operation.
For SELECT statements, this returns -1 as per DB API 2.0 specification. For DML operations (INSERT, UPDATE, DELETE) and CTAS, this returns the number of affected rows.
- Returns:
The number of rows, or -1 if not applicable or unknown.
- setinputsizes(sizes)¶
Does nothing by default
- setoutputsize(size, column=None)¶
Does nothing by default
Aio Spark Cursor¶
- class pyathena.aio.spark.cursor.AioSparkCursor(session_id: str | None = None, description: str | None = None, engine_configuration: dict[str, Any] | None = None, notebook_version: str | None = None, session_idle_timeout_minutes: int | None = None, **kwargs)[source]¶
Native asyncio cursor for executing PySpark code on Athena.
Overrides post-init I/O methods of SparkBaseCursor with async equivalents. Session management (_exists_session, _start_session, etc.) stays synchronous because __init__ runs inside asyncio.to_thread.
Since SparkBaseCursor.__init__ performs I/O (session management), cursor creation must be wrapped in asyncio.to_thread:
cursor = await asyncio.to_thread(conn.cursor)
Example
>>> import asyncio
>>> async with await pyathena.aio_connect(
...     work_group="spark-workgroup",
...     cursor_class=AioSparkCursor,
... ) as conn:
...     cursor = await asyncio.to_thread(conn.cursor)
...     await cursor.execute("spark.sql('SELECT 1').show()")
...     print(await cursor.get_std_out())
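Why cursor creation itself must go through asyncio.to_thread can be shown in isolation. A minimal sketch with a hypothetical BlockingCursor whose constructor blocks, as SparkBaseCursor.__init__ does while starting an Athena session:

```python
import asyncio
import time

# Hypothetical cursor whose constructor blocks, like SparkBaseCursor.__init__
# performing session-management I/O.
class BlockingCursor:
    def __init__(self):
        time.sleep(0.05)  # stands in for the blocking session setup
        self.session_ready = True

async def main():
    # Calling BlockingCursor() directly would stall the event loop for the
    # whole constructor; asyncio.to_thread moves it to a worker thread.
    cursor = await asyncio.to_thread(BlockingCursor)
    return cursor.session_ready

ready = asyncio.run(main())
print(ready)  # True
```

This is also why session management inside the cursor can remain synchronous: by the time __init__ runs, it is already off the event loop.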
- __init__(session_id: str | None = None, description: str | None = None, engine_configuration: dict[str, Any] | None = None, notebook_version: str | None = None, session_idle_timeout_minutes: int | None = None, **kwargs) None[source]¶
- property calculation_execution: AthenaCalculationExecution | None¶
- async get_std_out() str | None[source]¶
Get the standard output from the Spark calculation execution.
- Returns:
The standard output as a string, or None if no output is available.
- async get_std_error() str | None[source]¶
Get the standard error from the Spark calculation execution.
- Returns:
The standard error as a string, or None if no error output is available.
- async execute(operation: str, parameters: dict[str, Any] | list[str] | None = None, session_id: str | None = None, description: str | None = None, client_request_token: str | None = None, work_group: str | None = None, **kwargs) AioSparkCursor[source]¶
Execute PySpark code asynchronously.
- Parameters:
operation – PySpark code to execute.
parameters – Unused, kept for API compatibility.
session_id – Spark session ID override.
description – Calculation description.
client_request_token – Idempotency token.
work_group – Unused, kept for API compatibility.
**kwargs – Additional parameters.
- Returns:
Self reference for method chaining.
- async cancel() None[source]¶
Cancel the currently running calculation.
- Raises:
ProgrammingError – If no calculation is running.
- async executemany(operation: str, seq_of_parameters: list[dict[str, Any] | list[str] | None], **kwargs) None[source]¶
- LIST_DATABASES_MAX_RESULTS = 50¶
- LIST_QUERY_EXECUTIONS_MAX_RESULTS = 50¶
- LIST_TABLE_METADATA_MAX_RESULTS = 50¶
- property connection: Connection[Any]¶
- static get_default_converter(unload: bool = False) DefaultTypeConverter | Any¶
Get the default type converter for this cursor class.
- Parameters:
unload – Whether the converter is for UNLOAD operations. Some cursor types may return different converters for UNLOAD operations.
- Returns:
The default type converter instance for this cursor type.
- get_table_metadata(table_name: str, catalog_name: str | None = None, schema_name: str | None = None, logging_: bool = True) AthenaTableMetadata¶
- list_table_metadata(catalog_name: str | None = None, schema_name: str | None = None, expression: str | None = None, max_results: int | None = None) list[AthenaTableMetadata]¶
- setinputsizes(sizes)¶
Does nothing by default
- setoutputsize(size, column=None)¶
Does nothing by default