Data Models

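This section documents the model classes in pyathena.model: query and calculation executions, session status, database and table metadata, and the file format and compression constants.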

Query Execution

class pyathena.model.AthenaQueryExecution(response: dict[str, Any])

Represents an Athena query execution with status and metadata.

This class encapsulates information about a query execution in Amazon Athena, including its current state, statistics, error information, and result metadata. It’s primarily used internally by PyAthena cursors but can be useful for monitoring and debugging query execution.

Query States:

QUEUED: Query is waiting to be executed
RUNNING: Query is currently executing
SUCCEEDED: Query completed successfully
FAILED: Query execution failed
CANCELLED: Query was cancelled
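
The last three states are final: once a query reaches SUCCEEDED, FAILED, or CANCELLED, its state no longer changes. The check can be sketched as below. `TERMINAL_STATES` and `is_terminal` are hypothetical names, not part of PyAthena, and the state strings are written out literally so the snippet runs without PyAthena installed; in real code they would more likely come from the `STATE_*` constants on this class.

```python
from __future__ import annotations

# Hypothetical helper, not part of PyAthena. The strings mirror the
# final states in the list above.
TERMINAL_STATES = frozenset({"SUCCEEDED", "FAILED", "CANCELLED"})


def is_terminal(state: str | None) -> bool:
    """Return True once a query can no longer change state."""
    return state in TERMINAL_STATES


print(is_terminal("RUNNING"))    # False
print(is_terminal("SUCCEEDED"))  # True
```

Passing `None` returns `False`, which matches the `str | None` type of the `state` property.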

Statement Types:

DDL: Data Definition Language (CREATE, DROP, ALTER)
DML: Data Manipulation Language (SELECT, INSERT, UPDATE, DELETE)
UTILITY: Utility statements (SHOW, DESCRIBE, EXPLAIN)

Example

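>>> from pyathena import connect
>>>
>>> # Placeholder staging location and region; substitute your own
>>> cursor = connect(s3_staging_dir="s3://YOUR_S3_BUCKET/path/to/",
...                  region_name="us-west-2").cursor()
>>> # Typically accessed through cursor execution
>>> cursor.execute("SELECT COUNT(*) FROM my_table")
>>> query_execution = cursor._last_query_execution  # Internal access
>>> print(f"Query ID: {query_execution.query_id}")
>>> print(f"State: {query_execution.state}")
>>> print(f"Data scanned: {query_execution.data_scanned_in_bytes} bytes")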

See also

AWS Athena QueryExecution API reference: https://docs.aws.amazon.com/athena/latest/APIReference/API_QueryExecution.html

STATE_QUEUED: str = 'QUEUED'

STATE_RUNNING: str = 'RUNNING'

STATE_SUCCEEDED: str = 'SUCCEEDED'

STATE_FAILED: str = 'FAILED'

STATE_CANCELLED: str = 'CANCELLED'

STATEMENT_TYPE_DDL: str = 'DDL'

STATEMENT_TYPE_DML: str = 'DML'

STATEMENT_TYPE_UTILITY: str = 'UTILITY'

ENCRYPTION_OPTION_SSE_S3: str = 'SSE_S3'

ENCRYPTION_OPTION_SSE_KMS: str = 'SSE_KMS'

ENCRYPTION_OPTION_CSE_KMS: str = 'CSE_KMS'

ERROR_CATEGORY_SYSTEM: int = 1

ERROR_CATEGORY_USER: int = 2

ERROR_CATEGORY_OTHER: int = 3

S3_ACL_OPTION_BUCKET_OWNER_FULL_CONTROL = 'BUCKET_OWNER_FULL_CONTROL'

__init__(response: dict[str, Any]) → None

property database: str | None

property catalog: str | None

property query_id: str | None

property query: str | None

property statement_type: str | None

property substatement_type: str | None

property work_group: str | None

property execution_parameters: list[str]

property state: str | None

property state_change_reason: str | None

property submission_date_time: datetime | None

property completion_date_time: datetime | None

property error_category: int | None

property error_type: int | None

property retryable: bool | None

property error_message: str | None
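
Because `error_category` is an integer, log output is easier to read when the value is translated to a label. The sketch below does this with a plain dictionary. `describe_error_category` is a hypothetical helper, the integer values are copied from the `ERROR_CATEGORY_*` constants, and the labels are an assumption derived from the constant names.

```python
from __future__ import annotations

# Values copied from the ERROR_CATEGORY_* constants; the labels are an
# assumption derived from the constant names.
ERROR_CATEGORY_LABELS = {1: "system", 2: "user", 3: "other"}


def describe_error_category(category: int | None) -> str:
    """Map an error_category value to a readable label."""
    if category is None:
        return "none"  # no error category was reported
    return ERROR_CATEGORY_LABELS.get(category, f"unknown ({category})")


print(describe_error_category(2))  # user
```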

property data_scanned_in_bytes: int | None

property engine_execution_time_in_millis: int | None

property query_queue_time_in_millis: int | None

property total_execution_time_in_millis: int | None

property query_planning_time_in_millis: int | None

property service_pre_processing_time_in_millis: int | None

property service_processing_time_in_millis: int | None
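
The size and timing statistics are all typed `int | None`, so reporting code should tolerate missing values. A minimal sketch, assuming only the property names listed here: `summarize_stats` is a hypothetical helper, and the `SimpleNamespace` object stands in for a real `AthenaQueryExecution` so the snippet runs without an AWS connection.

```python
from __future__ import annotations

from types import SimpleNamespace
from typing import Any


def summarize_stats(execution: Any) -> str:
    """Format scan size and total time, tolerating None values."""
    scanned = execution.data_scanned_in_bytes
    total_ms = execution.total_execution_time_in_millis
    size = "n/a" if scanned is None else f"{scanned / 1024 ** 2:.2f} MiB"
    elapsed = "n/a" if total_ms is None else f"{total_ms / 1000:.1f} s"
    return f"scanned {size} in {elapsed}"


# Stand-in object exposing two of the documented property names.
fake = SimpleNamespace(
    data_scanned_in_bytes=5 * 1024 ** 2,
    total_execution_time_in_millis=None,
)
print(summarize_stats(fake))  # scanned 5.00 MiB in n/a
```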

property dpu_count: float | None

property output_location: str | None

property data_manifest_location: str | None

property reused_previous_result: bool | None

property encryption_option: str | None

property kms_key: str | None

property expected_bucket_owner: str | None

property s3_acl_option: str | None

property selected_engine_version: str | None

property effective_engine_version: str | None

property result_reuse_enabled: bool | None

property result_reuse_minutes: int | None

property managed_query_results_enabled: bool | None

property managed_query_results_kms_key: str | None

property enable_s3_access_grants: bool | None

property create_user_level_prefix: bool | None

property s3_access_grants_authentication_type: str | None

class pyathena.model.AthenaCalculationExecution(response: dict[str, Any])

Represents a complete Athena calculation execution with status and results.

This class extends AthenaCalculationExecutionStatus to include additional information about the calculation execution, including session details, working directory, and result locations in S3.

State and timing attributes are inherited from AthenaCalculationExecutionStatus.

See also

AWS Athena CalculationExecution API reference: https://docs.aws.amazon.com/athena/latest/APIReference/API_CalculationSummary.html

__init__(response: dict[str, Any]) → None

property calculation_id: str | None

property session_id: str | None

property description: str | None

property working_directory: str | None

property std_out_s3_uri: str | None

property std_error_s3_uri: str | None

property result_s3_uri: str | None

property result_type: str | None

class pyathena.model.AthenaCalculationExecutionStatus(response: dict[str, Any])

Status information for an Athena calculation execution.

This class represents the current state and statistics of a calculation execution in Amazon Athena’s notebook or interactive session environment. It tracks the calculation’s lifecycle from creation through completion.

Calculation States:

CREATING: Calculation is being created
CREATED: Calculation has been created
QUEUED: Calculation is waiting to execute
RUNNING: Calculation is currently executing
CANCELING: Calculation is being cancelled
CANCELED: Calculation was cancelled
COMPLETED: Calculation completed successfully
FAILED: Calculation execution failed
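
A calculation moves through these states until it reaches CANCELED, COMPLETED, or FAILED. The sketch below shows one way to wait for that point with a simple polling loop. It is illustrative only and does not describe how PyAthena itself polls: `wait_for_calculation`, `fetch_state`, `interval`, and `max_polls` are all hypothetical names, and `fetch_state` is any callable that returns the current state string.

```python
from __future__ import annotations

import time
from typing import Callable

# State strings mirror the list above; these three are final.
TERMINAL_STATES = frozenset({"CANCELED", "COMPLETED", "FAILED"})


def wait_for_calculation(
    fetch_state: Callable[[], str | None],
    interval: float = 1.0,
    max_polls: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> str | None:
    """Poll fetch_state until a final state or max_polls is reached."""
    state = None
    for _ in range(max_polls):
        state = fetch_state()
        if state in TERMINAL_STATES:
            break
        sleep(interval)
    return state


# Simulated lifecycle; a real fetch_state would query Athena instead.
states = iter(["CREATING", "QUEUED", "RUNNING", "COMPLETED"])
print(wait_for_calculation(lambda: next(states), sleep=lambda _: None))
# COMPLETED
```

Injecting `sleep` keeps the loop testable without real delays. If `max_polls` runs out first, the function returns the last non-final state it saw, so callers should still check the result.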

See also

AWS Athena CalculationExecutionStatus API reference: https://docs.aws.amazon.com/athena/latest/APIReference/API_CalculationStatus.html

STATE_CREATING: str = 'CREATING'

STATE_CREATED: str = 'CREATED'

STATE_QUEUED: str = 'QUEUED'

STATE_RUNNING: str = 'RUNNING'

STATE_CANCELING: str = 'CANCELING'

STATE_CANCELED: str = 'CANCELED'

STATE_COMPLETED: str = 'COMPLETED'

STATE_FAILED: str = 'FAILED'

__init__(response: dict[str, Any]) → None

property state: str | None

property state_change_reason: str | None

property submission_date_time: datetime | None

property completion_date_time: datetime | None

property dpu_execution_in_millis: int | None

property progress: str | None

Session Management

class pyathena.model.AthenaSessionStatus(response: dict[str, Any])

Status information for an Athena interactive session.

This class represents the current state of an interactive session in Amazon Athena, used for notebook and Spark workloads. Sessions provide a persistent environment for running multiple calculations.

Session States:

CREATING: Session is being created
CREATED: Session has been created
IDLE: Session is idle and ready for calculations
BUSY: Session is executing a calculation
TERMINATING: Session is being terminated
TERMINATED: Session has been terminated
DEGRADED: Session is in a degraded state
FAILED: Session creation or execution failed
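
Of these states, only IDLE is described as ready for calculations. The sketch below encodes that check and groups the remaining states into rough lifecycle phases. The grouping and both function names are hypothetical, chosen here for illustration; they are not part of PyAthena or the Athena API.

```python
from __future__ import annotations

# Hypothetical grouping of the session states listed above.
SESSION_PHASES = {
    "CREATING": "starting",
    "CREATED": "starting",
    "IDLE": "active",
    "BUSY": "active",
    "TERMINATING": "ending",
    "TERMINATED": "ending",
    "DEGRADED": "unhealthy",
    "FAILED": "unhealthy",
}


def is_ready_for_calculation(state: str | None) -> bool:
    """Only IDLE is described above as ready for calculations."""
    return state == "IDLE"


def session_phase(state: str | None) -> str:
    """Return the coarse phase for a state, or 'unknown'."""
    return SESSION_PHASES.get(state or "", "unknown")


print(is_ready_for_calculation("BUSY"))  # False
print(session_phase("TERMINATING"))      # ending
```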

See also

AWS Athena Session API reference: https://docs.aws.amazon.com/athena/latest/APIReference/API_SessionStatus.html

STATE_CREATING: str = 'CREATING'

STATE_CREATED: str = 'CREATED'

STATE_IDLE: str = 'IDLE'

STATE_BUSY: str = 'BUSY'

STATE_TERMINATING: str = 'TERMINATING'

STATE_TERMINATED: str = 'TERMINATED'

STATE_DEGRADED: str = 'DEGRADED'

STATE_FAILED: str = 'FAILED'

__init__(response: dict[str, Any]) → None

property session_id: str | None

property state: str | None

property state_change_reason: str | None

property start_date_time: datetime | None

property last_modified_date_time: datetime | None

property end_date_time: datetime | None

property idle_since_date_time: datetime | None

Database and Table Metadata

class pyathena.model.AthenaDatabase(response)

Represents an Athena database (schema) and its metadata.

This class encapsulates information about a database in the AWS Glue Data Catalog that is accessible through Amazon Athena. Databases serve as containers for tables and views.

See also

AWS Athena Database API reference: https://docs.aws.amazon.com/athena/latest/APIReference/API_Database.html

__init__(response)

property name: str | None

property description: str | None

property parameters: dict[str, str]

class pyathena.model.AthenaTableMetadata(response)

Represents comprehensive metadata for an Athena table.

This class contains detailed information about a table in the AWS Glue Data Catalog, including columns, partition keys, storage format, serialization library, and various table properties.

The class provides convenient properties for accessing common table attributes like location, file format, compression, and SerDe configuration.
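
As a rough illustration of how these convenience properties might be combined, the sketch below builds a one-line table summary. `describe_table` is a hypothetical helper that relies only on property names documented for this class, and the `SimpleNamespace` object with made-up values stands in for a real `AthenaTableMetadata` instance.

```python
from __future__ import annotations

from types import SimpleNamespace
from typing import Any


def describe_table(meta: Any) -> str:
    """Build a one-line summary, tolerating None properties."""
    name = meta.name or "<unnamed>"
    fmt = meta.file_format or "n/a"
    codec = meta.compression or "n/a"
    location = meta.location or "n/a"
    return (
        f"{name}: columns={len(meta.columns)}, "
        f"partition_keys={len(meta.partition_keys)}, "
        f"format={fmt}, compression={codec}, location={location}"
    )


# Stand-in with made-up values; real column and partition key entries
# are model objects, not plain strings.
fake = SimpleNamespace(
    name="events",
    columns=["id", "payload"],
    partition_keys=["dt"],
    file_format="PARQUET",
    compression=None,
    location="s3://example-bucket/events/",
)
print(describe_table(fake))
```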

See also

AWS Athena TableMetadata API reference: https://docs.aws.amazon.com/athena/latest/APIReference/API_TableMetadata.html

__init__(response)

property name: str | None

property create_time: datetime | None

property last_access_time: datetime | None

property table_type: str | None

property columns: list[AthenaTableMetadataColumn]

property partition_keys: list[AthenaTableMetadataPartitionKey]

property parameters: dict[str, str]

property comment: str | None

property location: str | None

property input_format: str | None

property output_format: str | None

property row_format: str | None

property file_format: str | None

property serde_serialization_lib: str | None

property compression: str | None

property serde_properties: dict[str, str]

property table_properties: dict[str, str]

File Formats and Compression

class pyathena.model.AthenaFileFormat

Constants and utilities for Athena supported file formats.

This class provides constants for file formats supported by Amazon Athena and utility methods to check format types. These are commonly used when creating tables or configuring UNLOAD operations.

Supported formats:

SEQUENCEFILE: Hadoop SequenceFile format
TEXTFILE: Plain text files (default)
RCFILE: Record Columnar File format
ORC: Optimized Row Columnar format
PARQUET: Apache Parquet columnar format
AVRO: Apache Avro format
ION: Amazon Ion format

Example

>>> from pyathena.model import AthenaFileFormat
>>>
>>> # Check if format is Parquet
>>> if AthenaFileFormat.is_parquet("PARQUET"):
...     print("Using columnar format")
>>>
>>> # Use in UNLOAD operations
>>> format_type = AthenaFileFormat.FILE_FORMAT_PARQUET
>>> sql = f"UNLOAD (...) TO 's3://bucket/path/' WITH (format = '{format_type}')"
>>> cursor.execute(sql)

See also

AWS Documentation on supported file formats: https://docs.aws.amazon.com/athena/latest/ug/supported-serdes.html

FILE_FORMAT_SEQUENCEFILE: str = 'SEQUENCEFILE'

FILE_FORMAT_TEXTFILE: str = 'TEXTFILE'

FILE_FORMAT_RCFILE: str = 'RCFILE'

FILE_FORMAT_ORC: str = 'ORC'

FILE_FORMAT_PARQUET: str = 'PARQUET'

FILE_FORMAT_AVRO: str = 'AVRO'

FILE_FORMAT_ION: str = 'ION'

static is_parquet(value: str) → bool

static is_orc(value: str) → bool

class pyathena.model.AthenaCompression

Constants and utilities for Athena supported compression formats.

This class provides constants for compression formats supported by Amazon Athena and utility methods to validate compression types. These are commonly used when creating tables, configuring UNLOAD operations, or optimizing data storage.

Supported compression formats:

BZIP2: BZIP2 compression
DEFLATE: DEFLATE compression
GZIP: GZIP compression (most common)
LZ4: LZ4 fast compression
LZO: LZO compression
SNAPPY: Snappy compression (good for Parquet)
ZLIB: ZLIB compression
ZSTD: Zstandard compression

Example

>>> from pyathena.model import AthenaCompression
>>>
>>> # Validate compression format
>>> if AthenaCompression.is_valid("GZIP"):
...     print("Valid compression format")
>>>
>>> # Use in UNLOAD operations
>>> compression = AthenaCompression.COMPRESSION_GZIP
>>> sql = f"UNLOAD (...) TO 's3://bucket/path/' WITH (compression = '{compression}')"
>>> cursor.execute(sql)

See also

AWS Documentation on compression formats: https://docs.aws.amazon.com/athena/latest/ug/compression-formats.html

Best practices for data compression in Athena: https://docs.aws.amazon.com/athena/latest/ug/compression-support.html

COMPRESSION_BZIP2: str = 'BZIP2'

COMPRESSION_DEFLATE: str = 'DEFLATE'

COMPRESSION_GZIP: str = 'GZIP'

COMPRESSION_LZ4: str = 'LZ4'

COMPRESSION_LZO: str = 'LZO'

COMPRESSION_SNAPPY: str = 'SNAPPY'

COMPRESSION_ZLIB: str = 'ZLIB'

COMPRESSION_ZSTD: str = 'ZSTD'

static is_valid(value: str) → bool