Data Models

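This section covers the model classes in pyathena.model: query and calculation executions, session status, database and table metadata, and file format and compression constants.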

Query Execution

class pyathena.model.AthenaQueryExecution(response: dict[str, Any])[source]

Represents an Athena query execution with status and metadata.

This class encapsulates information about a query execution in Amazon Athena, including its current state, statistics, error information, and result metadata. It’s primarily used internally by PyAthena cursors but can be useful for monitoring and debugging query execution.

Query States:
  • QUEUED: Query is waiting to be executed

  • RUNNING: Query is currently executing

  • SUCCEEDED: Query completed successfully

  • FAILED: Query execution failed

  • CANCELLED: Query was cancelled

Statement Types:
  • DDL: Data Definition Language (CREATE, DROP, ALTER)

  • DML: Data Manipulation Language (SELECT, INSERT, UPDATE, DELETE)

  • UTILITY: Utility statements (SHOW, DESCRIBE, EXPLAIN)

Example

>>> # Typically accessed through cursor execution
>>> cursor.execute("SELECT COUNT(*) FROM my_table")
>>> query_execution = cursor._last_query_execution  # Internal access
>>> print(f"Query ID: {query_execution.query_id}")
>>> print(f"State: {query_execution.state}")
>>> print(f"Data scanned: {query_execution.data_scanned_in_bytes} bytes")

See also

AWS Athena QueryExecution API reference: https://docs.aws.amazon.com/athena/latest/APIReference/API_QueryExecution.html
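
The query states listed above split into in-progress states (QUEUED, RUNNING) and terminal states (SUCCEEDED, FAILED, CANCELLED). As a minimal sketch of how a caller might wait for a terminal state, the helper below polls a caller-supplied function. The names wait_for_terminal_state and fetch_state are hypothetical and are not part of PyAthena; the state strings mirror the STATE_* constant values documented for this class.

```python
import time
from typing import Callable, Optional

# Values copied from the documented STATE_* constants of AthenaQueryExecution.
TERMINAL_STATES = frozenset({"SUCCEEDED", "FAILED", "CANCELLED"})


def wait_for_terminal_state(
    fetch_state: Callable[[], str],
    interval_seconds: float = 1.0,
    max_polls: int = 60,
) -> str:
    """Poll fetch_state() until it reports a terminal state.

    fetch_state is a hypothetical callable provided by the caller; it
    should return the current state string of one query execution.
    """
    state: Optional[str] = None
    for _ in range(max_polls):
        state = fetch_state()
        if state in TERMINAL_STATES:
            return state
        time.sleep(interval_seconds)
    raise TimeoutError(f"still in state {state!r} after {max_polls} polls")


# Simulated state sequence standing in for real API calls.
states = iter(["QUEUED", "RUNNING", "SUCCEEDED"])
final_state = wait_for_terminal_state(lambda: next(states), interval_seconds=0.0)
print(final_state)
```

PyAthena cursors already wait for queries to finish, so a loop like this is mainly relevant when state is inspected outside a cursor.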

STATE_QUEUED: str = 'QUEUED'
STATE_RUNNING: str = 'RUNNING'
STATE_SUCCEEDED: str = 'SUCCEEDED'
STATE_FAILED: str = 'FAILED'
STATE_CANCELLED: str = 'CANCELLED'
STATEMENT_TYPE_DDL: str = 'DDL'
STATEMENT_TYPE_DML: str = 'DML'
STATEMENT_TYPE_UTILITY: str = 'UTILITY'
ENCRYPTION_OPTION_SSE_S3: str = 'SSE_S3'
ENCRYPTION_OPTION_SSE_KMS: str = 'SSE_KMS'
ENCRYPTION_OPTION_CSE_KMS: str = 'CSE_KMS'
ERROR_CATEGORY_SYSTEM: int = 1
ERROR_CATEGORY_USER: int = 2
ERROR_CATEGORY_OTHER: int = 3
S3_ACL_OPTION_BUCKET_OWNER_FULL_CONTROL = 'BUCKET_OWNER_FULL_CONTROL'
__init__(response: dict[str, Any]) → None[source]
property database: str | None
property catalog: str | None
property query_id: str | None
property query: str | None
property statement_type: str | None
property substatement_type: str | None
property work_group: str | None
property execution_parameters: list[str]
property state: str | None
property state_change_reason: str | None
property submission_date_time: datetime | None
property completion_date_time: datetime | None
property error_category: int | None
property error_type: int | None
property retryable: bool | None
property error_message: str | None
property data_scanned_in_bytes: int | None
property engine_execution_time_in_millis: int | None
property query_queue_time_in_millis: int | None
property total_execution_time_in_millis: int | None
property query_planning_time_in_millis: int | None
property service_processing_time_in_millis: int | None
property output_location: str | None
property data_manifest_location: str | None
property reused_previous_result: bool | None
property encryption_option: str | None
property kms_key: str | None
property expected_bucket_owner: str | None
property s3_acl_option: str | None
property selected_engine_version: str | None
property effective_engine_version: str | None
property result_reuse_enabled: bool | None
property result_reuse_minutes: int | None
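
Most of the properties above are typed as optional, so code that reports on a query execution should tolerate None. The following is a sketch only: summarize is a hypothetical helper, and the SimpleNamespace objects are stand-ins with made-up values that expose the same property names as AthenaQueryExecution.

```python
from types import SimpleNamespace


def summarize(execution) -> str:
    """Format a few documented properties, tolerating None values."""
    scanned = execution.data_scanned_in_bytes
    total_ms = execution.total_execution_time_in_millis
    scanned_text = "unknown" if scanned is None else f"{scanned / 1024 ** 2:.2f} MiB"
    time_text = "unknown" if total_ms is None else f"{total_ms / 1000:.1f} s"
    return (
        f"{execution.query_id}: {execution.state}, "
        f"scanned {scanned_text}, took {time_text}"
    )


# Stand-in for a finished query (values are invented for illustration).
finished = SimpleNamespace(
    query_id="example-id",
    state="SUCCEEDED",
    data_scanned_in_bytes=5 * 1024 ** 2,
    total_execution_time_in_millis=1500,
)
# Stand-in for a query whose statistics are not yet available.
queued = SimpleNamespace(
    query_id="queued-id",
    state="QUEUED",
    data_scanned_in_bytes=None,
    total_execution_time_in_millis=None,
)
finished_summary = summarize(finished)
queued_summary = summarize(queued)
print(finished_summary)
print(queued_summary)
```
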
class pyathena.model.AthenaCalculationExecution(response: dict[str, Any])[source]

Represents a complete Athena calculation execution with status and results.

This class extends AthenaCalculationExecutionStatus to include additional information about the calculation execution, including session details, working directory, and result locations in S3.

State and timing attributes are inherited from AthenaCalculationExecutionStatus.

See also

AWS Athena CalculationExecution API reference: https://docs.aws.amazon.com/athena/latest/APIReference/API_CalculationSummary.html

__init__(response: dict[str, Any]) → None[source]
property calculation_id: str | None
property session_id: str | None
property description: str | None
property working_directory: str | None
property std_out_s3_uri: str | None
property std_error_s3_uri: str | None
property result_s3_uri: str | None
property result_type: str | None
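
The std_out_s3_uri, std_error_s3_uri, and result_s3_uri properties hold S3 URIs, and S3 clients generally expect a separate bucket and key. One possible way to split such a URI with the standard library is sketched below. split_s3_uri is a hypothetical helper, not a PyAthena function, and the URI is an invented example.

```python
from typing import Optional
from urllib.parse import urlparse


def split_s3_uri(uri: Optional[str]) -> Optional[tuple[str, str]]:
    """Split 's3://bucket/key' into (bucket, key); pass None through."""
    if uri is None:
        return None
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URI: {uri!r}")
    # The path keeps its leading slash, which is not part of the object key.
    return parsed.netloc, parsed.path.lstrip("/")


location = split_s3_uri("s3://example-bucket/calculations/abc/stdout")
print(location)
```
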
class pyathena.model.AthenaCalculationExecutionStatus(response: dict[str, Any])[source]

Status information for an Athena calculation execution.

This class represents the current state and statistics of a calculation execution in Amazon Athena’s notebook or interactive session environment. It tracks the calculation’s lifecycle from creation through completion.

Calculation States:
  • CREATING: Calculation is being created

  • CREATED: Calculation has been created

  • QUEUED: Calculation is waiting to execute

  • RUNNING: Calculation is currently executing

  • CANCELING: Calculation is being cancelled

  • CANCELED: Calculation was cancelled

  • COMPLETED: Calculation completed successfully

  • FAILED: Calculation execution failed

See also

AWS Athena CalculationExecutionStatus API reference: https://docs.aws.amazon.com/athena/latest/APIReference/API_CalculationStatus.html

STATE_CREATING: str = 'CREATING'
STATE_CREATED: str = 'CREATED'
STATE_QUEUED: str = 'QUEUED'
STATE_RUNNING: str = 'RUNNING'
STATE_CANCELING: str = 'CANCELING'
STATE_CANCELED: str = 'CANCELED'
STATE_COMPLETED: str = 'COMPLETED'
STATE_FAILED: str = 'FAILED'
__init__(response: dict[str, Any]) → None[source]
property state: str | None
property state_change_reason: str | None
property submission_date_time: datetime | None
property completion_date_time: datetime | None
property dpu_execution_in_millis: int | None
property progress: str | None
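
Note that calculation states spell CANCELED with a single L, while query execution states use CANCELLED, so a comparison against the wrong spelling never matches. The sketch below shows a terminal-state check and an elapsed-time calculation based on the documented properties. The helper names and the SimpleNamespace stand-in are illustrative assumptions, not PyAthena APIs.

```python
from datetime import datetime, timedelta, timezone
from types import SimpleNamespace
from typing import Optional

# Values copied from the documented STATE_* constants of
# AthenaCalculationExecutionStatus.
CALCULATION_TERMINAL_STATES = frozenset({"CANCELED", "COMPLETED", "FAILED"})


def elapsed(status) -> Optional[timedelta]:
    """Return completion minus submission time, or None if either is missing."""
    if status.submission_date_time is None or status.completion_date_time is None:
        return None
    return status.completion_date_time - status.submission_date_time


# Stand-in object with invented timestamps.
status = SimpleNamespace(
    state="COMPLETED",
    submission_date_time=datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc),
    completion_date_time=datetime(2024, 1, 1, 12, 1, 30, tzinfo=timezone.utc),
)
is_done = status.state in CALCULATION_TERMINAL_STATES
duration = elapsed(status)
print(is_done, duration)
```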

Session Management

class pyathena.model.AthenaSessionStatus(response: dict[str, Any])[source]

Status information for an Athena interactive session.

This class represents the current state of an interactive session in Amazon Athena, used for notebook and Spark workloads. Sessions provide a persistent environment for running multiple calculations.

Session States:
  • CREATING: Session is being created

  • CREATED: Session has been created

  • IDLE: Session is idle and ready for calculations

  • BUSY: Session is executing a calculation

  • TERMINATING: Session is being terminated

  • TERMINATED: Session has been terminated

  • DEGRADED: Session is in a degraded state

  • FAILED: Session creation or execution failed

STATE_CREATING: str = 'CREATING'
STATE_CREATED: str = 'CREATED'
STATE_IDLE: str = 'IDLE'
STATE_BUSY: str = 'BUSY'
STATE_TERMINATING: str = 'TERMINATING'
STATE_TERMINATED: str = 'TERMINATED'
STATE_DEGRADED: str = 'DEGRADED'
STATE_FAILED: str = 'FAILED'
__init__(response: dict[str, Any]) → None[source]
property session_id: str | None
property state: str | None
property state_change_reason: str | None
property start_date_time: datetime | None
property last_modified_date_time: datetime | None
property end_date_time: datetime | None
property idle_since_date_time: datetime | None
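
Since idle_since_date_time records when a session became idle, it could be used to decide when an unused session should be shut down. The following sketch assumes a hypothetical policy function, should_terminate, and a stand-in object with invented values; neither comes from PyAthena.

```python
from datetime import datetime, timedelta, timezone
from types import SimpleNamespace


def should_terminate(status, now: datetime, max_idle: timedelta) -> bool:
    """Return True when the session has been IDLE for longer than max_idle."""
    if status.state != "IDLE" or status.idle_since_date_time is None:
        return False
    return now - status.idle_since_date_time > max_idle


now = datetime(2024, 1, 1, 12, 30, tzinfo=timezone.utc)
idle_session = SimpleNamespace(
    state="IDLE",
    idle_since_date_time=datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc),
)
busy_session = SimpleNamespace(state="BUSY", idle_since_date_time=None)

terminate_idle = should_terminate(idle_session, now, timedelta(minutes=15))
terminate_busy = should_terminate(busy_session, now, timedelta(minutes=15))
print(terminate_idle, terminate_busy)
```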

Database and Table Metadata

class pyathena.model.AthenaDatabase(response)[source]

Represents an Athena database (schema) and its metadata.

This class encapsulates information about a database in the AWS Glue Data Catalog that is accessible through Amazon Athena. Databases serve as containers for tables and views.

__init__(response)[source]
property name: str | None
property description: str | None
property parameters: dict[str, str]
class pyathena.model.AthenaTableMetadata(response)[source]

Represents comprehensive metadata for an Athena table.

This class contains detailed information about a table in the AWS Glue Data Catalog, including columns, partition keys, storage format, serialization library, and various table properties.

The class provides convenient properties for accessing common table attributes like location, file format, compression, and SerDe configuration.

__init__(response)[source]
property name: str | None
property create_time: datetime | None
property last_access_time: datetime | None
property table_type: str | None
property columns: list[AthenaTableMetadataColumn]
property partition_keys: list[AthenaTableMetadataPartitionKey]
property parameters: dict[str, str]
property comment: str | None
property location: str | None
property input_format: str | None
property output_format: str | None
property row_format: str | None
property file_format: str | None
property serde_serialization_lib: str | None
property compression: str | None
property serde_properties: dict[str, str]
property table_properties: dict[str, str]
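
As an illustration of how these properties might be combined, the sketch below builds a short text description of a table. describe_table is a hypothetical helper, and the SimpleNamespace stand-in uses invented values and plain strings in place of AthenaTableMetadataColumn and AthenaTableMetadataPartitionKey objects.

```python
from types import SimpleNamespace


def describe_table(table) -> str:
    """Summarize a table using only properties documented above."""
    lines = [
        f"table: {table.name}",
        f"location: {table.location or 'n/a'}",
        f"format: {table.file_format or 'n/a'}",
        f"compression: {table.compression or 'n/a'}",
        f"columns: {len(table.columns)} (+{len(table.partition_keys)} partition keys)",
    ]
    return "\n".join(lines)


# Stand-in object; only the list lengths matter for this summary.
table = SimpleNamespace(
    name="events",
    location="s3://example-bucket/events/",
    file_format="parquet",
    compression=None,
    columns=["id", "payload"],
    partition_keys=["dt"],
)
description = describe_table(table)
print(description)
```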

File Formats and Compression

class pyathena.model.AthenaFileFormat[source]

Constants and utilities for Athena supported file formats.

This class provides constants for file formats supported by Amazon Athena and utility methods to check format types. These are commonly used when creating tables or configuring UNLOAD operations.

Supported formats:
  • SEQUENCEFILE: Hadoop SequenceFile format

  • TEXTFILE: Plain text files (default)

  • RCFILE: Record Columnar File format

  • ORC: Optimized Row Columnar format

  • PARQUET: Apache Parquet columnar format

  • AVRO: Apache Avro format

  • ION: Amazon Ion format

Example

>>> from pyathena.model import AthenaFileFormat
>>>
>>> # Check if format is Parquet
>>> if AthenaFileFormat.is_parquet("PARQUET"):
...     print("Using columnar format")
>>>
>>> # Use in UNLOAD operations
>>> format_type = AthenaFileFormat.FILE_FORMAT_PARQUET
>>> sql = f"UNLOAD (...) TO 's3://bucket/path/' WITH (format = '{format_type}')"
>>> cursor.execute(sql)

See also

AWS Documentation on supported file formats: https://docs.aws.amazon.com/athena/latest/ug/supported-serdes.html

FILE_FORMAT_SEQUENCEFILE: str = 'SEQUENCEFILE'
FILE_FORMAT_TEXTFILE: str = 'TEXTFILE'
FILE_FORMAT_RCFILE: str = 'RCFILE'
FILE_FORMAT_ORC: str = 'ORC'
FILE_FORMAT_PARQUET: str = 'PARQUET'
FILE_FORMAT_AVRO: str = 'AVRO'
FILE_FORMAT_ION: str = 'ION'
static is_parquet(value: str) → bool[source]
static is_orc(value: str) → bool[source]
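
When a format name comes from user input or configuration, it may help to validate it before building SQL. The sketch below is a standalone stand-in that mirrors the documented FILE_FORMAT_* values rather than importing AthenaFileFormat. The helper names, and the choice to trim whitespace and ignore case, are assumptions of this sketch and say nothing about how is_parquet or is_orc treat their input.

```python
# Values copied from the documented FILE_FORMAT_* constants.
SUPPORTED_FORMATS = frozenset(
    {"SEQUENCEFILE", "TEXTFILE", "RCFILE", "ORC", "PARQUET", "AVRO", "ION"}
)


def normalize_format(value: str) -> str:
    """Upper-case a format name and reject anything not in the documented list."""
    normalized = value.strip().upper()
    if normalized not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported file format: {value!r}")
    return normalized


def is_parquet_or_orc(value: str) -> bool:
    """Hypothetical check covering the two formats that have is_* helpers."""
    return normalize_format(value) in {"PARQUET", "ORC"}


normalized = normalize_format(" parquet ")
print(normalized, is_parquet_or_orc("orc"), is_parquet_or_orc("AVRO"))
```
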
class pyathena.model.AthenaCompression[source]

Constants and utilities for Athena supported compression formats.

This class provides constants for compression formats supported by Amazon Athena and utility methods to validate compression types. These are commonly used when creating tables, configuring UNLOAD operations, or optimizing data storage.

Supported compression formats:
  • BZIP2: BZIP2 compression

  • DEFLATE: DEFLATE compression

  • GZIP: GZIP compression (most common)

  • LZ4: LZ4 fast compression

  • LZO: LZO compression

  • SNAPPY: Snappy compression (good for Parquet)

  • ZLIB: ZLIB compression

  • ZSTD: Zstandard compression

Example

>>> from pyathena.model import AthenaCompression
>>>
>>> # Validate compression format
>>> if AthenaCompression.is_valid("GZIP"):
...     print("Valid compression format")
>>>
>>> # Use in UNLOAD operations
>>> compression = AthenaCompression.COMPRESSION_GZIP
>>> sql = f"UNLOAD (...) TO 's3://bucket/path/' WITH (compression = '{compression}')"
>>> cursor.execute(sql)

See also

AWS Documentation on compression formats: https://docs.aws.amazon.com/athena/latest/ug/compression-formats.html

Best practices for data compression in Athena: https://docs.aws.amazon.com/athena/latest/ug/compression-support.html

COMPRESSION_BZIP2: str = 'BZIP2'
COMPRESSION_DEFLATE: str = 'DEFLATE'
COMPRESSION_GZIP: str = 'GZIP'
COMPRESSION_LZ4: str = 'LZ4'
COMPRESSION_LZO: str = 'LZO'
COMPRESSION_SNAPPY: str = 'SNAPPY'
COMPRESSION_ZLIB: str = 'ZLIB'
COMPRESSION_ZSTD: str = 'ZSTD'
static is_valid(value: str) → bool[source]
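
The format and compression constants are typically used together in an UNLOAD statement. The sketch below assembles such a statement from validated values. build_unload is a hypothetical helper, the constant sets mirror the values documented above, and the bucket and table names are invented. Not every format and compression pairing is accepted by Athena, so treat this as a starting point only, and do not interpolate untrusted input into SQL this way.

```python
# Values copied from the documented FILE_FORMAT_* and COMPRESSION_* constants.
FILE_FORMATS = frozenset(
    {"SEQUENCEFILE", "TEXTFILE", "RCFILE", "ORC", "PARQUET", "AVRO", "ION"}
)
COMPRESSIONS = frozenset(
    {"BZIP2", "DEFLATE", "GZIP", "LZ4", "LZO", "SNAPPY", "ZLIB", "ZSTD"}
)


def build_unload(
    select_sql: str,
    location: str,
    file_format: str = "PARQUET",
    compression: str = "SNAPPY",
) -> str:
    """Return an UNLOAD statement after checking both option values."""
    fmt = file_format.upper()
    comp = compression.upper()
    if fmt not in FILE_FORMATS:
        raise ValueError(f"unsupported file format: {file_format!r}")
    if comp not in COMPRESSIONS:
        raise ValueError(f"unsupported compression: {compression!r}")
    return (
        f"UNLOAD ({select_sql}) TO '{location}' "
        f"WITH (format = '{fmt}', compression = '{comp}')"
    )


sql = build_unload("SELECT id FROM my_table", "s3://example-bucket/unload/")
print(sql)
```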