great_expectations.datasource.sqlalchemy_datasource

Module Contents

Classes

SqlAlchemyDatasource(name='default', data_context=None, data_asset_type=None, credentials=None, batch_kwargs_generators=None, **kwargs)

A SqlAlchemyDatasource will provide data_assets by converting batch_kwargs according to the rules described below.

great_expectations.datasource.sqlalchemy_datasource.logger
great_expectations.datasource.sqlalchemy_datasource.sqlalchemy
great_expectations.datasource.sqlalchemy_datasource.datasource_initialization_exceptions
class great_expectations.datasource.sqlalchemy_datasource.SqlAlchemyDatasource(name='default', data_context=None, data_asset_type=None, credentials=None, batch_kwargs_generators=None, **kwargs)

Bases: great_expectations.datasource.LegacyDatasource

A SqlAlchemyDatasource will provide data_assets by converting batch_kwargs using the following rules:
  • if the batch_kwargs include a table key, the datasource will provide a dataset object connected to that table

  • if the batch_kwargs include a query key, the datasource will create a temporary table using that query. The query can be parameterized according to the standard Python Template engine, which uses $parameter, with additional kwargs passed to the get_batch method.
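The query parameterization described above uses Python's standard `string.Template` syntax. A minimal, self-contained sketch of the two batch_kwargs styles and of filling in a `$parameter` placeholder (the dict keys shown are illustrative, not validated against the actual BatchKwargs schema):

```python
from string import Template

# Table-style batch_kwargs: the datasource connects a dataset to this table.
table_kwargs = {"table": "events", "datasource": "my_sqlalchemy_db"}

# Query-style batch_kwargs: the datasource creates a temporary table from
# the query; $start_date follows the standard Python Template engine.
query_kwargs = {
    "query": "SELECT * FROM events WHERE created_at >= '$start_date'",
    "datasource": "my_sqlalchemy_db",
}

# Substitution values of this kind would be passed as kwargs to get_batch;
# here we apply the template directly to show the resulting SQL.
rendered = Template(query_kwargs["query"]).safe_substitute(start_date="2021-01-01")
print(rendered)
```

`safe_substitute` leaves any unmatched `$placeholders` intact rather than raising, which mirrors how a partially parameterized query can be rendered incrementally.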

Feature Maturity

Datasource - PostgreSQL - How-to Guide
Support for using the open-source PostgreSQL database as an external datasource and execution engine.
Maturity: Production
Details:
API Stability: High
Implementation Completeness: Complete
Unit Test Coverage: Complete
Integration Infrastructure/Test Coverage: Complete
Documentation Completeness: Medium (does not have a specific how-to, but easy to use overall)
Bug Risk: Low
Expectation Completeness: Moderate
Datasource - BigQuery - How-to Guide
Use Google BigQuery as an execution engine and external datasource to validate data.
Maturity: Beta
Details:
API Stability: Unstable (table generator inability to work with triple-dotted, temp table usability, init flow calls setup “other”)
Implementation Completeness: Moderate
Unit Test Coverage: Partial (no test coverage for temp table creation)
Integration Infrastructure/Test Coverage: Minimal
Documentation Completeness: Partial (how-to does not cover all cases)
Bug Risk: High (we know of several bugs, including inability to list tables, SQLAlchemy URL incomplete)
Expectation Completeness: Moderate
Datasource - Amazon Redshift - How-to Guide
Use Amazon Redshift as an execution engine and external datasource to validate data.
Maturity: Beta
Details:
API Stability: Moderate (potential metadata/introspection method special handling for performance)
Implementation Completeness: Complete
Unit Test Coverage: Minimal
Integration Infrastructure/Test Coverage: Minimal (none automated)
Documentation Completeness: Moderate
Bug Risk: Moderate
Expectation Completeness: Moderate
Datasource - Snowflake - How-to Guide
Use Snowflake Computing as an execution engine and external datasource to validate data.
Maturity: Production
Details:
API Stability: High
Implementation Completeness: Complete
Unit Test Coverage: Complete
Integration Infrastructure/Test Coverage: Minimal (manual only)
Documentation Completeness: Complete
Bug Risk: Low
Expectation Completeness: Complete
Datasource - Microsoft SQL Server - How-to Guide
Use Microsoft SQL Server as an execution engine and external datasource to validate data.
Maturity: Experimental
Details:
API Stability: High
Implementation Completeness: Moderate
Unit Test Coverage: Minimal (none)
Integration Infrastructure/Test Coverage: Minimal (none)
Documentation Completeness: Minimal
Bug Risk: High
Expectation Completeness: Low (some required queries do not generate properly, such as those related to nullity)
Datasource - MySQL - How-to Guide
Use MySQL as an execution engine and external datasource to validate data.
Maturity: Experimental
Details:
API Stability: Low (no consideration for temp tables)
Implementation Completeness: Low (no consideration for temp tables)
Unit Test Coverage: Minimal (none)
Integration Infrastructure/Test Coverage: Minimal (none)
Documentation Completeness: Minimal (none)
Bug Risk: Unknown
Expectation Completeness: Unknown
Datasource - MariaDB - How-to Guide
Use MariaDB as an execution engine and external datasource to validate data.
Maturity: Experimental
Details:
API Stability: Low (no consideration for temp tables)
Implementation Completeness: Low (no consideration for temp tables)
Unit Test Coverage: Minimal (none)
Integration Infrastructure/Test Coverage: Minimal (none)
Documentation Completeness: Minimal (none)
Bug Risk: Unknown
Expectation Completeness: Unknown
recognized_batch_parameters
classmethod build_configuration(cls, data_asset_type=None, batch_kwargs_generators=None, **kwargs)

Build a full configuration object for a datasource, potentially including generators with defaults.

Parameters
  • data_asset_type – A ClassConfig dictionary

  • batch_kwargs_generators – Generator configuration dictionary

  • **kwargs – Additional kwargs to be part of the datasource constructor’s initialization

Returns

A complete datasource configuration.
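As an illustration of the merge behavior described above, here is a rough, self-contained sketch of assembling such a configuration dict. The key names, the `SqlAlchemyDataset` default, and the function itself are assumptions for illustration, not the exact Great Expectations schema or implementation:

```python
# Illustrative sketch only: mimics the shape of what build_configuration
# returns; keys and the default data_asset_type are assumptions.
def build_configuration_sketch(data_asset_type=None, batch_kwargs_generators=None, **kwargs):
    if data_asset_type is None:
        # ClassConfig-style default (assumed)
        data_asset_type = {"class_name": "SqlAlchemyDataset"}
    # Additional kwargs (e.g. credentials) become part of the configuration.
    configuration = {"data_asset_type": data_asset_type, **kwargs}
    if batch_kwargs_generators is not None:
        configuration["batch_kwargs_generators"] = batch_kwargs_generators
    return configuration

config = build_configuration_sketch(credentials={"url": "postgresql://localhost/mydb"})
```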

_get_sqlalchemy_connection_options(self, **kwargs)
_get_sqlalchemy_key_pair_auth_url(self, drivername, credentials)
get_batch(self, batch_kwargs, batch_parameters=None)

Get a batch of data from the datasource.

Parameters
  • batch_kwargs – the BatchKwargs to use to construct the batch

  • batch_parameters – optional parameters to store as the reference description of the batch. They should reflect the parameters that would have produced the passed BatchKwargs.

Returns

Batch

process_batch_parameters(self, query_parameters=None, limit=None, dataset_options=None)

Use datasource-specific configuration to translate any batch parameters into batch kwargs at the datasource level.

Parameters
  • limit (int) – a parameter all datasources must accept to allow limiting a batch to a smaller number of rows.

  • dataset_options (dict) – a set of kwargs that will be passed to the constructor of a dataset built using these batch_kwargs

Returns

Result will include both parameters passed via argument and configured parameters.

Return type

batch_kwargs
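The translation described above can be sketched as a simple merge of the accepted parameters into a batch_kwargs dict. The merge logic below is an assumption based on the documented behavior, not the actual implementation:

```python
# Illustrative sketch: translate batch parameters into batch_kwargs at the
# datasource level. Keys mirror the documented parameters; the actual
# implementation may apply further datasource-specific configuration.
def process_batch_parameters_sketch(query_parameters=None, limit=None, dataset_options=None):
    batch_kwargs = {}
    if limit is not None:
        batch_kwargs["limit"] = limit  # cap the batch at this many rows
    if query_parameters is not None:
        # Values of this kind would fill $parameter placeholders in a query
        batch_kwargs["query_parameters"] = query_parameters
    if dataset_options is not None:
        # Passed through to the constructor of the resulting dataset
        batch_kwargs["dataset_options"] = dataset_options
    return batch_kwargs

kwargs = process_batch_parameters_sketch(
    query_parameters={"start_date": "2021-01-01"}, limit=100
)
```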