DataContext Module

DataContext

class great_expectations.data_context.DataContext(context_root_dir=None, active_environment_name='default', data_asset_name_delimiter='/')

Bases: great_expectations.data_context.data_context.ConfigOnlyDataContext

A DataContext represents a Great Expectations project. It organizes storage and access for expectation suites, datasources, notification settings, and data fixtures.

The DataContext is configured via a yml file stored in a directory called great_expectations; the configuration file as well as managed expectation suites should be stored in version control.

Use the create classmethod to create a new empty config, or instantiate the DataContext by passing the path to an existing data context root directory.

DataContexts use data sources you’re already familiar with. Generators help introspect data stores and data execution frameworks (such as Airflow, NiFi, dbt, or Dagster) to describe and produce batches of data ready for analysis. This enables fetching, validation, profiling, and documentation of your data in a way that is meaningful within your existing infrastructure and work environment.

DataContexts use a datasource-based namespace, where each accessible type of data has a three-part normalized data_asset_name, consisting of datasource/generator/generator_asset.

  • The datasource actually connects to a source of materialized data and returns Great Expectations DataAssets connected to a compute environment and ready for validation.

  • The generator knows how to introspect datasources and produce identifying “batch_kwargs” that define particular slices of data.

  • The generator_asset is a specific name – often a table name or other name familiar to users – that generators can slice into batches.
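The three-part naming scheme above can be illustrated with a small sketch. The helper `split_data_asset_name` and the `NameParts` tuple are hypothetical names introduced here for illustration and are not part of the Great Expectations API; the only detail taken from this page is the default delimiter `'/'` from the class signature.

```python
from collections import namedtuple

# Hypothetical container for the three parts of a normalized name.
NameParts = namedtuple("NameParts", ["datasource", "generator", "generator_asset"])


def split_data_asset_name(name, delimiter="/"):
    """Split 'datasource/generator/generator_asset' into its three parts.

    Illustrative only: raises ValueError unless exactly three
    non-empty parts are present.
    """
    parts = name.split(delimiter)
    if len(parts) != 3 or not all(parts):
        raise ValueError(
            "expected a name of the form datasource%sgenerator%sgenerator_asset, got %r"
            % (delimiter, delimiter, name)
        )
    return NameParts(*parts)


parts = split_data_asset_name("my_db/default/orders")
print(parts.datasource, parts.generator, parts.generator_asset)
```

The sketch only handles fully qualified names; it does not attempt the inference of omitted parts that the DataContext performs.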

An expectation suite is a collection of expectations ready to be applied to a batch of data. Many projects need different expectations evaluated in different contexts (profiling vs. testing; warning vs. error; high vs. low compute; ML model or dashboard), so suites provide a namespace option for selecting which expectations a DataContext returns.

In many simple projects, the datasource or generator name may be omitted and the DataContext will infer the correct name when there is no ambiguity.

Similarly, if no expectation suite name is provided, the DataContext will assume the name “default”.

add_store(store_name, store_config)

Add a new Store to the DataContext and (for convenience) return the instantiated Store object.

Parameters
  • store_name (str) – a key for the new Store in self._stores

  • store_config (dict) – a config for the Store to add

Returns

store (Store)

add_datasource(name, **kwargs)

Add a new datasource to the data context, with configuration provided as kwargs.

Parameters
  • name (str) – the name for the new datasource to add

  • initialize – if False, add the datasource to the config, but do not initialize it. This is useful, for example, when the user needs to debug database connectivity.

  • kwargs (keyword arguments) – the configuration for the new datasource

Note

The type_ parameter is still supported as a way to add a datasource, but support will be removed in a future release. Please update to using class_name instead.

Returns

datasource (Datasource)

classmethod find_context_root_dir()
classmethod find_context_yml_file(search_start_dir=os.getcwd())

Search for the yml file starting here and moving upward.
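An upward search of this kind can be sketched with pathlib. This is a minimal illustration, not the library's implementation: the function name `find_file_upward` is hypothetical, the relative path to look for is passed in by the caller, and the real method may limit how many parent directories it visits.

```python
from pathlib import Path


def find_file_upward(start_dir, relative_path):
    """Return the first existing file `relative_path` found in `start_dir`
    or any of its parent directories, or None if there is none.

    Hypothetical helper: checks start_dir first, then walks up to the
    filesystem root.
    """
    current = Path(start_dir).resolve()
    for directory in [current, *current.parents]:
        candidate = directory / relative_path
        if candidate.is_file():
            return candidate
    return None
```

A caller would pass the directory to start from and the path of the config file relative to a project root, for example a yml file inside a great_expectations directory.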

great_expectations.data_context.util.safe_mmkdir(directory, exist_ok=True)

Simple wrapper, since the exist_ok argument is not available in Python 2.
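A wrapper of this kind typically emulates exist_ok by catching the "already exists" error. The sketch below shows one common way to do that; `safe_mkdir_sketch` is a hypothetical name, and the actual safe_mmkdir may differ in its details.

```python
import errno
import os


def safe_mkdir_sketch(directory, exist_ok=True):
    """Create `directory` (and parents), tolerating an existing directory
    when exist_ok is True. Works without os.makedirs(..., exist_ok=...).
    """
    try:
        os.makedirs(directory)
    except OSError as exc:
        already_there = exc.errno == errno.EEXIST and os.path.isdir(directory)
        if not (exist_ok and already_there):
            # Re-raise anything that is not "directory already exists",
            # or every error when the caller asked for exist_ok=False.
            raise
```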

great_expectations.data_context.util.parse_string_to_data_context_resource_identifier(string, separator='.')
great_expectations.data_context.util.load_class(class_name, module_name)

Dynamically load a class from strings or raise a helpful error.
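Dynamic loading from strings is usually built on importlib. The following is a minimal sketch under that assumption; `load_class_sketch` is a hypothetical name and the error messages are illustrative, not the ones the library produces.

```python
import importlib


def load_class_sketch(class_name, module_name):
    """Import `module_name` and return its attribute `class_name`,
    raising an error that names what could not be found.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            "Could not import module %r while loading class %r: %s"
            % (module_name, class_name, exc)
        )
    try:
        return getattr(module, class_name)
    except AttributeError:
        raise AttributeError(
            "Module %r has no class named %r" % (module_name, class_name)
        )


ordered_dict_cls = load_class_sketch("OrderedDict", "collections")
print(ordered_dict_cls.__name__)
```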

great_expectations.data_context.util.instantiate_class_from_config(config, runtime_config, config_defaults=None)

Build a GE class from configuration dictionaries.
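One plausible way to build an object from configuration dictionaries is sketched below. It assumes, without confirmation from this page, that the config names the target with `class_name` and `module_name` keys, that `config_defaults` fills in keys missing from `config`, and that `runtime_config` supplies extra constructor arguments. The function name is hypothetical and the standard-library `Fraction` class stands in for a GE class.

```python
import importlib


def instantiate_from_config_sketch(config, runtime_config, config_defaults=None):
    """Instantiate the class named in `config`, passing the remaining
    keys plus `runtime_config` as keyword arguments.
    """
    merged = dict(config_defaults or {})
    merged.update(config)  # explicit config wins over defaults
    class_name = merged.pop("class_name")    # assumed key name
    module_name = merged.pop("module_name")  # assumed key name
    cls = getattr(importlib.import_module(module_name), class_name)
    merged.update(runtime_config)  # values only known at run time
    return cls(**merged)


value = instantiate_from_config_sketch(
    config={"class_name": "Fraction", "numerator": 3},
    runtime_config={"denominator": 4},
    config_defaults={"module_name": "fractions"},
)
print(value)
```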

great_expectations.data_context.util.format_dict_for_error_message(dict_)
great_expectations.data_context.util.substitute_config_variable(template_str, config_variables_dict)

This method takes a string, and if it contains a pattern ${SOME_VARIABLE} or $SOME_VARIABLE, returns a string where the pattern is replaced with the value of SOME_VARIABLE, otherwise returns the string unchanged.

If the environment variable SOME_VARIABLE is set, the method uses its value for substitution. If it is not set, the value of SOME_VARIABLE is looked up in the config variables store (file). If it is not found there, the input string is returned as is.
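The lookup order described above (environment first, then the config variables, otherwise unchanged) can be sketched as follows. The function name, the regular expression, and the optional `environ` argument (added so the example does not depend on the real environment) are assumptions for illustration; the library's own pattern may differ.

```python
import os
import re

# Matches ${SOME_VARIABLE} or $SOME_VARIABLE (illustrative pattern).
_VARIABLE = re.compile(r"\$\{([A-Za-z_]\w*)\}|\$([A-Za-z_]\w*)")


def substitute_sketch(template_str, config_variables_dict, environ=None):
    """Replace the first variable pattern in `template_str`, preferring
    the environment over `config_variables_dict`.
    """
    environ = os.environ if environ is None else environ
    match = _VARIABLE.search(template_str)
    if match is None:
        return template_str  # no pattern: return unchanged
    name = match.group(1) or match.group(2)
    if name in environ:
        value = environ[name]
    elif name in config_variables_dict:
        value = config_variables_dict[name]
    else:
        return template_str  # unknown variable: return unchanged
    return template_str[:match.start()] + str(value) + template_str[match.end():]
```

For instance, with `DB_HOST` present in both sources, the environment value is the one substituted.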

Parameters
  • template_str – a string that might or might not be of the form ${SOME_VARIABLE} or $SOME_VARIABLE

  • config_variables_dict – a dictionary of config variables. It is loaded from the config variables store (by default, the “uncommitted/config_variables.yml” file)

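Returns

the string with the variable replaced by its value, or the input string unchanged if no substitution applies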

great_expectations.data_context.util.substitute_all_config_variables(data, replace_variables_dict)

Substitute all config variables of the form ${SOME_VARIABLE} in a dictionary-like config object for their values.

The method traverses the dictionary recursively.

Parameters
  • data – a dictionary-like config object

  • replace_variables_dict – a dictionary of config variables and their values

Returns

a dictionary with all the variables replaced with their values
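The recursive traversal can be sketched as below. This is an illustration under stated assumptions rather than the library's code: the function names are hypothetical, only the ${SOME_VARIABLE} form is handled, values are looked up in the supplied dictionary only, and lists are traversed in addition to dictionaries.

```python
import re

_BRACED = re.compile(r"\$\{([A-Za-z_]\w*)\}")


def _substitute_string(text, variables):
    # Unknown variables are left in place as ${NAME}.
    return _BRACED.sub(
        lambda m: str(variables.get(m.group(1), m.group(0))), text
    )


def substitute_all_sketch(data, variables):
    """Return a copy of `data` with ${NAME} patterns in every string
    value replaced, descending into nested dicts and lists.
    """
    if isinstance(data, dict):
        return {k: substitute_all_sketch(v, variables) for k, v in data.items()}
    if isinstance(data, list):
        return [substitute_all_sketch(v, variables) for v in data]
    if isinstance(data, str):
        return _substitute_string(data, variables)
    return data  # numbers, None, etc. pass through untouched


config = {"db": {"url": "postgresql://${USER}@localhost"}, "tags": ["${USER}", 1]}
print(substitute_all_sketch(config, {"USER": "bob"}))
```

The sketch builds a new structure instead of modifying the input in place, which keeps the original config available for comparison.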