great_expectations.validation_operators

Package Contents

Classes

NoOpAction(data_context)

This is the base class for all actions that act on validation results

OpsgenieAlertAction(data_context, renderer, api_key, region=None, priority='P3', notify_on='failure')

OpsgenieAlertAction creates and sends an Opsgenie alert

PagerdutyAlertAction(data_context, api_key, routing_key, notify_on='failure')

PagerdutyAlertAction sends a PagerDuty event.

SlackNotificationAction(data_context, renderer, slack_webhook, notify_on='all', notify_with=None)

SlackNotificationAction sends a Slack notification to a given webhook.

StoreEvaluationParametersAction(data_context, target_store_name=None)

StoreEvaluationParametersAction extracts evaluation parameters from a validation result and stores them in the store

StoreMetricsAction(data_context, requested_metrics, target_store_name='metrics_store')

StoreMetricsAction extracts metrics from a Validation Result and stores them

StoreValidationResultAction(data_context, target_store_name=None)

StoreValidationResultAction stores a validation result in the ValidationsStore.

UpdateDataDocsAction(data_context, site_names=None, target_site_names=None)

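UpdateDataDocsAction is a validation action that notifies the site builders of all the data docs sites of the Data Context that a validation result should be added to the data docs.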

ValidationAction(data_context)

This is the base class for all actions that act on validation results

ActionListValidationOperator(data_context, action_list, name, result_format={'result_format': 'SUMMARY'})

ActionListValidationOperator validates each batch in its run method’s assets_to_validate argument against the Expectation Suite included within that batch.

ValidationOperator()

The base class of all validation operators.

WarningAndFailureExpectationSuitesValidationOperator(data_context, action_list, name, base_expectation_suite_name=None, expectation_suite_name_suffixes=None, stop_on_first_error=False, slack_webhook=None, notify_on='all', notify_with=None, result_format={'result_format': 'SUMMARY'})

WarningAndFailureExpectationSuitesValidationOperator is a validation operator that accepts a list of batches of data assets (or the information necessary to fetch these batches).

Functions

send_slack_notification(query, slack_webhook)

send_opsgenie_alert(query, suite_name, settings)

Creates an alert in Opsgenie.

class great_expectations.validation_operators.NoOpAction(data_context)

Bases: great_expectations.validation_operators.actions.ValidationAction

This is the base class for all actions that act on validation results and are aware of a Data Context namespace structure.

The Data Context is passed to this class in its constructor.

_run(self, validation_result_suite, validation_result_suite_identifier, data_asset)
class great_expectations.validation_operators.OpsgenieAlertAction(data_context, renderer, api_key, region=None, priority='P3', notify_on='failure')

Bases: great_expectations.validation_operators.actions.ValidationAction

OpsgenieAlertAction creates and sends an Opsgenie alert

Configuration

- name: send_opsgenie_alert_on_validation_result
  action:
    class_name: OpsgenieAlertAction
    # put the actual API key in the uncommitted/config_variables.yml file
    api_key: ${opsgenie_api_key} # Opsgenie API key
    region: # specifies the Opsgenie region. Populate 'EU' for Europe, otherwise leave empty
    priority: P3 # priority of the alert (P1 - P5), defaults to P3
    notify_on: failure # possible values: "all", "failure", "success"
_run(self, validation_result_suite, validation_result_suite_identifier, data_asset=None, payload=None)
class great_expectations.validation_operators.PagerdutyAlertAction(data_context, api_key, routing_key, notify_on='failure')

Bases: great_expectations.validation_operators.actions.ValidationAction

PagerdutyAlertAction sends a PagerDuty event.

Configuration

- name: send_pagerduty_alert_on_validation_result
  action:
    class_name: PagerdutyAlertAction
    api_key: ${pagerduty_api_key} # Events API v2 key
    routing_key: # The 32 character Integration Key for an integration on a service or on a global ruleset.
    notify_on: failure # possible values: "all", "failure", "success"
_run(self, validation_result_suite, validation_result_suite_identifier, data_asset=None, payload=None)
class great_expectations.validation_operators.SlackNotificationAction(data_context, renderer, slack_webhook, notify_on='all', notify_with=None)

Bases: great_expectations.validation_operators.actions.ValidationAction

SlackNotificationAction sends a Slack notification to a given webhook.

Configuration

- name: send_slack_notification_on_validation_result
  action:
    class_name: SlackNotificationAction
    # put the actual webhook URL in the uncommitted/config_variables.yml file
    slack_webhook: ${validation_notification_slack_webhook}
    notify_on: all # possible values: "all", "failure", "success"
    notify_with: # optional list of DataDocs site names to display in Slack message. Defaults to showing all
    renderer:
      # the class that implements the message to be sent
      # this is the default implementation, but you can
      # implement a custom one
      module_name: great_expectations.render.renderer.slack_renderer
      class_name: SlackRenderer
_run(self, validation_result_suite, validation_result_suite_identifier, data_asset=None, payload=None)
class great_expectations.validation_operators.StoreEvaluationParametersAction(data_context, target_store_name=None)

Bases: great_expectations.validation_operators.actions.ValidationAction

StoreEvaluationParametersAction extracts evaluation parameters from a validation result and stores them in the store configured for this action.

Evaluation parameters allow expectations to refer to statistics/metrics computed in the process of validating other prior expectations.

Configuration

- name: store_evaluation_params
  action:
    class_name: StoreEvaluationParametersAction
    # name of the store where the action will store the parameters
    # the name must refer to a store that is configured in the great_expectations.yml file
    target_store_name: evaluation_parameter_store
_run(self, validation_result_suite, validation_result_suite_identifier, data_asset, payload=None)
class great_expectations.validation_operators.StoreMetricsAction(data_context, requested_metrics, target_store_name='metrics_store')

Bases: great_expectations.validation_operators.actions.ValidationAction

StoreMetricsAction extracts metrics from a Validation Result and stores them in a metrics store.

Configuration

- name: store_evaluation_params
  action:
    class_name: StoreMetricsAction
    # name of the store where the action will store the metrics
    # the name must refer to a store that is configured in the great_expectations.yml file
    target_store_name: my_metrics_store
_run(self, validation_result_suite, validation_result_suite_identifier, data_asset, payload=None)
class great_expectations.validation_operators.StoreValidationResultAction(data_context, target_store_name=None)

Bases: great_expectations.validation_operators.actions.ValidationAction

StoreValidationResultAction stores a validation result in the ValidationsStore.

Configuration

- name: store_validation_result
  action:
    class_name: StoreValidationResultAction
    # name of the store where the actions will store validation results
    # the name must refer to a store that is configured in the great_expectations.yml file
    target_store_name: validations_store
_run(self, validation_result_suite, validation_result_suite_identifier, data_asset, payload=None)
class great_expectations.validation_operators.UpdateDataDocsAction(data_context, site_names=None, target_site_names=None)

Bases: great_expectations.validation_operators.actions.ValidationAction

UpdateDataDocsAction is a validation action that notifies the site builders of all the data docs sites of the Data Context that a validation result should be added to the data docs.

Configuration

- name: update_data_docs
  action:
    class_name: UpdateDataDocsAction

You can also instruct UpdateDataDocsAction to build only certain sites by providing a site_names key with a list of sites to update:

- name: update_data_docs
  action:
    class_name: UpdateDataDocsAction
    site_names:
      - production_site

_run(self, validation_result_suite, validation_result_suite_identifier, data_asset, payload=None)
class great_expectations.validation_operators.ValidationAction(data_context)

This is the base class for all actions that act on validation results and are aware of a Data Context namespace structure.

The Data Context is passed to this class in its constructor.

run(self, validation_result_suite, validation_result_suite_identifier, data_asset, **kwargs)
Parameters
  • validation_result_suite

  • validation_result_suite_identifier

  • data_asset

  • kwargs - any additional arguments the child might use

Returns

_run(self, validation_result_suite, validation_result_suite_identifier, data_asset)
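The relationship between run and _run can be pictured with a small stand-in. The sketch below does not import Great Expectations: ValidationActionSketch and CountFailuresAction are hypothetical names that only mirror the signatures listed above, and it assumes that the public run method forwards its arguments to the _run hook that subclasses override. Plain dictionaries stand in for the validation result objects.

```python
class ValidationActionSketch:
    """Hypothetical stand-in mirroring the ValidationAction signatures above."""

    def __init__(self, data_context):
        self.data_context = data_context

    def run(self, validation_result_suite, validation_result_suite_identifier,
            data_asset, **kwargs):
        # Assumption: the public method forwards to the subclass hook.
        return self._run(validation_result_suite,
                         validation_result_suite_identifier,
                         data_asset, **kwargs)

    def _run(self, validation_result_suite, validation_result_suite_identifier,
             data_asset):
        raise NotImplementedError


class CountFailuresAction(ValidationActionSketch):
    """Hypothetical custom action that counts failed expectations."""

    def _run(self, validation_result_suite, validation_result_suite_identifier,
             data_asset, payload=None):
        # A plain dict stands in for the library's validation result object.
        failed = [r for r in validation_result_suite["results"]
                  if not r["success"]]
        return {"failed_expectations": len(failed)}


action = CountFailuresAction(data_context=None)
outcome = action.run(
    {"results": [{"success": True}, {"success": False}]},
    "my_suite_identifier",
    data_asset=None,
)
print(outcome)
```

A real custom action would extend the library's ValidationAction instead of the stand-in and would receive the library's own result and identifier objects.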
great_expectations.validation_operators.logger
great_expectations.validation_operators.send_slack_notification(query, slack_webhook)
great_expectations.validation_operators.send_opsgenie_alert(query, suite_name, settings)

Creates an alert in Opsgenie.

class great_expectations.validation_operators.ActionListValidationOperator(data_context, action_list, name, result_format={'result_format': 'SUMMARY'})

Bases: great_expectations.validation_operators.validation_operators.ValidationOperator

ActionListValidationOperator validates each batch in its run method’s assets_to_validate argument against the Expectation Suite included within that batch.

Then it invokes a list of configured actions on every validation result.

Each action in the list must be an instance of ValidationAction class (or its descendants). The actions included in Great Expectations, and how to configure them, are described in the action class entries of this reference. You can also implement your own actions by extending the base class.

The init command includes this operator in the default configuration file.

Configuration

An instance of ActionListValidationOperator is included in the default configuration file great_expectations.yml that great_expectations init command creates.

perform_action_list_operator:  # this is the name you will use when you invoke the operator
  class_name: ActionListValidationOperator

  # the operator will call the following actions on each validation result
  # you can remove or add actions to this list. See the details in the actions
  # reference
  action_list:
    - name: store_validation_result
      action:
        class_name: StoreValidationResultAction
        target_store_name: validations_store
    - name: send_slack_notification_on_validation_result
      action:
        class_name: SlackNotificationAction
        # put the actual webhook URL in the uncommitted/config_variables.yml file
        slack_webhook: ${validation_notification_slack_webhook}
        notify_on: all # possible values: "all", "failure", "success"
        notify_with: # optional list of DataDocs sites (e.g. local_site or gcs_site) to include in the Slack notification. Defaults to including all configured DataDocs sites.
        renderer:
          module_name: great_expectations.render.renderer.slack_renderer
          class_name: SlackRenderer
    - name: update_data_docs
      action:
        class_name: UpdateDataDocsAction

Invocation

This is an example of invoking an instance of a Validation Operator from Python:

results = context.run_validation_operator(
    assets_to_validate=[batch0, batch1, ...],
    run_id=RunIdentifier(**{
      "run_name": "some_string_that_uniquely_identifies_this_run",
      "run_time": "2020-04-29T10:46:03.197008"  # optional run timestamp, defaults to current UTC datetime
    }),  # you may also pass in a dictionary with run_name and run_time keys
    validation_operator_name="operator_instance_name",
)
  • assets_to_validate - an iterable that specifies the data assets that the operator will validate. The members of the list can be either batches or triples that will allow the operator to fetch the batch: (data_asset_name, expectation_suite_name, batch_kwargs) using this method: get_batch()

  • run_id - pipeline run id of type RunIdentifier, consisting of a run_time (always assumed to be UTC time) and run_name string that is meaningful to you and will help you refer to the result of this operation later

  • validation_operator_name - the name of the configured Validation Operator instance to run (an instance of a class that implements a Validation Operator)
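The comment in the invocation example notes that run_id may also be given as a dictionary with run_name and run_time keys. A minimal sketch of building such a dictionary follows. The helper name make_run_id is hypothetical, and the sketch assumes that an ISO 8601 string is an acceptable run_time, as in the example above.

```python
from datetime import datetime, timezone


def make_run_id(run_name, run_time=None):
    """Build a dictionary with the run_name and run_time keys (hypothetical helper)."""
    if run_time is None:
        # run_time is always assumed to be UTC, so default to the current UTC time.
        run_time = datetime.now(timezone.utc)
    return {"run_name": run_name, "run_time": run_time.isoformat()}


run_id = make_run_id(
    "nightly_load",
    datetime(2020, 4, 29, 10, 46, 3, 197008, tzinfo=timezone.utc),
)
print(run_id)
```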

The run method returns a ValidationOperatorResult object:

{
    "run_id": {"run_time": "20200527T041833.074212Z", "run_name": "my_run_name"},
    "success": True,
    "evaluation_parameters": None,
    "validation_operator_config": {
        "class_name": "ActionListValidationOperator",
        "module_name": "great_expectations.validation_operators",
        "name": "action_list_operator",
        "kwargs": {
            "action_list": [
                {
                    "name": "store_validation_result",
                    "action": {"class_name": "StoreValidationResultAction"},
                },
                {
                    "name": "store_evaluation_params",
                    "action": {"class_name": "StoreEvaluationParametersAction"},
                },
                {
                    "name": "update_data_docs",
                    "action": {"class_name": "UpdateDataDocsAction"},
                },
            ]
        },
    },
    "run_results": {
        ValidationResultIdentifier: {
            "validation_result": ExpectationSuiteValidationResult object,
            "actions_results": {
                "store_validation_result": {},
                "store_evaluation_params": {},
                "update_data_docs": {},
            },
        }
    },
}
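One way to consume a result shaped like the structure above is to walk run_results and collect the entries whose validation did not succeed. The sketch below uses plain dictionaries and string keys as stand-ins for the ValidationOperatorResult, ValidationResultIdentifier and ExpectationSuiteValidationResult objects; the function name failed_validations and the sample data are hypothetical.

```python
def failed_validations(operator_result):
    """Return the run_results keys whose validation result was not successful."""
    return [
        identifier
        for identifier, run_result in operator_result["run_results"].items()
        if not run_result["validation_result"]["success"]
    ]


# Stand-in data following the layout shown above.
sample_result = {
    "success": False,
    "run_results": {
        "orders_batch": {
            "validation_result": {"success": True},
            "actions_results": {"store_validation_result": {}},
        },
        "customers_batch": {
            "validation_result": {"success": False},
            "actions_results": {"store_validation_result": {}},
        },
    },
}

failures = failed_validations(sample_result)
print(failures)
```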
property validation_operator_config(self)

This method builds the config dict of a particular validation operator. The “kwargs” key is what really distinguishes different validation operators.

e.g.:

{
    "class_name": "ActionListValidationOperator",
    "module_name": "great_expectations.validation_operators",
    "name": self.name,
    "kwargs": {
        "action_list": self.action_list
    },
}

{
    "class_name": "WarningAndFailureExpectationSuitesValidationOperator",
    "module_name": "great_expectations.validation_operators",
    "name": self.name,
    "kwargs": {
        "action_list": self.action_list,
        "base_expectation_suite_name": self.base_expectation_suite_name,
        "expectation_suite_name_suffixes": self.expectation_suite_name_suffixes,
        "stop_on_first_error": self.stop_on_first_error,
        "slack_webhook": self.slack_webhook,
        "notify_on": self.notify_on,
    },
}

_build_batch_from_item(self, item)
Internal helper method to take an asset to validate, which can be either:
  1. a DataAsset; or

  2. a tuple of data_asset_name, expectation_suite_name, and batch_kwargs (suitable for passing to get_batch)

Parameters

item – The item to convert to a batch (see above)

Returns

A batch of data
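The two accepted item shapes can be handled with a simple type check. This is a sketch of the idea only, not the library's implementation: build_batch_from_item_sketch and fake_get_batch are hypothetical names, and the loader is passed in as a plain callable so the example runs without a Data Context.

```python
def build_batch_from_item_sketch(item, get_batch):
    """Return item unchanged unless it is a (name, suite, batch_kwargs) triple."""
    if isinstance(item, tuple):
        if len(item) != 3:
            raise ValueError(
                "expected (data_asset_name, expectation_suite_name, batch_kwargs)"
            )
        data_asset_name, expectation_suite_name, batch_kwargs = item
        return get_batch(data_asset_name, expectation_suite_name, batch_kwargs)
    # Anything that is not a tuple is treated as an already-loaded batch.
    return item


def fake_get_batch(data_asset_name, expectation_suite_name, batch_kwargs):
    # Hypothetical loader standing in for the Data Context's get_batch.
    return {
        "asset": data_asset_name,
        "suite": expectation_suite_name,
        "kwargs": batch_kwargs,
    }


batch = build_batch_from_item_sketch(
    ("orders", "orders.warning", {"path": "orders.csv"}), fake_get_batch
)
print(batch)
```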

run(self, assets_to_validate, run_id=None, evaluation_parameters=None, run_name=None, run_time=None, result_format=None)
_run_actions(self, batch, expectation_suite_identifier, expectation_suite, batch_validation_result, run_id)

Runs all actions configured for this operator on the result of validating one batch against one expectation suite.

If an action fails with an exception, the method does not continue.

Parameters
  • batch

  • expectation_suite

  • batch_validation_result

  • run_id

Returns

a dictionary: {action name -> result returned by the action}
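The behaviour described above (results keyed by action name, and no further actions once one raises) can be sketched as follows. The sketch assumes that "does not continue" means the exception propagates to the caller. The function run_actions_sketch and the three toy actions are hypothetical, and plain callables stand in for ValidationAction instances.

```python
def run_actions_sketch(action_list, validation_result):
    """Call each (name, callable) pair in order and collect the results."""
    results = {}
    for name, action in action_list:
        # No try/except here: an exception raised by an action propagates,
        # so the actions after it are never invoked.
        results[name] = action(validation_result)
    return results


calls = []


def store(result):
    return {"stored": True}


def record(result):
    calls.append("record")
    return {}


def broken(result):
    raise RuntimeError("action failed")


results = run_actions_sketch([("store", store), ("record", record)], {"success": True})
print(results)

try:
    run_actions_sketch([("broken", broken), ("record", record)], {"success": True})
except RuntimeError as exc:
    print("stopped:", exc)
```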

class great_expectations.validation_operators.ValidationOperator

The base class of all validation operators.

It defines the signature of the public run method. This method and the validation_operator_config property are the only contract regarding the operators’ API. Everything else is up to the implementors of the validation operator classes that descend from this base class.

property validation_operator_config(self)

This method builds the config dict of a particular validation operator. The “kwargs” key is what really distinguishes different validation operators.

e.g.:

{
    "class_name": "ActionListValidationOperator",
    "module_name": "great_expectations.validation_operators",
    "name": self.name,
    "kwargs": {
        "action_list": self.action_list
    },
}

{
    "class_name": "WarningAndFailureExpectationSuitesValidationOperator",
    "module_name": "great_expectations.validation_operators",
    "name": self.name,
    "kwargs": {
        "action_list": self.action_list,
        "base_expectation_suite_name": self.base_expectation_suite_name,
        "expectation_suite_name_suffixes": self.expectation_suite_name_suffixes,
        "stop_on_first_error": self.stop_on_first_error,
        "slack_webhook": self.slack_webhook,
        "notify_on": self.notify_on,
    },
}

abstract run(self, assets_to_validate, run_id=None, evaluation_parameters=None, run_name=None, run_time=None)
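The two-member contract (an abstract run method plus a validation_operator_config property) can be illustrated with Python's abc module. This is a stand-in under stated assumptions rather than the library's code: ValidationOperatorSketch and PassThroughOperator are hypothetical names, and the return value of run is invented for the demonstration.

```python
from abc import ABC, abstractmethod


class ValidationOperatorSketch(ABC):
    """Hypothetical stand-in for the base class contract described above."""

    @property
    @abstractmethod
    def validation_operator_config(self):
        """Return the config dict describing this operator."""

    @abstractmethod
    def run(self, assets_to_validate, run_id=None, evaluation_parameters=None,
            run_name=None, run_time=None):
        """Validate the given assets and return a result."""


class PassThroughOperator(ValidationOperatorSketch):
    """Hypothetical operator that only counts the assets it receives."""

    def __init__(self, name, action_list):
        self.name = name
        self.action_list = action_list

    @property
    def validation_operator_config(self):
        # The "kwargs" key is what distinguishes one operator from another.
        return {
            "class_name": type(self).__name__,
            "module_name": __name__,
            "name": self.name,
            "kwargs": {"action_list": self.action_list},
        }

    def run(self, assets_to_validate, run_id=None, evaluation_parameters=None,
            run_name=None, run_time=None):
        return {"run_id": run_id, "validated": len(assets_to_validate)}


operator = PassThroughOperator("demo_operator", action_list=[])
config = operator.validation_operator_config
result = operator.run(["batch0", "batch1"], run_id={"run_name": "demo"})
print(config["class_name"], result)
```

Because both members are abstract, instantiating the stand-in base class directly raises TypeError, which is how the contract is enforced in this sketch.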
class great_expectations.validation_operators.WarningAndFailureExpectationSuitesValidationOperator(data_context, action_list, name, base_expectation_suite_name=None, expectation_suite_name_suffixes=None, stop_on_first_error=False, slack_webhook=None, notify_on='all', notify_with=None, result_format={'result_format': 'SUMMARY'})

Bases: great_expectations.validation_operators.validation_operators.ActionListValidationOperator

WarningAndFailureExpectationSuitesValidationOperator is a validation operator that accepts a list of batches of data assets (or the information necessary to fetch these batches). The operator retrieves two expectation suites for each data asset/batch: one containing the critical expectations (“failure”) and the other containing non-critical expectations (“warning”). By default, the operator assumes that the first is called “failure” and the second is called “warning”, but the “base_expectation_suite_name” attribute can be specified in the operator’s configuration to make it search for the “{base_expectation_suite_name}.failure” and “{base_expectation_suite_name}.warning” expectation suites for each data asset.
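The naming rule in the paragraph above amounts to joining the base name and each suffix with a dot. A minimal sketch follows; the helper name suite_names and the example base names are hypothetical.

```python
def suite_names(base_expectation_suite_name, suffixes=("failure", "warning")):
    """Return the "{base}.{suffix}" suite names, in the order of the suffixes."""
    return [f"{base_expectation_suite_name}.{suffix}" for suffix in suffixes]


names = suite_names("orders")
print(names)

custom = suite_names("orders", suffixes=("red", "green"))
print(custom)
```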

The operator validates each batch against its “failure” and “warning” expectation suites and invokes a list of actions on every validation result.

The list of these actions is specified in the operator’s configuration.

Each action in the list must be an instance of ValidationAction class (or its descendants).

The operator sends a Slack notification (if “slack_webhook” is present in its config). The “notify_on” config property controls whether the notification should be sent only in the case of failure (“failure”), only in the case of success (“success”), or always (“all”).
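The three notify_on values map onto a small decision function. The sketch below is an illustration of the rule as described, not the operator's code; should_notify is a hypothetical name, and raising ValueError for an unrecognised value is a choice made for this example.

```python
def should_notify(notify_on, success):
    """Decide whether to send a notification for a run with the given outcome."""
    if notify_on == "all":
        return True
    if notify_on == "success":
        return success
    if notify_on == "failure":
        return not success
    # Choice made for this sketch: reject values other than the three listed.
    raise ValueError(f"unsupported notify_on value: {notify_on!r}")


for setting in ("all", "success", "failure"):
    print(setting, should_notify(setting, True), should_notify(setting, False))
```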

Configuration

Below is an example of this operator’s configuration:

run_warning_and_failure_expectation_suites:
    class_name: WarningAndFailureExpectationSuitesValidationOperator

    # the following two properties are optional - by default the operator looks for
    # expectation suites named "failure" and "warning".
    # You can use these two properties to override these names.
    # e.g., with base_expectation_suite_name: boo and
    # expectation_suite_name_suffixes: ["red", "green"], the operator
    # will look for expectation suites named "boo.red" and "boo.green"
    base_expectation_suite_name: ""
    expectation_suite_name_suffixes: ["failure", "warning"]

    # optional - if true, the operator will stop and exit after first failed validation. false by default.
    stop_on_first_error: false

    # put the actual webhook URL in the uncommitted/config_variables.yml file
    slack_webhook: ${validation_notification_slack_webhook}
    # optional - if "all" - notify always, "success" - notify only on success, "failure" - notify only on failure
    notify_on: all

    # the operator will call the following actions on each validation result
    # you can remove or add actions to this list. See the details in the actions
    # reference
    action_list:
      - name: store_validation_result
        action:
          class_name: StoreValidationResultAction
          target_store_name: validations_store
      - name: store_evaluation_params
        action:
          class_name: StoreEvaluationParametersAction
          target_store_name: evaluation_parameter_store

Invocation

This is an example of invoking an instance of a Validation Operator from Python:

results = context.run_validation_operator(
    assets_to_validate=[batch0, batch1, ...],
    run_id=RunIdentifier(**{
      "run_name": "some_string_that_uniquely_identifies_this_run",
      "run_time": "2020-04-29T10:46:03.197008"  # optional run timestamp, defaults to current UTC datetime
    }),  # you may also pass in a dictionary with run_name and run_time keys
    validation_operator_name="operator_instance_name",
)
  • assets_to_validate - an iterable that specifies the data assets that the operator will validate. The members of the list can be either batches or triples that will allow the operator to fetch the batch: (data_asset_name, expectation_suite_name, batch_kwargs) using this method: get_batch()

  • run_id - pipeline run id of type RunIdentifier, consisting of a run_time (always assumed to be UTC time) and run_name string that is meaningful to you and will help you refer to the result of this operation later

  • validation_operator_name - the name of the configured Validation Operator instance to run (an instance of a class that implements a Validation Operator)

The run method returns a ValidationOperatorResult object.

The value of “success” is True if no critical expectation suites (“failure”) failed to validate. Non-critical (“warning”) expectation suites are allowed to fail without affecting the success status of the run.

{
    "run_id": {"run_time": "20200527T041833.074212Z", "run_name": "my_run_name"},
    "success": True,
    "evaluation_parameters": None,
    "validation_operator_config": {
        "class_name": "WarningAndFailureExpectationSuitesValidationOperator",
        "module_name": "great_expectations.validation_operators",
        "name": "warning_and_failure_operator",
        "kwargs": {
            "action_list": [
                {
                    "name": "store_validation_result",
                    "action": {"class_name": "StoreValidationResultAction"},
                },
                {
                    "name": "store_evaluation_params",
                    "action": {"class_name": "StoreEvaluationParametersAction"},
                },
                {
                    "name": "update_data_docs",
                    "action": {"class_name": "UpdateDataDocsAction"},
                },
            ],
            "base_expectation_suite_name": ...,
            "expectation_suite_name_suffixes": ...,
            "stop_on_first_error": ...,
            "slack_webhook": ...,
            "notify_on": ...,
            "notify_with":...,
        },
    },
    "run_results": {
        ValidationResultIdentifier: {
            "validation_result": ExpectationSuiteValidationResult object,
            "expectation_suite_severity_level": "warning",
            "actions_results": {
                "store_validation_result": {},
                "store_evaluation_params": {},
                "update_data_docs": {},
            },
        }
    }
}
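The success rule stated above can be sketched as a fold over run_results that only looks at entries whose severity level is "failure". Plain dictionaries and string keys stand in for the library's result objects; the function name overall_success and the sample data are hypothetical.

```python
def overall_success(run_results):
    """Return True unless a "failure"-level suite failed to validate."""
    return all(
        entry["validation_result"]["success"]
        for entry in run_results.values()
        if entry["expectation_suite_severity_level"] == "failure"
    )


# Stand-in data: the warning suite fails, the failure suite passes.
sample_run_results = {
    "orders.failure": {
        "validation_result": {"success": True},
        "expectation_suite_severity_level": "failure",
    },
    "orders.warning": {
        "validation_result": {"success": False},
        "expectation_suite_severity_level": "warning",
    },
}

ok = overall_success(sample_run_results)
print(ok)
```

In this sample the run counts as successful even though the warning suite failed, because only "failure"-level entries are inspected.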
property validation_operator_config(self)

This method builds the config dict of a particular validation operator. The “kwargs” key is what really distinguishes different validation operators.

e.g.:

{
    "class_name": "ActionListValidationOperator",
    "module_name": "great_expectations.validation_operators",
    "name": self.name,
    "kwargs": {
        "action_list": self.action_list
    },
}

{
    "class_name": "WarningAndFailureExpectationSuitesValidationOperator",
    "module_name": "great_expectations.validation_operators",
    "name": self.name,
    "kwargs": {
        "action_list": self.action_list,
        "base_expectation_suite_name": self.base_expectation_suite_name,
        "expectation_suite_name_suffixes": self.expectation_suite_name_suffixes,
        "stop_on_first_error": self.stop_on_first_error,
        "slack_webhook": self.slack_webhook,
        "notify_on": self.notify_on,
    },
}

_build_slack_query(self, validation_operator_result: ValidationOperatorResult)
run(self, assets_to_validate, run_id=None, base_expectation_suite_name=None, evaluation_parameters=None, run_name=None, run_time=None, result_format=None)