Standard arguments for expectations

All Expectations return a JSON-serializable dictionary when evaluated, and share four standard (optional) arguments:

  • result_format: controls what information is returned from the evaluation of the expectation.

  • include_config: If true, then the expectation suite itself is returned as part of the result object.

  • catch_exceptions: If true, execution will not fail if the Expectation encounters an error. Instead, it will return success = False and provide an informative error message.

  • meta: allows user-supplied meta-data to be stored with an expectation.

result_format

See result_format for more information.

include_config

All Expectations accept a boolean include_config parameter. If true, then the expectation suite itself is returned as part of the result object:

>> my_df.expect_column_values_to_be_in_set(
    "my_var",
    ['B', 'C', 'D', 'F', 'G', 'H'],
    result_format="COMPLETE",
    include_config=True,
)

{
    'exception_index_list': [0, 10, 11, 12, 13, 14],
    'exception_list': ['A', 'E', 'E', 'E', 'E', 'E'],
    'expectation_type': 'expect_column_values_to_be_in_set',
    'expectation_kwargs': {
        'column': 'my_var',
        'result_format': 'COMPLETE',
        'value_set': ['B', 'C', 'D', 'F', 'G', 'H']
    },
    'success': False
}

catch_exceptions

All Expectations accept a boolean catch_exceptions parameter. If true, execution will not fail if the Expectation encounters an error. Instead, it will return False and (in BASIC and SUMMARY modes) an informative error message:

{
    "result": False,
    "raised_exception": True,
    "exception_traceback": "..."
}

catch_exceptions is on by default in command-line validation mode, and off by default in exploration mode.
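The effect of catch_exceptions can be pictured with a small standalone sketch. It is not the library's implementation: evaluate_with_catch and failing_expectation are hypothetical names introduced here, and the returned keys simply mirror the example above.

```python
import traceback


def evaluate_with_catch(expectation, catch_exceptions=False):
    """Run a zero-argument callable, optionally trapping errors.

    Hypothetical helper for illustration only; not part of Great Expectations.
    """
    try:
        return expectation()
    except Exception:
        if not catch_exceptions:
            # Default exploration-style behavior: let the error propagate.
            raise
        # Report the failure in the result instead of stopping execution.
        return {
            "result": False,
            "raised_exception": True,
            "exception_traceback": traceback.format_exc(),
        }


def failing_expectation():
    # Hypothetical expectation body that always errors.
    raise ValueError("column 'my_var' not found")


outcome = evaluate_with_catch(failing_expectation, catch_exceptions=True)
print(outcome["raised_exception"])
```

With catch_exceptions=False the same call raises the ValueError, which is usually what you want while exploring data interactively.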

meta

All Expectations accept an optional meta parameter. If meta is a valid JSON-serializable dictionary, it will be passed through to the expectation_result object without modification.

>> my_df.expect_column_values_to_be_in_set(
    "my_column",
    ["a", "b", "c"],
    meta={
        "foo": "bar",
        "baz": [1,2,3,4]
    }
)
{
    "success": False,
    "meta": {
        "foo": "bar",
        "baz": [1,2,3,4]
    }
}
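One way to check ahead of time whether a meta value qualifies as a JSON-serializable dictionary is to try encoding it with the standard json module. This is a minimal sketch; is_json_serializable_dict is a hypothetical helper, not a Great Expectations function.

```python
import json


def is_json_serializable_dict(meta):
    """Return True if meta is a dict that json.dumps can encode.

    Hypothetical helper for illustration only.
    """
    if not isinstance(meta, dict):
        return False
    try:
        json.dumps(meta)
    except (TypeError, ValueError):
        # TypeError: unsupported value or key type; ValueError: circular reference.
        return False
    return True


good_meta = {"foo": "bar", "baz": [1, 2, 3, 4]}
bad_meta = {"when": object()}
print(is_json_serializable_dict(good_meta), is_json_serializable_dict(bad_meta))
```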

mostly

mostly is a special argument that is automatically available in all column_map_expectations. mostly must be a float between 0 and 1. Great Expectations interprets it as the minimum fraction of rows that must pass (0.7 means 70 percent), allowing some wiggle room when evaluating expectations: as long as at least that fraction of rows evaluates to True, the expectation returns "success": True.

For example, suppose my_column contains the following values:

[0,1,2,3,4,5,6,7,8,9]

>> my_df.expect_column_values_to_be_between(
    "my_column",
    min_value=0,
    max_value=7
)
{
    "success": False,
    ...
}

>> my_df.expect_column_values_to_be_between(
    "my_column",
    min_value=0,
    max_value=7,
    mostly=0.7
)
{
    "success": True,
    ...
}

Expectations with mostly return exception lists even if they succeed:

>> my_df.expect_column_values_to_be_between(
    "my_column",
    min_value=0,
    max_value=7,
    mostly=0.7
)
{
  "success": true,
  "result": {
    "unexpected_percent": 0.2,
    "partial_unexpected_index_list": [
      8,
      9
    ],
    "partial_unexpected_list": [
      8,
      9
    ],
    "unexpected_percent_nonmissing": 0.2,
    "unexpected_count": 2
  }
}
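The mostly logic in the examples above can be sketched in plain Python. This is an illustration under stated assumptions rather than the library's code: values_between is a hypothetical function, and it ignores missing values, which the real expectations handle separately.

```python
def values_between(values, min_value, max_value, mostly=None):
    """Hypothetical stand-in for a column map expectation that supports mostly."""
    unexpected = [v for v in values if not (min_value <= v <= max_value)]
    total = len(values)
    # Treat an empty column as fully passing to avoid dividing by zero.
    fraction_ok = (total - len(unexpected)) / total if total else 1.0
    # Without mostly, every value must pass.
    threshold = 1.0 if mostly is None else mostly
    return {
        "success": fraction_ok >= threshold,
        "result": {
            "unexpected_count": len(unexpected),
            "partial_unexpected_list": unexpected,
        },
    }


column = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
strict = values_between(column, 0, 7)
lenient = values_between(column, 0, 7, mostly=0.7)
print(strict["success"], lenient["success"])  # prints: False True
print(lenient["result"]["partial_unexpected_list"])  # prints: [8, 9]
```

Eight of the ten values fall between 0 and 7, so the passing fraction is 0.8: below the implicit threshold of 1.0, but above mostly=0.7. The unexpected values are reported in both cases.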

Dataset defaults

The default behavior for result_format, include_config, and catch_exceptions can be overridden at the Dataset level:

my_dataset.set_default_expectation_argument("result_format", "SUMMARY")

In validation mode, they can be overridden using flags:

great_expectations my_dataset.csv my_expectations.json --result_format=BOOLEAN_ONLY --catch_exceptions=False --include_config=True
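To make the Dataset-level override concrete, here is a toy model of how stored defaults might combine with arguments passed to an individual call. ToyDataset and resolve_arguments are hypothetical, and the initial default values are assumptions; only the method name set_default_expectation_argument comes from the text above.

```python
class ToyDataset:
    """Hypothetical stand-in showing how per-dataset defaults could be merged."""

    def __init__(self):
        # Assumed starting values, chosen for illustration only.
        self._defaults = {
            "result_format": "BASIC",
            "include_config": False,
            "catch_exceptions": False,
        }

    def set_default_expectation_argument(self, argument, value):
        if argument not in self._defaults:
            raise KeyError(f"unknown standard argument: {argument}")
        self._defaults[argument] = value

    def resolve_arguments(self, **overrides):
        # Explicit keyword arguments win over dataset-level defaults.
        resolved = dict(self._defaults)
        resolved.update(overrides)
        return resolved


my_dataset = ToyDataset()
my_dataset.set_default_expectation_argument("result_format", "SUMMARY")
print(my_dataset.resolve_arguments(catch_exceptions=True))
```

In this sketch, an argument passed to a single call affects only that call, while a value set with set_default_expectation_argument applies to every later call on the same dataset.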