Version: 1.3.3

Choose result format

When you validate data with GX Core you can set the level of detail returned in your Validation Results by specifying a value for the optional result_format parameter. These settings will be applied to the results returned by each validated Expectation.

Typical use cases for customizing Result Format settings include summarizing values that cause Expectations to fail during data exploration, retrieving failed rows to facilitate cleaning data, or excluding excess Validation Result data in published Data Docs.

Define a Result Format configuration

The result_format parameter takes a dictionary of configuration settings.

  1. Create a dictionary and set the verbosity of returned Validation Results.

    Set the verbosity of your Validation Results as the value of the "result_format" key in your Result Format dictionary. In order from least to most detailed, the valid values for the "result_format" key are:

    • "BOOLEAN_ONLY"
    • "BASIC"
    • "SUMMARY"
    • "COMPLETE".

    The default verbosity level of Validation Results generated by Expectations is "SUMMARY".

    The example below covers the "BASIC" Result Format and the information returned at that level of verbosity:

    When the result_format is set to "BASIC", the Validation Results of each Expectation include a result dictionary with a basic explanation of why the Expectation failed or succeeded. The format is intended for quick feedback and works well in Jupyter Notebooks.

    You can check the Validation Results reference tables to see what information is provided in the result dictionary.

    To create a "BASIC" result format configuration use the following code:

    Python
    basic_result_format_dict = {"result_format": "BASIC"}
  2. Optional. Specify configurations for additional settings available to the base result_format.

    Once you have defined the base configuration in your result_format key, you can further tailor the format of your Validation Results by defining additional key/value pairs in your Result Format dictionary.

    Reference the table below for valid keys and how they influence the format of generated Validation Results:

    Dictionary key | Purpose
    "unexpected_index_column_names" | Defines the columns that can be used to identify unexpected results. For example, primary key (PK) column(s) or other columns with unique identifiers. Supports multiple column names as a list.
    "return_unexpected_index_query" | When running validations, a query (or a set of indices) is returned that allows you to retrieve the full set of unexpected results as well as the values of the identifying columns specified in "unexpected_index_column_names". Setting this value to False suppresses the output (default is True).
    "partial_unexpected_count" | Sets the number of results to include in "partial_unexpected_list". Set the value to zero to suppress the unexpected counts.
  3. Apply the Result Format to a Checkpoint, Validation Definition, or Batch.

    You can define a persisting Result Format configuration by passing it in as the result_format parameter when a Checkpoint is created. The Result Format will be applied every time the Checkpoint is run. For more information on creating a Checkpoint see Create a Checkpoint with Actions.

    You can also pass a result_format configuration at runtime to the .run(...) method of a Validation Definition or to the .validate(...) method of a Batch. This result_format configuration does not persist with the Validation Definition or Batch and will apply to only the current execution of the .run(...) or .validate(...) method. For more information see Run a Validation Definition or Test an Expectation.
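
The steps above can be sketched in plain Python. The helper build_result_format and the column name "event_id" are hypothetical and are not part of GX Core. The helper only checks the dictionary against the verbosity levels and optional keys listed on this page before returning it.

```python
# Verbosity levels listed above, from least to most detailed.
VERBOSITY_LEVELS = ("BOOLEAN_ONLY", "BASIC", "SUMMARY", "COMPLETE")

# Optional keys from the table in step 2.
OPTIONAL_KEYS = {
    "unexpected_index_column_names",
    "return_unexpected_index_query",
    "partial_unexpected_count",
}


def build_result_format(verbosity="SUMMARY", **options):
    """Hypothetical helper: return a Result Format dictionary after basic checks."""
    if verbosity not in VERBOSITY_LEVELS:
        raise ValueError(f"Unknown verbosity level: {verbosity!r}")
    unknown = set(options) - OPTIONAL_KEYS
    if unknown:
        raise ValueError(f"Unknown Result Format keys: {sorted(unknown)}")
    return {"result_format": verbosity, **options}


# "event_id" is a placeholder for a column that uniquely identifies rows.
complete_result_format_dict = build_result_format(
    "COMPLETE",
    unexpected_index_column_names=["event_id"],
    return_unexpected_index_query=True,
    partial_unexpected_count=10,
)
print(complete_result_format_dict)

# With a Batch, Expectation, or Validation Definition already defined
# (none are created in this sketch), the dictionary would be passed as the
# result_format parameter, for example:
#   batch.validate(expectation, result_format=complete_result_format_dict)
#   validation_definition.run(result_format=complete_result_format_dict)
```

Building the dictionary in one place makes it easy to reuse the same settings for a persisted Checkpoint and for one-off runtime calls.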

Validation Results reference tables

The following table lists the fields that can be found in the result dictionary of a Validation Result, and what information is provided by that field.

Field within result | Value
element_count | The total number of values in the column.
missing_count | The number of missing values in the column.
missing_percent | The percent of rows that are missing a value for the column.
unexpected_count | The total count of unexpected values in a column.
unexpected_percent | The overall percent of unexpected values in a column.
unexpected_percent_nonmissing | The percent of unexpected values in a column, excluding rows that have no value for that column.
observed_value | The aggregate statistic computed for the column. This only applies to Expectations that pertain to the aggregate value of a column, rather than the individual values in each row for the column.
partial_unexpected_list | A partial list of values that violate the Expectation. (Up to 20 values by default.)
partial_unexpected_index_list | A partial list of the indices of the unexpected values in the column, as defined by the columns in unexpected_index_column_names. (Up to 20 indices by default.)
partial_unexpected_counts | A partial list of values and counts, showing the number of times each of the unexpected values occurs. (Up to 20 unexpected value/count pairs by default.)
unexpected_index_list | A list of the indices of the unexpected values in the column, as defined by the columns in unexpected_index_column_names. This only applies to Expectations that have a yes/no answer for each row.
unexpected_index_query | A query that can be used to retrieve all unexpected values (SQL and Spark), or the full list of unexpected indices (Pandas). This only applies to Expectations that have a yes/no answer for each row.
unexpected_list | A list of up to 200 values that violate the Expectation.
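
As a rough illustration of reading these fields, the sketch below summarizes a result dictionary. The helper summarize_result is hypothetical, and sample_result is a hand-written dictionary rather than real GX Core output. The helper checks for each field before using it, on the assumption that the fields present vary with the verbosity level and the type of Expectation.

```python
def summarize_result(result):
    """Hypothetical helper: describe the result dictionary of a Validation Result."""
    lines = []
    # observed_value only applies to Expectations about an aggregate statistic.
    if "observed_value" in result:
        lines.append(f"observed value: {result['observed_value']}")
    if "element_count" in result:
        lines.append(f"values checked: {result['element_count']}")
    if "unexpected_count" in result:
        lines.append(f"unexpected values: {result['unexpected_count']}")
    # partial_unexpected_list may be missing or empty, so use .get().
    sample = result.get("partial_unexpected_list")
    if sample:
        lines.append(f"sample of unexpected values: {sample}")
    return "\n".join(lines) if lines else "no detail returned"


# Hand-written sample for illustration only, not real GX Core output.
sample_result = {
    "element_count": 100,
    "missing_count": 5,
    "unexpected_count": 2,
    "partial_unexpected_list": [-1, -7],
}
print(summarize_result(sample_result))

# A result dictionary with no fields is handled without raising an error.
print(summarize_result({}))
```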