Version: 1.9.3

Format results

You can control the level of detail GX Cloud returns in your Validation Results to improve the clarity and efficiency of your data quality workflows. By configuring the result_format setting with the GX Cloud API, you can receive only the information you need, whether that’s a high-level pass/fail indicator for exploration, specific failing values for troubleshooting, or full failed rows for data cleansing.

This setting controls the results you receive in both the GX Cloud UI and the GX Cloud API, as detailed below. However, result_format must be configured through the GX Cloud API.

Prerequisites

Configure and apply a Result Format

Follow the steps below to select a base level of verbosity, optionally configure additional settings available to your selection, and then apply the Result Format configuration to a Checkpoint or Validation Definition.

  1. Create a dictionary and set the verbosity of your Validation Results as the value of the "result_format" key. In order from least to greatest detail, the valid values for the "result_format" key are:

    • "BOOLEAN_ONLY"
    • "BASIC"
    • "SUMMARY"
    • "COMPLETE"

    The default for Validation Results generated by GX-managed Checkpoints is "COMPLETE". The default for Validation Results generated by Validation Definitions and API-managed Checkpoints is "SUMMARY".

    The following example shows the "BASIC" Result Format and the information returned at that level of verbosity; configurations for the other levels are shown after it.

    When result_format is set to "BASIC", the Validation Result of each Expectation includes a result dictionary with information that provides a basic explanation of why the Expectation failed or succeeded. This format is intended for quick feedback and works well in Jupyter Notebooks.

    You can check the result field reference table to see what information is provided in the result dictionary.

    To create a "BASIC" Result Format configuration, use the following code:

    Python
    basic_result_format_dict = {"result_format": "BASIC"}
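
    The remaining verbosity levels follow the same pattern; only the value of the "result_format" key changes. For example (the variable names are illustrative):

    Python
    # Verbosity-only configurations for the other Result Format levels
    boolean_only_result_format_dict = {"result_format": "BOOLEAN_ONLY"}
    summary_result_format_dict = {"result_format": "SUMMARY"}
    complete_result_format_dict = {"result_format": "COMPLETE"}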
  2. Optional. Specify configurations for additional settings available to the base result_format.

    Once you have defined the base configuration in your result_format key, you can further tailor the format of your Validation Results by defining additional key/value pairs in your Result Format dictionary.

    Reference the table below for valid keys and how they influence the format of generated Validation Results:

    Dictionary key             | Purpose
    "partial_unexpected_count" | Sets the number of results to include in "partial_unexpected_list" (default is 20). Set the value to zero to suppress the unexpected counts.
    "include_unexpected_rows"  | When True, the GX Cloud API returns up to 200 entire rows that violate the Expectation (default is False). Applies to Column Map Expectations only, such as ExpectColumnValuesToBeInSet. Note that ExpectColumnValuesToBeOfType and ExpectColumnValuesToBeInTypeList return unexpected rows only for Pandas Data Sources.
  3. Apply the Result Format to a Checkpoint or Validation Definition.

    You can define a persistent Result Format configuration on a Checkpoint. The Result Format will be applied every time the Checkpoint is run. For more information on retrieving or creating a Checkpoint, see Run a Validation.

    Saved Result Format
    import great_expectations as gx

    context = gx.get_context(mode="cloud")

    # Define the Result Format
    result_format_dict = {
        "result_format": "COMPLETE",
        "unexpected_index_column_names": ["my_identifying_column"],
        "partial_unexpected_count": 25,
        "include_unexpected_rows": True,
    }

    # Retrieve the Checkpoint
    checkpoint = context.checkpoints.get("my_checkpoint")

    # Update the Checkpoint's configuration
    checkpoint.result_format = result_format_dict
    checkpoint.save()

    # Run the Checkpoint
    # If you are working with a SQL or filesystem Data Asset, omit the batch_parameters.
    # Here, test_df is the DataFrame (for example, a pandas DataFrame) you want to validate.
    batch_parameters = {"dataframe": test_df}
    checkpoint.run(batch_parameters=batch_parameters)

    Alternatively, you can pass a result_format configuration at runtime to the .run(...) method of a Validation Definition. This result_format configuration does not persist with the Validation Definition; it applies only to the current execution of the .run(...) method. For more information on creating a Validation Definition, see Run a Validation.

    Runtime Result Format
    import great_expectations as gx

    context = gx.get_context(mode="cloud")

    # Define the Result Format
    result_format_dict = {
        "result_format": "COMPLETE",
        "unexpected_index_column_names": ["my_identifying_column"],
        "partial_unexpected_count": 25,
        "include_unexpected_rows": True,
    }

    # Retrieve the Validation Definition
    validation_definition = context.validation_definitions.get("my_validation_definition")

    # Run the Validation Definition with a Result Format configuration
    # If you are working with a SQL or filesystem Data Asset, omit the batch_parameters.
    # Here, test_df is the DataFrame (for example, a pandas DataFrame) you want to validate.
    batch_parameters = {"dataframe": test_df}
    validation_results = validation_definition.run(
        result_format=result_format_dict, batch_parameters=batch_parameters
    )

    # Review the Validation Results
    print(validation_results)
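
    The fields available in each Expectation's result dictionary depend on the configured verbosity. As a minimal sketch, assuming the Validation Results object returned by .run(...) exposes a success flag and a results list (as in recent GX Core releases), you can inspect each Expectation's result dictionary programmatically:

    Python
    # Inspect each Expectation's result dictionary from the Validation Results above
    for expectation_result in validation_results.results:
        print(expectation_result.success)  # pass/fail for this Expectation
        print(expectation_result.result)   # fields described in the reference tables below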

Reference tables

The following table lists the fields that can be found in the result dictionary of a Validation Result, and the information each field provides.

Field within result           | Value
element_count                 | The total number of values in the column.
missing_count                 | The number of missing values in the column.
missing_percent               | The percent of rows that are missing a value for the column.
unexpected_count              | The total count of unexpected values in the column.
unexpected_percent            | The overall percent of unexpected values in the column.
unexpected_percent_nonmissing | The percent of unexpected values in the column, excluding rows that have no value for that column.
observed_value                | The aggregate statistic computed for the column. Applies only to Expectations that pertain to the aggregate value of a column, rather than the individual values in each row.
partial_unexpected_list       | A partial list of values that violate the Expectation (up to 20 values by default).
partial_unexpected_index_list | A partial list of the indices of the unexpected values in the column, as defined by the columns in unexpected_index_column_names (up to 20 indices by default).
partial_unexpected_counts     | A partial list of values and counts, showing the number of times each unexpected value occurs (up to 20 value/count pairs by default).
unexpected_index_list         | A list of the indices of the unexpected values in the column, as defined by the columns in unexpected_index_column_names. Applies only to Expectations that have a yes/no answer for each row.
unexpected_index_query        | A query that can be used to retrieve all unexpected values (SQL and Spark), or the full list of unexpected indices (Pandas). Applies only to Expectations that have a yes/no answer for each row.
unexpected_list               | A list of up to 200 values that violate the Expectation.
unexpected_rows               | Up to 200 complete rows that violate the Expectation. The format depends on the Data Source; for example, a SQL Data Source returns a list of tuples, while a Spark Data Source returns a DataFrame. Not available in the GX Cloud UI. Applies to Column Map Expectations only, such as ExpectColumnValuesToBeInSet. Note that ExpectColumnValuesToBeOfType and ExpectColumnValuesToBeInTypeList return unexpected_rows only for Pandas Data Sources.
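
As an illustration only, a result dictionary for a Column Map Expectation run at the "SUMMARY" level of verbosity might resemble the following. The field values are hypothetical, and the exact fields returned depend on the Expectation and the configured verbosity:

Python
# Hypothetical result dictionary for a Column Map Expectation at "SUMMARY" verbosity
example_result = {
    "element_count": 1000,
    "missing_count": 0,
    "missing_percent": 0.0,
    "unexpected_count": 3,
    "unexpected_percent": 0.3,
    "unexpected_percent_nonmissing": 0.3,
    "partial_unexpected_list": ["foo", "bar", "baz"],
}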