Format results
You can control the level of detail GX Cloud returns in your Validation Results to improve the clarity and efficiency of your data quality workflows. You can format your results to receive only the information you need, whether that’s a high-level pass/fail indicator for exploration, specific failing values for troubleshooting, or full failed rows for data cleansing.
Depending on your use case, you can format your Validation Results with either the GX Cloud UI or the GX Cloud API.
- To format results from GX-managed Expectations, you can use either the UI or the API.
- To format results from API-managed Expectations, you must use the API.
- The UI provides a limited set of options for common combinations of more granular settings available through the API.
- The API gives you full control to make custom combinations of settings.
No matter which interface you use to format your Validation Results, the configuration impacts the results you receive from both the GX Cloud UI and the GX Cloud API.
- UI
- API
Prerequisites
- A GX Cloud account with Workspace Editor permissions or greater.
- A Data Asset with GX-managed Expectations.
Configure Validation Results
- In the GX Cloud UI, select the relevant Workspace and then click Data Assets.
- In the Data Assets list, click the Data Asset name.
- Click Settings.
- Choose what to include in your Validation Results. Here’s what is provided by each option:

  | Returned detail | Status | Observed values (default) | Sample unexpected rows |
  |---|---|---|---|
  | Success or failure | ✅ | ✅ | ✅ |
  | Query to retrieve full unexpected results * | ✅ | ✅ | ✅ |
  | Success rate * | ❌ | ✅ | ✅ |
  | Observed computed values * | ❌ | ✅ | ✅ |
  | Number of missing or unexpected rows * | ❌ | ✅ | ✅ |
  | Up to 25 sample unexpected values * | ❌ | ✅ | ❌ |
  | Up to 25 sample unexpected rows * | ❌ | ❌ | ✅ |

  Details marked with * are not returned by some types of Expectations, even if the detail is generally supported in your selected configuration. For example, a Column Aggregate Expectation like ExpectColumnMeanToBeBetween will never return a sample of failed rows because it assesses an aggregate of values across rows.
- Click OK to save your selection. The new selection applies going forward. Historical Validation Results retain their original contents.
For more information about how the opinionated options in the UI map to the more granular options in the API, see the UI options reference table.
Prerequisites
- A GX Cloud account with Workspace Editor permissions or greater.
- Your Cloud credentials saved in your environment variables.
- A Data Asset with a Checkpoint or Validation Definition. You can use an automatically created GX-managed resource or a manually created resource.
- Python version 3.10 to 3.13.
- An installation of the Great Expectations Python library.
Configure and apply a Result Format
Follow the steps below to select a base format, optionally configure additional settings available to your selection, and then apply the Result Format configuration to a Checkpoint or Validation Definition.
- Create a dictionary and set the base format of your Validation Results as the value of the key "result_format". In order from least to most detail, the valid values for the "result_format" key are:
  - "BOOLEAN_ONLY"
  - "BASIC"
  - "SUMMARY"
  - "COMPLETE"

  The default for Validation Results generated by GX-managed Checkpoints is "BASIC" with some non-default additional settings. The default for Validation Results generated by Validation Definitions and API-managed Checkpoints is "SUMMARY".

  The example code for each Result Format, and the information returned at that level, is as follows:

  "BOOLEAN_ONLY"

  When the result_format is "BOOLEAN_ONLY", Validation Results by default do not include additional information in a result dictionary. The successful evaluation of the Expectation is exclusively returned via the True or False value of the success key in the returned Validation Result.

  To create a "BOOLEAN_ONLY" Result Format configuration, use the following code:

  Python
  boolean_result_format_dict = {"result_format": "BOOLEAN_ONLY"}

  "BASIC"

  When the result_format is set to "BASIC", the Validation Results of each Expectation include a result dictionary with information providing a basic explanation for why it failed or succeeded. The format is intended for quick feedback and it works well in Jupyter Notebooks.

  You can check the result field reference table to see what information is provided in the result dictionary.

  To create a "BASIC" Result Format configuration, use the following code:

  Python
  basic_result_format_dict = {"result_format": "BASIC"}

  "SUMMARY"

  When the result_format key is set to "SUMMARY", the Validation Results of each Expectation include a result dictionary with information that summarizes values to show why it failed or succeeded. This format is intended for more detailed exploratory work and includes additional information beyond what is included by "BASIC".

  You can check the result field reference table to see what information is provided in the result dictionary.

  To create a "SUMMARY" Result Format configuration, use the following code:

  Python
  summary_result_format_dict = {"result_format": "SUMMARY"}

  "COMPLETE"

  When the result_format key is set to "COMPLETE", the Validation Results of each Expectation include a result dictionary with all available information to explain why it failed or succeeded. This format is intended for debugging pipelines or developing detailed regression tests and includes additional information beyond what is provided by "SUMMARY".

  You can check the result field reference table to see what information is provided in the result dictionary.

  To create a "COMPLETE" Result Format configuration, use the following code:

  Python
  complete_result_format_dict = {"result_format": "COMPLETE"}

- Optional. Specify configurations for additional settings available to the base result_format.

  Once you have defined the base configuration in your result_format key, you can further tailor the format of your Validation Results by defining additional key/value pairs in your Result Format dictionary.

  Reference the tables below for valid keys and how they influence the format of generated Validation Results:

  "BOOLEAN_ONLY"

  | Dictionary key | Purpose |
  |---|---|
  | "return_unexpected_index_query" | Return a query (or a set of indices) that allows you to retrieve the full set of unexpected results as well as the values of any identifying columns specified in "unexpected_index_column_names" (default is False). |
  | "unexpected_index_column_names" | Takes a list to define the column(s) that will be used to identify unexpected results returned. For example, primary key (PK) column(s) or other columns with unique identifiers. |

  "BASIC"

  | Dictionary key | Purpose |
  |---|---|
  | "include_unexpected_rows" | When True, GX Cloud returns up to 200 entire rows that violate the Expectation (default is False). Applies to Column Map, Column Pair Map, Multicolumn Map, and Unexpected Rows Expectations only. Note that ExpectColumnValuesToBeOfType and ExpectColumnValuesToBeInTypeList will return unexpected rows for only Pandas Data Sources. |
  | "partial_unexpected_count" | Sets the number of results to include in "partial_unexpected_list" and "unexpected_rows" (default is 20). Set the value to zero to suppress the "partial_unexpected_list" output. |
  | "return_unexpected_index_query" | Return a query (or a set of indices) that allows you to retrieve the full set of unexpected results as well as the values of any identifying columns specified in "unexpected_index_column_names" (default is False). |
  | "unexpected_index_column_names" | Takes a list to define the column(s) that will be used to identify unexpected results returned. For example, primary key (PK) column(s) or other columns with unique identifiers. |

  "SUMMARY"

  | Dictionary key | Purpose |
  |---|---|
  | "include_unexpected_rows" | When True, GX Cloud returns up to 200 entire rows that violate the Expectation (default is False). Applies to Column Map, Column Pair Map, Multicolumn Map, and Unexpected Rows Expectations only. Note that ExpectColumnValuesToBeOfType and ExpectColumnValuesToBeInTypeList will return unexpected rows for only Pandas Data Sources. |
  | "partial_unexpected_count" | Sets the number of results to include in "partial_unexpected_counts", "partial_unexpected_list", "partial_unexpected_index_list", and "unexpected_rows" (default is 20). Set the value to zero to suppress the partial_unexpected_* output. |
  | "return_unexpected_index_query" | Return a query (or a set of indices) that allows you to retrieve the full set of unexpected results as well as the values of any identifying columns specified in "unexpected_index_column_names" (default is False). |
  | "unexpected_index_column_names" | Takes a list to define the column(s) that will be used to identify unexpected results returned. For example, primary key (PK) column(s) or other columns with unique identifiers. |

  "COMPLETE"

  | Dictionary key | Purpose |
  |---|---|
  | "exclude_unexpected_values" | When running validations, a set of unexpected results' indices and values is returned. Setting this value to True suppresses values from the output to only have indices (default is False). |
  | "include_unexpected_rows" | When True, GX Cloud returns up to 200 entire rows that violate the Expectation (default is False). Applies to Column Map, Column Pair Map, Multicolumn Map, and Unexpected Rows Expectations only. Note that ExpectColumnValuesToBeOfType and ExpectColumnValuesToBeInTypeList will return unexpected rows for only Pandas Data Sources. |
  | "partial_unexpected_count" | Sets the number of results to include in "partial_unexpected_counts", "partial_unexpected_list", "partial_unexpected_index_list", and "unexpected_rows" (default is 20). Set the value to zero to suppress the partial_unexpected_* output. |
  | "return_unexpected_index_query" | Return a query (or a set of indices) that allows you to retrieve the full set of unexpected results as well as the values of any identifying columns specified in "unexpected_index_column_names". Setting "return_unexpected_index_query" to False suppresses the output (default is True). |
  | "unexpected_index_column_names" | Takes a list to define the column(s) that will be used to identify unexpected results returned. For example, primary key (PK) column(s) or other columns with unique identifiers. |

- Apply the Result Format to a Checkpoint or Validation Definition.

  You can define a persistent Result Format configuration on a Checkpoint. The Result Format will be applied every time the Checkpoint is run. For more information on retrieving or creating a Checkpoint, see Run a Validation.
Saved Result Format
import great_expectations as gx

context = gx.get_context(mode="cloud")

# Define the Result Format
result_format_dict = {
    "result_format": "COMPLETE",
    "unexpected_index_column_names": ["my_identifying_column"],
    "partial_unexpected_count": 25,
    "include_unexpected_rows": True,
}

# Retrieve the Checkpoint
checkpoint = context.checkpoints.get("my_checkpoint")

# Update the Checkpoint's configuration
checkpoint.result_format = result_format_dict
checkpoint.save()

# Run the Checkpoint
# test_df is assumed to be a pandas DataFrame that you have already loaded.
# If you are working with a SQL or filesystem Data Asset, omit the batch_parameters.
batch_parameters = {"dataframe": test_df}
checkpoint.run(batch_parameters=batch_parameters)

Alternatively, you can pass a result_format configuration at runtime to the .run(...) method of a Validation Definition. This result_format configuration does not persist with the Validation Definition; it applies only to the current execution of the .run(...) method. For more information on creating a Validation Definition, see Run a Validation.

Runtime Result Format
import great_expectations as gx
context = gx.get_context(mode="cloud")
# Define the Result Format
result_format_dict = {
    "result_format": "COMPLETE",
    "unexpected_index_column_names": ["my_identifying_column"],
    "partial_unexpected_count": 25,
    "include_unexpected_rows": True,
}

# Retrieve the Validation Definition
validation_definition = context.validation_definitions.get("my_validation_definition")

# Run the Validation Definition with a Result Format configuration
# test_df is assumed to be a pandas DataFrame that you have already loaded.
# If you are working with a SQL or filesystem Data Asset, omit the batch_parameters.
batch_parameters = {"dataframe": test_df}
validation_results = validation_definition.run(
    result_format=result_format_dict, batch_parameters=batch_parameters
)

# Review the Validation Results
print(validation_results)
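
After a run, you often care only about the Expectations that failed. The following is a minimal sketch of that filtering step, not part of the GX API: `failed_results` is a hypothetical helper, and it assumes each Validation Result has already been converted to a plain dictionary with the `success` and `result` keys described above. The sample data is hand-written for illustration and is not real GX output.

```python
from typing import Any


def failed_results(results: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Return the `result` dictionaries of entries whose `success` is not True."""
    return [r.get("result", {}) for r in results if not r.get("success", False)]


# Illustrative, hand-written data (assumption: one dictionary per Expectation).
sample = [
    {"success": True, "result": {"element_count": 100, "unexpected_count": 0}},
    {
        "success": False,
        "result": {
            "element_count": 100,
            "unexpected_count": 3,
            "partial_unexpected_list": [-1, -5, -2],
        },
    },
]

for result in failed_results(sample):
    # Fields such as partial_unexpected_list depend on the Result Format in use.
    print(result["unexpected_count"], result.get("partial_unexpected_list", []))
```

Using `.get(...)` with defaults keeps the helper usable with "BOOLEAN_ONLY" results, where no `result` details are returned by default.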
Reference tables
- Information in result fields
- Result fields by base format
- Result Format keys
- UI options
The following table lists the fields that can be found in the result dictionary of a Validation Result, and what information is provided by that field.
| Field within result | Value |
|---|---|
| element_count | The total number of values in the column. |
| missing_count | The number of missing values in the column. |
| missing_percent | The total percent of rows missing values for the column. |
| unexpected_count | The total count of unexpected values in a column. |
| unexpected_percent | The overall percent of unexpected values in a column. |
| unexpected_percent_nonmissing | The percent of unexpected values in a column, excluding rows that have no value for that column. |
| observed_value | The aggregate statistic computed for the column. This only applies to Expectations that pertain to the aggregate value of a column, rather than the individual values in each row for the column. |
| partial_unexpected_list | A partial list of values that violate the Expectation. (Up to 20 values by default.) |
| partial_unexpected_index_list | A partial list of the indices of the unexpected values in the column, as defined by the columns in unexpected_index_column_names. (Up to 20 indices by default.) |
| partial_unexpected_counts | A partial list of values and counts, showing the number of times each of the unexpected values occurs. (Up to 20 unexpected value/count pairs by default.) |
| unexpected_index_list | A list of the indices of the unexpected values in the column, as defined by the columns in unexpected_index_column_names. This only applies to Expectations that have a yes/no answer for each row. |
| unexpected_index_query | A query that can be used to retrieve all unexpected values (SQL and Spark), or the full list of unexpected indices (Pandas). This only applies to Expectations that have a yes/no answer for each row. |
| unexpected_list | A list of up to 200 values that violate the Expectation. |
| unexpected_rows | Up to 200 complete rows that violate the Expectation. The format depends on the Data Source. For example, a SQL Data Source will return a list of dictionaries while a Spark Data Source will return a DataFrame. Applies to Column Map, Column Pair Map, Multicolumn Map, and Unexpected Rows Expectations only. Note that ExpectColumnValuesToBeOfType and ExpectColumnValuesToBeInTypeList will return unexpected rows for only Pandas Data Sources. |
The following table lists the fields that can be found in the result dictionary of a Validation Result and the result_format levels that return that field. An * indicates the field is not returned by default but can be enabled through an additional setting. Meanwhile, ** indicates that the field is returned by default but can be disabled.
| Fields within result | BOOLEAN_ONLY | BASIC | SUMMARY | COMPLETE |
|---|---|---|---|---|
| element_count | no | yes | yes | yes |
| missing_count | no | yes | yes | yes |
| missing_percent | no | yes | yes | yes |
| unexpected_count | no | yes | yes | yes |
| unexpected_percent | no | yes | yes | yes |
| unexpected_percent_nonmissing | no | yes | yes | yes |
| observed_value | no | yes | yes | yes |
| partial_unexpected_list | no | yes ** | yes ** | yes ** |
| partial_unexpected_index_list | no | no | yes ** | yes ** |
| partial_unexpected_counts | no | no | yes ** | yes ** |
| unexpected_index_list | no | no | no | yes |
| unexpected_index_query | yes * | yes * | yes * | yes |
| unexpected_list | no | no | no | yes |
| unexpected_rows | no | yes * | yes * | yes * |
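
If you want to check in code which fields to expect at a given level, the table above can be transcribed into a lookup. This is only a sketch: `FIELD_AVAILABILITY` and `fields_for` are hypothetical names, not GX objects, and the values simply mirror the table ("opt-in" stands for `*`, "opt-out" for `**`). Whether a field actually appears also depends on the type of Expectation.

```python
LEVELS = ("BOOLEAN_ONLY", "BASIC", "SUMMARY", "COMPLETE")

# Transcribed from the table above; one tuple entry per level in LEVELS order.
FIELD_AVAILABILITY = {
    "element_count": ("no", "yes", "yes", "yes"),
    "missing_count": ("no", "yes", "yes", "yes"),
    "missing_percent": ("no", "yes", "yes", "yes"),
    "unexpected_count": ("no", "yes", "yes", "yes"),
    "unexpected_percent": ("no", "yes", "yes", "yes"),
    "unexpected_percent_nonmissing": ("no", "yes", "yes", "yes"),
    "observed_value": ("no", "yes", "yes", "yes"),
    "partial_unexpected_list": ("no", "opt-out", "opt-out", "opt-out"),
    "partial_unexpected_index_list": ("no", "no", "opt-out", "opt-out"),
    "partial_unexpected_counts": ("no", "no", "opt-out", "opt-out"),
    "unexpected_index_list": ("no", "no", "no", "yes"),
    "unexpected_index_query": ("opt-in", "opt-in", "opt-in", "yes"),
    "unexpected_list": ("no", "no", "no", "yes"),
    "unexpected_rows": ("no", "opt-in", "opt-in", "opt-in"),
}


def fields_for(level: str, include_opt_in: bool = False) -> list[str]:
    """List fields returned by default at `level`, optionally adding opt-in fields."""
    column = LEVELS.index(level)
    wanted = {"yes", "opt-out"}
    if include_opt_in:
        wanted.add("opt-in")
    return [name for name, cells in FIELD_AVAILABILITY.items() if cells[column] in wanted]


print(fields_for("BOOLEAN_ONLY"))                       # → []
print(fields_for("BOOLEAN_ONLY", include_opt_in=True))  # → ['unexpected_index_query']
```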
The following table lists the valid keys for a Result Format dictionary and what their purpose is. Not all keys are used by every result_format level.
| Dictionary key | Purpose |
|---|---|
| "result_format" | Sets the fields to return in Validation Results. Valid values are "BASIC", "BOOLEAN_ONLY", "COMPLETE", and "SUMMARY" (default for GX-managed Checkpoints is "BASIC" with some non-default additional settings; default for Validation Definitions and API-managed Checkpoints is "SUMMARY"). |
| "unexpected_index_column_names" | Takes a list to define the column(s) that will be used to identify unexpected results returned. For example, primary key (PK) column(s) or other columns with unique identifiers. |
| "return_unexpected_index_query" | Return a query (or a set of indices) that allows you to retrieve the full set of unexpected results as well as the values of any identifying columns specified in "unexpected_index_column_names". Setting "return_unexpected_index_query" to False suppresses the output (default is True for "COMPLETE" and False for "BASIC", "BOOLEAN_ONLY", and "SUMMARY"). |
| "partial_unexpected_count" | Sets the number of results to include in "partial_unexpected_counts", "partial_unexpected_list", "partial_unexpected_index_list", and "unexpected_rows" if applicable (default is 20). Set the value to zero to suppress the partial_unexpected_* output. |
| "exclude_unexpected_values" | When running validations, a set of unexpected results' indices and values is returned. Setting this value to True suppresses values from the output to only have indices (default is False). |
| "include_unexpected_rows" | When True, GX Cloud returns up to 200 entire rows that violate the Expectation (default is False). Applies to Column Map, Column Pair Map, Multicolumn Map, and Unexpected Rows Expectations only. Note that ExpectColumnValuesToBeOfType and ExpectColumnValuesToBeInTypeList will return unexpected rows for only Pandas Data Sources. |
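
A misspelled key in a Result Format dictionary is easy to miss, so it can help to check the dictionary before applying it. The sketch below is an assumption-laden convenience, not GX behavior: `check_result_format` is a hypothetical helper, and the allowed keys and levels are copied from the table above.

```python
# Keys and levels copied from the reference table; not imported from GX.
VALID_LEVELS = {"BOOLEAN_ONLY", "BASIC", "SUMMARY", "COMPLETE"}
VALID_KEYS = {
    "result_format",
    "unexpected_index_column_names",
    "return_unexpected_index_query",
    "partial_unexpected_count",
    "exclude_unexpected_values",
    "include_unexpected_rows",
}


def check_result_format(config: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    unknown = sorted(set(config) - VALID_KEYS)
    if unknown:
        problems.append(f"unknown keys: {unknown}")
    level = config.get("result_format")
    if level not in VALID_LEVELS:
        problems.append(f"invalid result_format: {level!r}")
    return problems


print(check_result_format({"result_format": "COMPLETE", "partial_unexpected_count": 25}))
print(check_result_format({"result_format": "FULL", "include_unexpected_row": True}))
```

The check only covers key names and the base level. It does not verify that a key is used by the chosen level or that values have the right type.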
To replicate one of the opinionated UI options for configuring Validation Results through the API, use the equivalent API configuration listed for that UI option in the following table.
| UI option | API configuration |
|---|---|
| Status | "result_format": "BOOLEAN_ONLY", "return_unexpected_index_query": True, |
| Observed values | "result_format": "BASIC", "return_unexpected_index_query": True, "partial_unexpected_count": 25, |
| Sample unexpected rows | "result_format": "COMPLETE", "partial_unexpected_count": 25, "include_unexpected_rows": True, |
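
The mappings above can be kept as presets so that an API-managed configuration starts from the same settings as a UI option. This is a minimal sketch: `UI_OPTION_PRESETS` and `result_format_for` are hypothetical names introduced here, and the values are copied from the table.

```python
# Presets copied from the UI options table; the names are illustrative only.
UI_OPTION_PRESETS = {
    "Status": {
        "result_format": "BOOLEAN_ONLY",
        "return_unexpected_index_query": True,
    },
    "Observed values": {
        "result_format": "BASIC",
        "return_unexpected_index_query": True,
        "partial_unexpected_count": 25,
    },
    "Sample unexpected rows": {
        "result_format": "COMPLETE",
        "partial_unexpected_count": 25,
        "include_unexpected_rows": True,
    },
}


def result_format_for(ui_option: str, **overrides) -> dict:
    """Return a new Result Format dictionary for a UI option, with optional overrides."""
    return {**UI_OPTION_PRESETS[ui_option], **overrides}


# Start from the "Observed values" preset and add an identifying column.
config = result_format_for(
    "Observed values", unexpected_index_column_names=["my_identifying_column"]
)
print(config)
```

The returned dictionary is a copy, so the presets themselves are not modified. It can then be applied as shown earlier, for example by assigning it to a Checkpoint's result_format or passing it to the .run(...) method of a Validation Definition.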