Version: 1.15.1

Retrieve all unexpected rows

By default, Validation Results summarize why Expectations failed or succeeded. The default verbosity is designed for exploratory work. If you need more details, you can increase the verbosity to return up to 200 failing rows for certain types of Expectations. While this is sufficient for many troubleshooting workflows, there may be times when you want to retrieve all the failing rows with no limit. For example, if you want to quarantine bad records, you'll need all the failing rows. To support these use cases, GX Core provides the ValidationDefinition.get_unexpected_rows() method for use with the UnexpectedRowsExpectation class. You can use this method to return all rows that failed your custom SQL Expectations in the batch of data you validated.

Prerequisites

Procedure

  1. Retrieve your Validation Definition:

    You can use a Checkpoint instead.

    While this example shows how to retrieve all unexpected rows after running a Validation Definition, you can use the ValidationDefinition.get_unexpected_rows() method after running a Checkpoint.

    Python
    validation_definition_name = "my_validation_definition"
    validation_definition = context.validation_definitions.get(validation_definition_name)
  2. Run the Validation Definition to get a result.

    If your Batch Definition is partitioned, pass the appropriate batch_parameters:

    Python
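    # For a partitioned Batch Definition, pass the parameters to run(), for example:
    # result = validation_definition.run(batch_parameters=batch_parameters)
    result = validation_definition.run()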
  3. Iterate over the results and call get_unexpected_rows() for each failing custom SQL Expectation.

    The get_unexpected_rows() method only supports UnexpectedRowsExpectation. If your Expectation Suite contains other Expectation types, check isinstance(evr.expectation, UnexpectedRowsExpectation) before calling the method. If your Batch Definition uses partitioning, pass result.batch_parameters:

    Python
    from great_expectations.expectations import UnexpectedRowsExpectation

    for evr in result.results:
        # Filter by status and type because get_unexpected_rows() supports only UnexpectedRowsExpectation
        if not evr.success and isinstance(evr.expectation, UnexpectedRowsExpectation):
            unexpected_rows = validation_definition.get_unexpected_rows(
                evr.expectation,
                batch_parameters=result.batch_parameters,
            )
            print(f"{len(unexpected_rows)} unexpected rows found")

    The get_unexpected_rows() method returns a list[dict] with one dictionary per failing row. You can convert this to a DataFrame, write it to a quarantine table, or process it however you need.
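Because the return value is a plain list[dict], it can be persisted with the standard library alone. The following is a minimal sketch of one way to quarantine the rows in a CSV file. The helper name write_quarantine_csv, the output path, and the sample rows are hypothetical and stand in for whatever get_unexpected_rows() returns in your environment.

```python
import csv
import tempfile
from pathlib import Path


def write_quarantine_csv(unexpected_rows, path):
    """Write a list of row dictionaries to a CSV file and return the row count."""
    if not unexpected_rows:
        # Nothing failed, so no file is written.
        return 0
    # Collect column names in first-seen order so rows with differing keys still fit.
    fieldnames = list(dict.fromkeys(key for row in unexpected_rows for key in row))
    with Path(path).open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(unexpected_rows)
    return len(unexpected_rows)


# Hypothetical sample data standing in for the return value of get_unexpected_rows().
sample_rows = [
    {"order_id": 17, "amount": -5.0},
    {"order_id": 42, "amount": None},
]
output_path = Path(tempfile.gettempdir()) / "quarantine_rows.csv"
written = write_quarantine_csv(sample_rows, output_path)
print(f"Wrote {written} rows to {output_path}")
```

A CSV file is only one possible destination. The same list could instead be passed to a DataFrame constructor or a database insert, depending on where quarantined records live in your pipeline.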

    Runtime parameters

    If your Expectation uses the $PARAMETER syntax for runtime parameters, pass a dictionary of parameter values with the expectation_parameters argument when you call get_unexpected_rows(). If you omit it, the method raises a ValueError.
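As a rough sketch of the call shape, the wrapper below forwards a parameter dictionary through the expectation_parameters argument. The function name get_rows_with_runtime_parameters is hypothetical, and the parameter name max_amount in the usage comment is an assumption for illustration. Use the names your own Expectation declares.

```python
def get_rows_with_runtime_parameters(
    validation_definition,
    expectation,
    batch_parameters,
    expectation_parameters,
):
    """Return all unexpected rows for an Expectation that uses runtime parameters."""
    # expectation_parameters maps each runtime parameter name to its value.
    # Per the text above, omitting it for a parameterized Expectation
    # results in a ValueError.
    return validation_definition.get_unexpected_rows(
        expectation,
        batch_parameters=batch_parameters,
        expectation_parameters=expectation_parameters,
    )


# Hypothetical usage inside the loop over result.results:
# rows = get_rows_with_runtime_parameters(
#     validation_definition,
#     evr.expectation,
#     result.batch_parameters,
#     {"max_amount": 1000},  # assumed parameter name, for illustration only
# )
```

Keeping the parameter dictionary next to the one you passed when running the validation helps ensure that the rows returned match the rows that actually failed.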