Skip to main content
Version: 1.3.2

Use SQL to define a custom Expectation

Among the available Expectations, the UnexpectedRowsExpectation is designed to facilitate the execution of SQL queries as the core logic for an Expectation. By default, UnexpectedRowsExpectation considers validation successful when no rows are returned by the provided SQL query.

Prerequisites

Procedure

  1. Determine your custom SQL query.

    The UnexpectedRowsExpectation class takes an unexpected_rows_query attribute, which is a SQL or Spark-SQL query that returns a selection of rows from the Batch of data being validated. By default, rows that are returned have failed the validation check.

    The custom SQL query should be written in the SQL dialect your database uses, except that it can also contain the special {batch} named query. When the Expectation is evaluated, the {batch} keyword will be replaced with the Batch of data that is configured for your Data Asset.

    In this example, the custom query will select any rows where the passenger count is greater than 6 or less than 0:

    Python
    my_query = """
    SELECT
    *
    FROM
    {batch}
    WHERE
    passenger_count > 6 or passenger_count < 0
    """
  2. Customize how the Expectation renders in Data Docs.

    As with other Expectations, the description attribute contains the text describing the Expectation when your results are rendered into Data Docs. You can format the description string with Markdown syntax:

    Python
    my_description = "There should be no more than **6** passengers."
  3. Create a new Expectation using the UnexpectedRowsExpectation class and your parameters.

    The class name UnexpectedRowsExpectation describes the functionality of the Expectation: it finds rows with unexpected values. When you create your Expectation, you can use a name that is more indicative of your specific use case. In this example, the customized Expectation will be used to find invalid passenger counts in taxi trip data:

    Python
    expect_passenger_count_to_be_legal = gx.expectations.UnexpectedRowsExpectation(
    unexpected_rows_query=my_query,
    description=my_description,
    )
  4. Use your custom SQL Expectation.

    Now that you've created a custom SQL Expectation, you can add it to an Expectation Suite and validate it like any other Expectation.