great_expectations.profile.basic_suite_builder_profiler

Module Contents

Classes

BasicSuiteBuilderProfiler()

This profiler helps build coarse expectations for columns you care about.

Functions

_check_that_expectations_are_available(dataset, expectations)

_check_that_columns_exist(dataset, columns)

_is_nan(value)

class great_expectations.profile.basic_suite_builder_profiler.BasicSuiteBuilderProfiler

Bases: great_expectations.profile.basic_dataset_profiler.BasicDatasetProfilerBase

This profiler helps build coarse expectations for columns you care about.

The goal of this profiler is to expedite the process of authoring an expectation suite by building possibly relevant expectations for columns that you care about. You can then easily edit the suite and adjust or delete these expectations to hone your new suite.

Ranges of acceptable values in the expectations created by this profiler (for example, the min/max of the value in expect_column_values_to_be_between) are created only to demonstrate the functionality and should not be taken as the actual ranges. You should definitely edit this coarse suite.
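One way to follow the advice above is to adjust the generated ranges programmatically before reviewing them by hand. The following is a minimal sketch, not part of this module: it assumes the suite has been converted to a plain dictionary with an "expectations" list whose entries carry "expectation_type" and "kwargs", and the helper name widen_between_ranges, the sample suite, and the 10% margin are all hypothetical choices made for illustration.

```python
import copy


def widen_between_ranges(suite_dict, margin=0.1):
    """Return a copy of a suite dictionary with 'between' ranges widened.

    Hypothetical helper: it only touches expect_column_values_to_be_between
    entries whose min_value and max_value are both numeric.
    """
    widened = copy.deepcopy(suite_dict)  # leave the caller's suite untouched
    for expectation in widened.get("expectations", []):
        if expectation.get("expectation_type") != "expect_column_values_to_be_between":
            continue
        kwargs = expectation.get("kwargs", {})
        low, high = kwargs.get("min_value"), kwargs.get("max_value")
        if isinstance(low, (int, float)) and isinstance(high, (int, float)):
            span = high - low
            kwargs["min_value"] = low - margin * span
            kwargs["max_value"] = high + margin * span
    return widened


# Illustrative data only; these values do not come from a real profiling run.
example_suite = {
    "expectation_suite_name": "patients.coarse",
    "expectations": [
        {
            "expectation_type": "expect_column_values_to_be_between",
            "kwargs": {"column": "age", "min_value": 0, "max_value": 100},
        },
        {
            "expectation_type": "expect_column_values_to_not_be_null",
            "kwargs": {"column": "id"},
        },
    ],
}

widened_suite = widen_between_ranges(example_suite, margin=0.1)
```

Widening is only a starting point: the right bounds depend on what you know about the data, so the adjusted values should still be reviewed column by column.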

Configuration is optional, and if not provided, this profiler will create expectations for all columns.

Configuration is a dictionary with a columns key containing a list of the column names you want coarse expectations created for. This dictionary can also contain an excluded_expectations key with a list of expectation names you do not want created, or an included_expectations key with a list of expectation names you want created (if applicable).

For example, if you had a wide patients table and wanted expectations on three columns, you’d do this:

    suite, validation_result = BasicSuiteBuilderProfiler().profile(
        dataset, {"columns": ["id", "username", "address"]}
    )

If you had the same wide patients table and wanted expectations on all columns, excluding three statistical expectations, you’d do this:

    suite, validation_result = BasicSuiteBuilderProfiler().profile(
        dataset,
        {
            "excluded_expectations": [
                "expect_column_mean_to_be_between",
                "expect_column_median_to_be_between",
                "expect_column_quantile_values_to_be_between",
            ],
        },
    )

If you wanted only two types of expectations on all applicable columns of that table, you’d do this:

    suite, validation_result = BasicSuiteBuilderProfiler().profile(
        dataset,
        {
            "included_expectations": [
                "expect_column_values_to_not_be_null",
                "expect_column_values_to_be_in_set",
            ],
        },
    )
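The configuration dictionaries used in the examples above can be assembled with a small helper. This is a minimal sketch under stated assumptions: build_profiler_config is a hypothetical function, not part of Great Expectations, and it only relies on the three keys described above (columns, excluded_expectations, included_expectations). It returns None when nothing is requested, which corresponds to the "configuration is optional" case.

```python
def build_profiler_config(columns=None, excluded_expectations=None,
                          included_expectations=None):
    """Assemble a configuration dictionary for the profiler.

    Hypothetical helper for illustration: each argument, when given, must be
    an iterable of strings. Keys that are not supplied are left out.
    """
    config = {}
    supplied = {
        "columns": columns,
        "excluded_expectations": excluded_expectations,
        "included_expectations": included_expectations,
    }
    for key, value in supplied.items():
        if value is None:
            continue
        names = list(value)
        if not all(isinstance(name, str) for name in names):
            raise TypeError(f"{key} must contain only strings")
        config[key] = names
    # No keys means no configuration: the profiler then covers all columns.
    return config or None


three_columns = build_profiler_config(columns=["id", "username", "address"])
no_statistics = build_profiler_config(
    excluded_expectations=[
        "expect_column_mean_to_be_between",
        "expect_column_median_to_be_between",
        "expect_column_quantile_values_to_be_between",
    ]
)
```

Either result could then be passed as the second argument to BasicSuiteBuilderProfiler().profile(dataset, ...), as in the examples above.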

The profiler can also be used to generate an expectation suite that contains one instance of every interesting expectation type.

When used in this “demo” mode, the suite is intended to demonstrate the expressive power of expectations and to provide a service similar to the expectations glossary documentation page, but on a user’s own data.

    suite, validation_result = BasicSuiteBuilderProfiler().profile(dataset, configuration="demo")

classmethod _get_column_type_with_caching(cls, dataset, column_name, cache)
classmethod _get_column_cardinality_with_caching(cls, dataset, column_name, cache)
classmethod _create_expectations_for_low_card_column(cls, dataset, column, column_cache, excluded_expectations=None, included_expectations=None)
classmethod _create_non_nullity_expectations(cls, dataset, column, excluded_expectations=None, included_expectations=None)
classmethod _create_expectations_for_numeric_column(cls, dataset, column, excluded_expectations=None, included_expectations=None)
classmethod _create_expectations_for_string_column(cls, dataset, column, excluded_expectations=None, included_expectations=None)
classmethod _find_next_low_card_column(cls, dataset, columns, profiled_columns, column_cache)
classmethod _find_next_numeric_column(cls, dataset, columns, profiled_columns, column_cache)
classmethod _find_next_string_column(cls, dataset, columns, profiled_columns, column_cache)
classmethod _find_next_datetime_column(cls, dataset, columns, profiled_columns, column_cache)
classmethod _create_expectations_for_datetime_column(cls, dataset, column, excluded_expectations=None, included_expectations=None)
classmethod _profile(cls, dataset, configuration=None)
classmethod _demo_profile(cls, dataset)
classmethod _build_table_row_count_expectation(cls, dataset, tolerance=0.1, excluded_expectations=None, included_expectations=None)
classmethod _build_table_column_expectations(cls, dataset, excluded_expectations=None, included_expectations=None)
classmethod _build_column_description_metadata(cls, dataset)
great_expectations.profile.basic_suite_builder_profiler._check_that_expectations_are_available(dataset, expectations)
great_expectations.profile.basic_suite_builder_profiler._check_that_columns_exist(dataset, columns)
great_expectations.profile.basic_suite_builder_profiler._is_nan(value)