great_expectations.profile.base

Module Contents

Classes

OrderedEnum()

Generic enumeration.

OrderedProfilerCardinality()

Generic enumeration.

ProfilerDataType()

Useful data types for building profilers.

ProfilerCardinality()

Useful cardinality categories for building profilers.

ProfilerTypeMapping()

Useful backend type mapping for building profilers.

Profiler(configuration: dict = None)

Profilers create suites from various sources of truth.

DataAssetProfiler()

DatasetProfiler()

great_expectations.profile.base.logger
class great_expectations.profile.base.OrderedEnum

Bases: enum.Enum

Generic enumeration.

Derive from this class to define new enumerations.

__ge__(self, other)

Return self>=value.

__gt__(self, other)

Return self>value.

__le__(self, other)

Return self<=value.

__lt__(self, other)

Return self<value.
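The four comparison methods above suggest that members are ordered by their values. The following is a minimal sketch of that pattern, not the library's source: it assumes comparison is by `.value` and only between members of the same class. `OrderedEnumSketch` and `Priority` are hypothetical names used for illustration.

```python
from enum import Enum


class OrderedEnumSketch(Enum):
    """Stand-in for an ordered enum (assumption: members compare by value)."""

    def __ge__(self, other):
        if self.__class__ is other.__class__:
            return self.value >= other.value
        return NotImplemented

    def __gt__(self, other):
        if self.__class__ is other.__class__:
            return self.value > other.value
        return NotImplemented

    def __le__(self, other):
        if self.__class__ is other.__class__:
            return self.value <= other.value
        return NotImplemented

    def __lt__(self, other):
        if self.__class__ is other.__class__:
            return self.value < other.value
        return NotImplemented


class Priority(OrderedEnumSketch):
    """Hypothetical subclass; any enum with comparable values would do."""

    LOW = 1
    MEDIUM = 2
    HIGH = 3


# Members of the same enum can now be compared and sorted.
is_higher = Priority.HIGH > Priority.LOW
ordered = sorted([Priority.HIGH, Priority.LOW, Priority.MEDIUM])
highest = max(Priority)
```

Returning `NotImplemented` for a member of a different class lets Python raise `TypeError` rather than silently comparing unrelated enumerations.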

class great_expectations.profile.base.OrderedProfilerCardinality

Bases: great_expectations.profile.base.OrderedEnum

Generic enumeration.

Derive from this class to define new enumerations.

NONE = 0
ONE = 1
TWO = 2
VERY_FEW = 3
FEW = 4
MANY = 5
VERY_MANY = 6
UNIQUE = 7
classmethod get_basic_column_cardinality(cls, num_unique=0, pct_unique=0)

Takes the number and percentage of unique values in a column and returns the column cardinality. If this method unexpectedly returns a cardinality of NONE, ensure that you are passing values for both num_unique and pct_unique.

Parameters

num_unique: The number of unique values in a column

pct_unique: The percentage of unique values in a column

Returns

The column cardinality
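To illustrate how the two inputs relate to the cardinality categories, here is a rough sketch. It is not the library's implementation. The cut-offs (`VERY_FEW_MAX`, `FEW_MAX`, `VERY_MANY_MIN_FRACTION`) are invented for this example. The helper names and the treatment of `pct_unique` as a fraction between 0 and 1 are assumptions that this reference does not confirm.

```python
from enum import Enum


class CardinalitySketch(Enum):
    """Stand-in mirroring the member names and values listed above."""

    NONE = 0
    ONE = 1
    TWO = 2
    VERY_FEW = 3
    FEW = 4
    MANY = 5
    VERY_MANY = 6
    UNIQUE = 7


# Hypothetical cut-offs, chosen only for this example.
VERY_FEW_MAX = 20
FEW_MAX = 60
VERY_MANY_MIN_FRACTION = 0.1


def unique_stats(values):
    """Return (num_unique, pct_unique) for a column, ignoring None entries."""
    non_null = [v for v in values if v is not None]
    if not non_null:
        return 0, 0.0
    num_unique = len(set(non_null))
    return num_unique, num_unique / len(non_null)


def basic_cardinality(num_unique=0, pct_unique=0):
    """Bucket a column by its unique-value statistics (illustrative rules)."""
    if not num_unique or pct_unique is None:
        # Missing or zero inputs fall through to NONE, which is why both
        # arguments need to be supplied.
        return CardinalitySketch.NONE
    if pct_unique == 1.0:
        return CardinalitySketch.UNIQUE
    if num_unique == 1:
        return CardinalitySketch.ONE
    if num_unique == 2:
        return CardinalitySketch.TWO
    if num_unique < VERY_FEW_MAX:
        return CardinalitySketch.VERY_FEW
    if num_unique < FEW_MAX:
        return CardinalitySketch.FEW
    if pct_unique > VERY_MANY_MIN_FRACTION:
        return CardinalitySketch.VERY_MANY
    return CardinalitySketch.MANY


num_unique, pct_unique = unique_stats(["a", "b", "a", None])
column_cardinality = basic_cardinality(num_unique, pct_unique)
```

Calling `basic_cardinality()` with no arguments yields `NONE` in this sketch, which matches the behaviour the description warns about when an input is omitted.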

class great_expectations.profile.base.ProfilerDataType

Bases: enum.Enum

Useful data types for building profilers.

INT = "int"
FLOAT = "float"
NUMERIC = "numeric"
STRING = "string"
BOOLEAN = "boolean"
DATETIME = "datetime"
UNKNOWN = "unknown"
class great_expectations.profile.base.ProfilerCardinality

Bases: enum.Enum

Useful cardinality categories for building profilers.

NONE = "none"
ONE = "one"
TWO = "two"
FEW = "few"
VERY_FEW = "very few"
MANY = "many"
VERY_MANY = "very many"
UNIQUE = "unique"
class great_expectations.profile.base.ProfilerTypeMapping

Useful backend type mapping for building profilers.

INT_TYPE_NAMES = ['INTEGER', 'integer', 'int', 'int_', 'int8', 'int16', 'int32', 'int64', 'uint8', 'uint16', 'uint32', 'uint64', 'INT', 'TINYINT', 'BYTEINT', 'SMALLINT', 'BIGINT', 'IntegerType', 'LongType', 'DECIMAL']
FLOAT_TYPE_NAMES = ['FLOAT', 'DOUBLE', 'FLOAT4', 'FLOAT8', 'DOUBLE_PRECISION', 'NUMERIC', 'FloatType', 'DoubleType', 'float_', 'float16', 'float32', 'float64', 'number', 'DECIMAL']
STRING_TYPE_NAMES = ['CHAR', 'VARCHAR', 'NVARCHAR', 'TEXT', 'STRING', 'StringType', 'string', 'str']
BOOLEAN_TYPE_NAMES = ['BOOLEAN', 'boolean', 'BOOL', 'TINYINT', 'BIT', 'bool', 'BooleanType']
DATETIME_TYPE_NAMES = ['DATETIME', 'DATE', 'TIME', 'TIMESTAMP', 'DateType', 'TimestampType', 'datetime64', 'Timestamp', 'datetime64[ns]']
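The lists above can be used to look up which profiler data type a backend type name may belong to. The sketch below copies the lists into a plain dictionary so that it runs without the library. The dictionary keys and the `candidate_types` helper are hypothetical and are not part of `ProfilerTypeMapping`. Note that some names appear in more than one list (`TINYINT`, `DECIMAL`), so a lookup can return several candidates.

```python
# Lists copied from the attributes documented above.
TYPE_NAMES = {
    "int": [
        "INTEGER", "integer", "int", "int_", "int8", "int16", "int32",
        "int64", "uint8", "uint16", "uint32", "uint64", "INT", "TINYINT",
        "BYTEINT", "SMALLINT", "BIGINT", "IntegerType", "LongType", "DECIMAL",
    ],
    "float": [
        "FLOAT", "DOUBLE", "FLOAT4", "FLOAT8", "DOUBLE_PRECISION", "NUMERIC",
        "FloatType", "DoubleType", "float_", "float16", "float32", "float64",
        "number", "DECIMAL",
    ],
    "string": [
        "CHAR", "VARCHAR", "NVARCHAR", "TEXT", "STRING", "StringType",
        "string", "str",
    ],
    "boolean": [
        "BOOLEAN", "boolean", "BOOL", "TINYINT", "BIT", "bool", "BooleanType",
    ],
    "datetime": [
        "DATETIME", "DATE", "TIME", "TIMESTAMP", "DateType", "TimestampType",
        "datetime64", "Timestamp", "datetime64[ns]",
    ],
}


def candidate_types(backend_type_name):
    """Return every category whose list contains the exact type name."""
    return [
        category
        for category, names in TYPE_NAMES.items()
        if backend_type_name in names
    ]


varchar_types = candidate_types("VARCHAR")
tinyint_types = candidate_types("TINYINT")
decimal_types = candidate_types("DECIMAL")
lowercase_varchar_types = candidate_types("varchar")
```

The match is exact and case-sensitive, so `"varchar"` finds nothing here even though `"VARCHAR"` is listed. A caller that needs case-insensitive matching would have to normalize names itself.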
great_expectations.profile.base.profiler_data_types_with_mapping
great_expectations.profile.base.profiler_semantic_types
class great_expectations.profile.base.Profiler(configuration: dict = None)

Profilers create suites from various sources of truth.

These sources of truth can be data or non-data sources such as DDLs.

When implementing a Profiler, ensure that you:

- Implement a ._profile() method
- Optionally implement a .validate() method that verifies you are running on the right kind of object. You should raise an appropriate Exception if the object is not valid.

validate(self, item_to_validate: Any)
profile(self, item_to_profile: Any, suite_name: str = None)
abstract _profile(self, item_to_profile: Any, suite_name: str = None)
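The implementation guidance above can be sketched as follows. To keep the example runnable without the library, it uses a stand-in base class, `ProfilerSketch`, instead of the real `Profiler`. Three things here are assumptions for illustration and are not confirmed by this reference: that `profile()` calls `validate()` before delegating to `_profile()`, the hypothetical `ColumnListProfiler` subclass, and the plain dictionary standing in for an expectation suite.

```python
from abc import ABC, abstractmethod
from typing import Any, Optional


class ProfilerSketch(ABC):
    """Stand-in base class mirroring the signatures documented above."""

    def __init__(self, configuration: Optional[dict] = None):
        self.configuration = configuration

    def validate(self, item_to_validate: Any) -> None:
        """Accept anything by default; subclasses may override."""

    def profile(self, item_to_profile: Any, suite_name: Optional[str] = None):
        # Assumed flow: check the input first, then build the suite.
        self.validate(item_to_profile)
        return self._profile(item_to_profile, suite_name=suite_name)

    @abstractmethod
    def _profile(self, item_to_profile: Any, suite_name: Optional[str] = None):
        """Build and return a suite for the given item."""


class ColumnListProfiler(ProfilerSketch):
    """Hypothetical profiler that expects a dict of column name -> values."""

    def validate(self, item_to_validate: Any) -> None:
        if not isinstance(item_to_validate, dict):
            raise TypeError("ColumnListProfiler requires a dict of columns")

    def _profile(self, item_to_profile: Any, suite_name: Optional[str] = None):
        # A plain dict stands in for an expectation suite in this sketch.
        return {
            "suite_name": suite_name or "default",
            "expectations": [
                {
                    "expectation_type": "expect_column_to_exist",
                    "kwargs": {"column": column},
                }
                for column in item_to_profile
            ],
        }


suite = ColumnListProfiler().profile({"id": [1, 2], "name": ["a", "b"]}, suite_name="demo")
```

Passing anything other than a dict, such as a list, raises `TypeError` from `validate()` before `_profile()` runs, which is the kind of early check the guidance recommends.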
class great_expectations.profile.base.DataAssetProfiler
classmethod validate(cls, data_asset)
class great_expectations.profile.base.DatasetProfiler

Bases: great_expectations.profile.base.DataAssetProfiler

classmethod validate(cls, dataset)
classmethod add_expectation_meta(cls, expectation)
classmethod add_meta(cls, expectation_suite, batch_kwargs=None)
classmethod profile(cls, data_asset, run_id=None, profiler_configuration=None, run_name=None, run_time=None)
abstract classmethod _profile(cls, dataset, configuration=None)