great_expectations.util
Module Contents
Functions
profile |
measure_execution_time |
get_project_distribution |
get_currently_executing_function |
get_currently_executing_function_call_arguments |
verify_dynamic_loading_support |
import_library_module |
is_library_loadable |
load_class |
_convert_to_dataset_class | Convert a (pandas) dataframe to a great_expectations dataset, with (optional) expectation_suite
_load_and_convert_to_dataset_class | Convert a (pandas) dataframe to a great_expectations dataset, with (optional) expectation_suite
read_csv | Read a file using Pandas read_csv and return a great_expectations dataset.
read_json | Read a file using Pandas read_json and return a great_expectations dataset.
read_excel | Read a file using Pandas read_excel and return a great_expectations dataset.
read_table | Read a file using Pandas read_table and return a great_expectations dataset.
read_feather | Read a file using Pandas read_feather and return a great_expectations dataset.
read_parquet | Read a file using Pandas read_parquet and return a great_expectations dataset.
from_pandas | Read a Pandas data frame and return a great_expectations dataset.
read_pickle | Read a file using Pandas read_pickle and return a great_expectations dataset.
validate | Validate the provided data asset.
gen_directory_tree_str | Print the structure of directory as a tree.
lint_code | Lint strings of code passed in. Optional dependency “black” must be installed.
filter_properties_dict | Filter the entries of the source dictionary according to directives concerning the existing keys and values.
is_numeric |
is_int |
is_float |
is_parseable_date |
get_context |
is_sane_slack_webhook | Really basic sanity checking.
is_list_of_strings |
generate_library_json_from_registered_expectations | Generate the JSON object used to populate the public gallery

great_expectations.util.logger

great_expectations.util.profile(func: Callable = None) → Callable

great_expectations.util.measure_execution_time(func: Callable = None) → Callable

great_expectations.util.get_project_distribution() → Optional[Distribution]

great_expectations.util.get_currently_executing_function() → Callable

great_expectations.util.get_currently_executing_function_call_arguments(include_module_name: bool = False, include_caller_names: bool = False, **kwargs) → dict
- Parameters
include_module_name – bool If True, module name will be determined and included in output dictionary (default is False)
include_caller_names – bool If True, arguments, such as “self” and “cls”, if present, will be included in output dictionary (default is False)
kwargs –
- Returns
dict Output dictionary, consisting of call arguments as attribute “name: value” pairs.
Example usage:
    # Gather the call arguments of the present function (include the “module_name” and add the “class_name”), filter
    # out the Falsy values, and set the instance “_config” variable equal to the resulting dictionary.
    self._config = get_currently_executing_function_call_arguments(
    )
    filter_properties_dict(properties=self._config, inplace=True)
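The idea of collecting a function's own call arguments can be sketched with the standard `inspect` module. This is a minimal illustration under stated assumptions, not the library's implementation: `call_arguments_sketch` and the `Store` class are hypothetical names, positional `*args` are ignored, and the values are read from the caller's local variables, so the helper should be called before those variables are reassigned.

```python
import inspect


def call_arguments_sketch(include_caller_names: bool = False, **extra) -> dict:
    """Hypothetical helper: return the arguments of the function that calls it."""
    frame = inspect.currentframe().f_back  # frame of the calling function
    try:
        info = inspect.getargvalues(frame)
        # Named parameters of the caller, read from its local variables.
        call_args = {name: info.locals[name] for name in info.args}
        # Merge the caller's **kwargs, if it declares any.
        if info.keywords:
            call_args.update(info.locals[info.keywords])
    finally:
        del frame  # avoid keeping a reference cycle through the frame
    if not include_caller_names:
        for caller_name in ("self", "cls"):
            call_args.pop(caller_name, None)
    # Extra keyword arguments are added to the output as additional entries.
    call_args.update(extra)
    return call_args


class Store:
    """Hypothetical class that records its own constructor arguments."""

    def __init__(self, name, retries=3, **options):
        self._config = call_arguments_sketch(class_name=self.__class__.__name__)


store = Store("orders", retries=5, region="eu")
print(store._config)
```

Here the extra `class_name` entry plays the role of the keyword arguments mentioned in the example usage above.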

great_expectations.util.verify_dynamic_loading_support(module_name: str, package_name: str = None) → None
- Parameters
module_name – a possibly-relative name of a module
package_name – the name of the package to which the given module belongs

great_expectations.util.import_library_module(module_name: str) → Optional[ModuleType]
- Parameters
module_name – a fully-qualified name of a module (e.g., “great_expectations.dataset.sqlalchemy_dataset”)
- Returns
the imported module object (None if it cannot be imported)
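The `Optional[ModuleType]` return type in the signature suggests an import that yields `None` on failure instead of raising. A minimal sketch of that pattern with the standard `importlib` module follows; `import_module_sketch` and `is_loadable_sketch` are hypothetical names, and the library's own error handling may differ.

```python
import importlib
from types import ModuleType
from typing import Optional


def import_module_sketch(module_name: str) -> Optional[ModuleType]:
    """Hypothetical helper: import a module by name, or return None on failure."""
    try:
        return importlib.import_module(module_name)
    except ImportError:  # also covers ModuleNotFoundError
        return None


def is_loadable_sketch(library_name: str) -> bool:
    """Hypothetical helper: report whether a library can be imported."""
    return import_module_sketch(library_name) is not None


print(is_loadable_sketch("json"))
print(is_loadable_sketch("surely_not_an_installed_module_xyz"))
```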

great_expectations.util.is_library_loadable(library_name: str) → bool

great_expectations.util.load_class(class_name: str, module_name: str)

great_expectations.util._convert_to_dataset_class(df, dataset_class, expectation_suite=None, profiler=None)
Convert a (pandas) dataframe to a great_expectations dataset, with (optional) expectation_suite.
- Parameters
df – the DataFrame object to convert
dataset_class – the class to which to convert the existing DataFrame
expectation_suite – the expectation suite that should be attached to the resulting dataset
profiler – the profiler to use to generate baseline expectations, if any
- Returns
A new Dataset object

great_expectations.util._load_and_convert_to_dataset_class(df, class_name, module_name, expectation_suite=None, profiler=None)
Convert a (pandas) dataframe to a great_expectations dataset, with (optional) expectation_suite.
- Parameters
df – the DataFrame object to convert
class_name (str) – class to which to convert resulting Pandas df
module_name (str) – dataset module from which to try to dynamically load the relevant module
expectation_suite – the expectation suite that should be attached to the resulting dataset
profiler – the profiler to use to generate baseline expectations, if any
- Returns
A new Dataset object

great_expectations.util.read_csv(filename, class_name='PandasDataset', module_name='great_expectations.dataset', dataset_class=None, expectation_suite=None, profiler=None, *args, **kwargs)
Read a file using Pandas read_csv and return a great_expectations dataset.
- Parameters
filename (string) – path to file to read
class_name (str) – class to which to convert resulting Pandas df
module_name (str) – dataset module from which to try to dynamically load the relevant module
dataset_class (Dataset) – If specified, the class to which to convert the resulting Dataset object; if not specified, try to load the class named via the class_name and module_name parameters
expectation_suite (string) – path to great_expectations expectation suite file
profiler (Profiler class) – profiler to use when creating the dataset (default is None)
- Returns
great_expectations dataset
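The interplay of `dataset_class`, `class_name`, and `module_name` described above (an explicit class wins; otherwise the class is looked up by name in the named module) can be sketched with the standard library alone. This is an illustration of the resolution rule, not the library's code: `resolve_class_sketch` is a hypothetical name, and `collections.OrderedDict` stands in for a dataset class so the example runs without great_expectations installed.

```python
import importlib
from collections import OrderedDict
from typing import Optional


def resolve_class_sketch(
    class_name: str, module_name: str, dataset_class: Optional[type] = None
) -> type:
    """Hypothetical helper: pick the class a reader should convert its result to."""
    # An explicitly supplied class takes precedence over the name-based lookup.
    if dataset_class is not None:
        return dataset_class
    # Otherwise import the module dynamically and fetch the class by name.
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


by_name = resolve_class_sketch("OrderedDict", "collections")
explicit = resolve_class_sketch("OrderedDict", "collections", dataset_class=dict)
print(by_name, explicit)
```

The same rule applies to the other `read_*` functions and to `from_pandas`, which share these three parameters.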

great_expectations.util.read_json(filename, class_name='PandasDataset', module_name='great_expectations.dataset', dataset_class=None, expectation_suite=None, accessor_func=None, profiler=None, *args, **kwargs)
Read a file using Pandas read_json and return a great_expectations dataset.
- Parameters
filename (string) – path to file to read
class_name (str) – class to which to convert resulting Pandas df
module_name (str) – dataset module from which to try to dynamically load the relevant module
dataset_class (Dataset) – If specified, the class to which to convert the resulting Dataset object; if not specified, try to load the class named via the class_name and module_name parameters
expectation_suite (string) – path to great_expectations expectation suite file
accessor_func (Callable) – function used to transform the JSON object in the file
profiler (Profiler class) – profiler to use when creating the dataset (default is None)
- Returns
great_expectations dataset

great_expectations.util.read_excel(filename, class_name='PandasDataset', module_name='great_expectations.dataset', dataset_class=None, expectation_suite=None, profiler=None, *args, **kwargs)
Read a file using Pandas read_excel and return a great_expectations dataset.
- Parameters
filename (string) – path to file to read
class_name (str) – class to which to convert resulting Pandas df
module_name (str) – dataset module from which to try to dynamically load the relevant module
dataset_class (Dataset) – If specified, the class to which to convert the resulting Dataset object; if not specified, try to load the class named via the class_name and module_name parameters
expectation_suite (string) – path to great_expectations expectation suite file
profiler (Profiler class) – profiler to use when creating the dataset (default is None)
- Returns
great_expectations dataset or ordered dict of great_expectations datasets, if multiple worksheets are imported

great_expectations.util.read_table(filename, class_name='PandasDataset', module_name='great_expectations.dataset', dataset_class=None, expectation_suite=None, profiler=None, *args, **kwargs)
Read a file using Pandas read_table and return a great_expectations dataset.
- Parameters
filename (string) – path to file to read
class_name (str) – class to which to convert resulting Pandas df
module_name (str) – dataset module from which to try to dynamically load the relevant module
dataset_class (Dataset) – If specified, the class to which to convert the resulting Dataset object; if not specified, try to load the class named via the class_name and module_name parameters
expectation_suite (string) – path to great_expectations expectation suite file
profiler (Profiler class) – profiler to use when creating the dataset (default is None)
- Returns
great_expectations dataset

great_expectations.util.read_feather(filename, class_name='PandasDataset', module_name='great_expectations.dataset', dataset_class=None, expectation_suite=None, profiler=None, *args, **kwargs)
Read a file using Pandas read_feather and return a great_expectations dataset.
- Parameters
filename (string) – path to file to read
class_name (str) – class to which to convert resulting Pandas df
module_name (str) – dataset module from which to try to dynamically load the relevant module
dataset_class (Dataset) – If specified, the class to which to convert the resulting Dataset object; if not specified, try to load the class named via the class_name and module_name parameters
expectation_suite (string) – path to great_expectations expectation suite file
profiler (Profiler class) – profiler to use when creating the dataset (default is None)
- Returns
great_expectations dataset

great_expectations.util.read_parquet(filename, class_name='PandasDataset', module_name='great_expectations.dataset', dataset_class=None, expectation_suite=None, profiler=None, *args, **kwargs)
Read a file using Pandas read_parquet and return a great_expectations dataset.
- Parameters
filename (string) – path to file to read
class_name (str) – class to which to convert resulting Pandas df
module_name (str) – dataset module from which to try to dynamically load the relevant module
dataset_class (Dataset) – If specified, the class to which to convert the resulting Dataset object; if not specified, try to load the class named via the class_name and module_name parameters
expectation_suite (string) – path to great_expectations expectation suite file
profiler (Profiler class) – profiler to use when creating the dataset (default is None)
- Returns
great_expectations dataset

great_expectations.util.from_pandas(pandas_df, class_name='PandasDataset', module_name='great_expectations.dataset', dataset_class=None, expectation_suite=None, profiler=None)
Read a Pandas data frame and return a great_expectations dataset.
- Parameters
pandas_df (Pandas df) – Pandas data frame
class_name (str) – class to which to convert resulting Pandas df
module_name (str) – dataset module from which to try to dynamically load the relevant module
dataset_class (Dataset) – If specified, the class to which to convert the resulting Dataset object; if not specified, try to load the class named via the class_name and module_name parameters
expectation_suite (string) – path to great_expectations expectation suite file
profiler (profiler class) – The profiler that should be run on the dataset to establish a baseline expectation suite.
- Returns
great_expectations dataset

great_expectations.util.read_pickle(filename, class_name='PandasDataset', module_name='great_expectations.dataset', dataset_class=None, expectation_suite=None, profiler=None, *args, **kwargs)
Read a file using Pandas read_pickle and return a great_expectations dataset.
- Parameters
filename (string) – path to file to read
class_name (str) – class to which to convert resulting Pandas df
module_name (str) – dataset module from which to try to dynamically load the relevant module
dataset_class (Dataset) – If specified, the class to which to convert the resulting Dataset object; if not specified, try to load the class named via the class_name and module_name parameters
expectation_suite (string) – path to great_expectations expectation suite file
profiler (Profiler class) – profiler to use when creating the dataset (default is None)
- Returns
great_expectations dataset

great_expectations.util.validate(data_asset, expectation_suite=None, data_asset_name=None, expectation_suite_name=None, data_context=None, data_asset_class_name=None, data_asset_module_name='great_expectations.dataset', data_asset_class=None, *args, **kwargs)
Validate the provided data asset. Validate can accept an optional data_asset_name to apply, data_context to use to fetch an expectation_suite if one is not provided, and data_asset_class_name/data_asset_module_name or data_asset_class to use to provide custom expectations.
- Parameters
data_asset – the asset to validate
expectation_suite – the suite to use, or None to fetch one using a DataContext
data_asset_name – the name of the data asset to use
expectation_suite_name – the name of the expectation_suite to use
data_context – data context to use to fetch an expectation suite, or the path from which to obtain one
data_asset_class_name – the name of a class to dynamically load a DataAsset class
data_asset_module_name – the name of the module to dynamically load a DataAsset class
data_asset_class – a class to use; overrides data_asset_class_name/data_asset_module_name if provided
*args –
**kwargs –
Returns:

great_expectations.util.gen_directory_tree_str(startpath)
Print the structure of a directory as a tree.
Ex:
project_dir0/
    AAA/
    BBB/
        aaa.txt
        bbb.txt
Note: files and directories are sorted alphabetically, so that this method can be used for testing.
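A deterministic, alphabetically sorted tree of this kind can be built with `os.walk`. The sketch below is one possible approach rather than the library's implementation; `directory_tree_sketch` is a hypothetical name, the four-space indent is an assumption, and the directory layout mirrors the example above.

```python
import os
import tempfile


def directory_tree_sketch(startpath: str) -> str:
    """Hypothetical helper: render a directory as an indented, sorted tree."""
    startpath = os.path.normpath(startpath)
    lines = []
    for root, dirs, files in os.walk(startpath):
        dirs.sort()  # sorting in place fixes the order in which os.walk descends
        rel = os.path.relpath(root, startpath)
        depth = 0 if rel == "." else rel.count(os.sep) + 1
        lines.append("    " * depth + os.path.basename(root) + "/")
        for name in sorted(files):
            lines.append("    " * (depth + 1) + name)
    return "\n".join(lines)


with tempfile.TemporaryDirectory() as tmp:
    project = os.path.join(tmp, "project_dir0")
    os.makedirs(os.path.join(project, "AAA"))
    os.makedirs(os.path.join(project, "BBB"))
    # Create the files out of order to show that the output is sorted.
    for name in ("bbb.txt", "aaa.txt"):
        with open(os.path.join(project, "BBB", name), "w") as handle:
            handle.write("")
    tree = directory_tree_sketch(project)

print(tree)
```

Because the output does not depend on filesystem ordering, the resulting string can be compared directly against an expected value in a test.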

great_expectations.util.lint_code(code: str) → str
Lint strings of code passed in. Optional dependency “black” must be installed.

great_expectations.util.filter_properties_dict(properties: dict, keep_fields: Optional[list] = None, delete_fields: Optional[list] = None, clean_empty: Optional[bool] = True, inplace: Optional[bool] = False) → Optional[dict]
Filter the entries of the source dictionary according to directives concerning the existing keys and values.
- Parameters
properties – source dictionary to be filtered according to the supplied filtering directives
keep_fields – list of keys that must be retained, with the understanding that all other entries will be deleted
delete_fields – list of keys that must be deleted, with the understanding that all other entries will be retained
clean_empty – If True, then in addition to other filtering directives, delete entries whose values are Falsy
inplace – If True, then modify the source properties dictionary; otherwise, make a copy for filtering purposes
- Returns
The (possibly) filtered properties dictionary (or None if no entries remain after filtering is performed)
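The filtering rules described by these parameters can be illustrated with a small pure-Python sketch. It follows only the behavior stated above and is not the library's implementation: `filter_properties_sketch` is a hypothetical name, and rejecting `keep_fields` together with `delete_fields` is an assumption, since the two directives contradict each other.

```python
import copy
from typing import Optional


def filter_properties_sketch(
    properties: dict,
    keep_fields: Optional[list] = None,
    delete_fields: Optional[list] = None,
    clean_empty: bool = True,
    inplace: bool = False,
) -> Optional[dict]:
    """Hypothetical helper: filter a dictionary by keys and by falsy values."""
    if keep_fields and delete_fields:
        # Assumption: the two directives are treated as mutually exclusive.
        raise ValueError("Supply keep_fields or delete_fields, not both.")
    if not inplace:
        properties = copy.deepcopy(properties)  # leave the caller's dict untouched
    if keep_fields:
        for key in list(properties):
            if key not in keep_fields:
                del properties[key]
    if delete_fields:
        for key in delete_fields:
            properties.pop(key, None)
    if clean_empty:
        # Falsy values include None, "", [], {}, and also 0 and False.
        for key in [k for k, v in properties.items() if not v]:
            del properties[key]
    return properties or None  # None when nothing remains


config = {"name": "orders", "retries": 0, "tags": [], "region": "eu"}
filtered = filter_properties_sketch(config)
print(filtered)
print(filter_properties_sketch(config, keep_fields=["tags"]))
```

Note that with `clean_empty` enabled, a meaningful `0` or `False` is dropped along with empty containers, which is worth keeping in mind when filtering numeric settings.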

great_expectations.util.is_numeric(value: Any) → bool

great_expectations.util.is_int(value: Any) → bool

great_expectations.util.is_float(value: Any) → bool

great_expectations.util.is_parseable_date(value: Any, fuzzy: bool = False) → bool

great_expectations.util.get_context()

great_expectations.util.is_sane_slack_webhook(url: str) → bool
Really basic sanity checking.

great_expectations.util.is_list_of_strings(_list) → bool

great_expectations.util.generate_library_json_from_registered_expectations()
Generate the JSON object used to populate the public gallery.