great_expectations.datasource.pandas_datasource

Module Contents

Classes

PandasDatasource(name='pandas', data_context=None, data_asset_type=None, batch_kwargs_generators=None, boto3_options=None, reader_method=None, reader_options=None, limit=None, **kwargs)

The PandasDatasource produces PandasDataset objects and supports generators capable of interacting with the local filesystem (the default subdir_reader generator) and with existing in-memory dataframes.

great_expectations.datasource.pandas_datasource.logger
great_expectations.datasource.pandas_datasource.HASH_THRESHOLD = 1000000000.0
class great_expectations.datasource.pandas_datasource.PandasDatasource(name='pandas', data_context=None, data_asset_type=None, batch_kwargs_generators=None, boto3_options=None, reader_method=None, reader_options=None, limit=None, **kwargs)

Bases: great_expectations.datasource.datasource.Datasource

The PandasDatasource produces PandasDataset objects and supports generators capable of interacting with the local filesystem (the default subdir_reader generator) and with existing in-memory dataframes.

recognized_batch_parameters
classmethod build_configuration(cls, data_asset_type=None, batch_kwargs_generators=None, boto3_options=None, reader_method=None, reader_options=None, limit=None, **kwargs)

Build a full configuration object for a datasource, potentially including generators with defaults.

Parameters
  • data_asset_type – A ClassConfig dictionary

  • batch_kwargs_generators – Generator configuration dictionary

  • boto3_options – Optional dictionary with key-value pairs to pass to boto3 during instantiation.

  • reader_method – Optional default reader_method for generated batches

  • reader_options – Optional default reader_options for generated batches

  • limit – Optional default limit for generated batches

  • **kwargs – Additional kwargs to be part of the datasource constructor’s initialization

Returns

A complete datasource configuration.
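
The shape of the returned configuration can be pictured with a small stand-in. The sketch below is not the library's code: the function name sketch_build_configuration and the default data_asset_type value are assumptions made for illustration. It only shows the general idea that optional arguments appear in the result when they are supplied, alongside any extra kwargs.

```python
def sketch_build_configuration(
    data_asset_type=None,
    batch_kwargs_generators=None,
    boto3_options=None,
    reader_method=None,
    reader_options=None,
    limit=None,
    **kwargs,
):
    """Hypothetical stand-in: assemble a plain-dict datasource configuration."""
    # Assumed default for illustration; the real default may differ.
    if data_asset_type is None:
        data_asset_type = {"class_name": "PandasDataset"}

    # Extra kwargs become part of the configuration as-is.
    configuration = dict(kwargs)
    configuration["data_asset_type"] = data_asset_type

    # Optional settings are included only when the caller supplied them.
    optional = {
        "batch_kwargs_generators": batch_kwargs_generators,
        "boto3_options": boto3_options,
        "reader_method": reader_method,
        "reader_options": reader_options,
        "limit": limit,
    }
    for key, value in optional.items():
        if value is not None:
            configuration[key] = value
    return configuration


config = sketch_build_configuration(
    reader_method="read_csv",
    reader_options={"sep": ";"},
    limit=1000,
)
```

Here config carries reader_method, reader_options, limit and data_asset_type, and has no boto3_options key because none was passed.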

process_batch_parameters(self, reader_method=None, reader_options=None, limit=None, dataset_options=None)

Use datasource-specific configuration to translate any batch parameters into batch kwargs at the datasource level.

Parameters
  • limit (int) – a parameter all datasources must accept to allow limiting a batch to a smaller number of rows.

  • dataset_options (dict) – a set of kwargs that will be passed to the constructor of a dataset built using these batch_kwargs

Returns

Result will include both parameters passed via argument and configured parameters.

Return type

batch_kwargs
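
A minimal sketch of combining configured parameters with those passed as arguments follows. It rests on assumptions that the description above does not confirm: that call-time arguments take precedence over configured defaults, and that reader_options are merged key by key. The function name sketch_process_batch_parameters and the plain-dict defaults argument are hypothetical.

```python
def sketch_process_batch_parameters(
    configured,
    reader_method=None,
    reader_options=None,
    limit=None,
    dataset_options=None,
):
    """Hypothetical stand-in: combine configured defaults with call-time arguments."""
    # Start from a filtered copy so the configured defaults are never mutated.
    batch_kwargs = {k: v for k, v in configured.items() if v is not None}

    # Assumption: reader_options are merged key by key, call-time keys winning.
    merged_options = dict(configured.get("reader_options") or {})
    merged_options.update(reader_options or {})
    if merged_options:
        batch_kwargs["reader_options"] = merged_options

    # Assumption: other parameters passed by the caller override the defaults.
    passed = {
        "reader_method": reader_method,
        "limit": limit,
        "dataset_options": dataset_options,
    }
    for key, value in passed.items():
        if value is not None:
            batch_kwargs[key] = value
    return batch_kwargs


defaults = {"reader_method": "read_csv", "reader_options": {"sep": ","}, "limit": None}
batch_kwargs = sketch_process_batch_parameters(
    defaults, reader_options={"nrows": 5}, limit=100
)
```

With these inputs the result keeps the configured reader_method, merges both sets of reader_options, and takes limit from the call.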

get_batch(self, batch_kwargs, batch_parameters=None)

Get a batch of data from the datasource.

Parameters
  • batch_kwargs – the BatchKwargs to use to construct the batch

  • batch_parameters – optional parameters to store as the reference description of the batch. They should be the parameters that would produce the passed BatchKwargs.

Returns

Batch

static guess_reader_method_from_path(path)
_get_reader_fn(self, reader_method=None, path=None)

Static helper for parsing reader types. If reader_method is not provided, path will be used to guess the correct reader_method.

Parameters
  • reader_method (str) – the name of the reader method to use, if available.

  • path (str) – the path to use to guess the reader_method

Returns

ReaderMethod to use for the filepath
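
The fallback from an explicit reader_method to a guess based on the path can be sketched as below. This is an illustration, not the library's logic: sketch_pick_reader_name and the extension table are assumptions, and the extensions actually recognized may differ. The values in the table are names of pandas reader functions.

```python
import os

# Illustrative extension-to-reader table. The values are pandas reader
# function names; the set of extensions covered here is an assumption.
_READERS_BY_EXTENSION = {
    ".csv": "read_csv",
    ".tsv": "read_csv",
    ".parquet": "read_parquet",
    ".json": "read_json",
    ".xlsx": "read_excel",
}


def sketch_pick_reader_name(reader_method=None, path=None):
    """Hypothetical stand-in: prefer an explicit reader, else guess from the path."""
    # An explicitly requested reader always wins over guessing.
    if reader_method is not None:
        return reader_method
    if path is None:
        raise ValueError("Either reader_method or path must be provided.")

    # Guess from the file extension, ignoring case.
    extension = os.path.splitext(path)[1].lower()
    try:
        return _READERS_BY_EXTENSION[extension]
    except KeyError:
        raise ValueError(f"Cannot guess a reader for {path!r}") from None


guessed = sketch_pick_reader_name(path="data/events.PARQUET")
explicit = sketch_pick_reader_name(reader_method="read_excel", path="data/events.csv")
```

In this sketch an unrecognized extension raises ValueError rather than silently picking a reader, so a bad guess cannot go unnoticed.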