How to quickly connect to a single file using Pandas
In this guide we will demonstrate how to use Pandas to connect to data stored in files on a filesystem. In this example we will specifically be connecting to data in .csv format. However, GX supports most read methods available through Pandas.
- A Great Expectations instance. See Install Great Expectations locally.
- A Data Context.
- Access to source data stored in a filesystem.
1. Import the Great Expectations module and instantiate a Data Context
The code to import Great Expectations and instantiate a Data Context is:
import great_expectations as gx
context = gx.get_context()
2. Specify a file to read into a Data Asset
Great Expectations supports reading the data in individual files directly into a Validator using Pandas. To do this, we will run the code:
validator = context.sources.pandas_default.read_csv(
In this example, we are connecting to a csv file. However, Great Expectations supports connecting to most types of files that Pandas has read_* methods for.
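As a minimal sketch of how the read_csv call might look when written out in full, the helper below wraps context.sources.pandas_default.read_csv in a small function. The helper name read_csv_validator and the file path in the usage comment are hypothetical and do not come from this guide. Forwarding extra keyword arguments to the underlying Pandas reader is an assumption that may depend on your installed GX version.

```python
def read_csv_validator(context, path, **read_csv_kwargs):
    """Read a single CSV file into a Validator via the default Pandas Data Source.

    context: a Data Context, for example the result of gx.get_context().
    path: location of the CSV file (supply your own).
    read_csv_kwargs: assumed to be passed through to the Pandas CSV reader.
    """
    return context.sources.pandas_default.read_csv(path, **read_csv_kwargs)


# Usage (requires Great Expectations to be installed):
#
#   import great_expectations as gx
#   context = gx.get_context()
#   # "data/my_file.csv" is a placeholder path, not one taken from this guide.
#   validator = read_csv_validator(context, "data/my_file.csv")
```

Wrapping the call keeps the file path and any reader options in one place, which can be convenient if you read several individual files the same way.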
Because you will be using Pandas to connect to these files, the specific add_*_asset methods that will be available to you will be determined by your currently installed version of Pandas.
For more information on which Pandas read_* methods are available to you as add_*_asset methods, please reference the official Pandas Input/Output documentation for the version of Pandas that you have installed.
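One way to see which read_* methods your installed Pandas exposes is to inspect the module directly. The sketch below is illustrative only: the function names list_read_methods and to_asset_method_name are hypothetical, and the read_X to add_X_asset naming is inferred from the pattern described in this guide. Because GX supports most but not necessarily all Pandas readers, treat the result as a list of candidates rather than a guarantee.

```python
def list_read_methods(module):
    """Return the sorted names of callables on `module` that start with 'read_'."""
    return sorted(
        name
        for name in dir(module)
        if name.startswith("read_") and callable(getattr(module, name))
    )


def to_asset_method_name(read_name):
    """Map a reader name such as 'read_csv' to the assumed 'add_csv_asset' form."""
    return "add_" + read_name[len("read_"):] + "_asset"


# Usage (requires Pandas to be installed):
#
#   import pandas as pd
#   for name in list_read_methods(pd):
#       print(name, "->", to_asset_method_name(name))
```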
In the GX Python API, add_*_asset methods will require the same parameters as the corresponding Pandas read_* method, with one caveat: In Great Expectations, you will also be required to provide a value for an
Now that you have a Validator, you can immediately move on to creating Expectations. For more information, please see: