How to set up Great Expectations to work with general SQL databases
The Great Expectations CLI is no longer the preferred method for implementing and configuring Great Expectations. This topic will be updated soon to reflect this change. For more information, see A fond farewell to the CLI.
This guide will walk you through best practices for creating your GX Python environment and demonstrate how to locally install Great Expectations along with the necessary dependencies for working with SQL databases.
- A supported version of Python (3.7 to 3.10). To download and install Python, see Python downloads.
- The ability to install Python modules with pip
1. Check your Python version
You can check your version of Python by running:
python --version
GX currently supports Python versions 3.7 to 3.10.
Depending on your installation and configuration of Python 3, you may find that executing Python commands from the terminal by calling python doesn't work as desired. If a command using python does not work, try using python3 instead.
If this produces the desired result, simply replace python with python3 in our example terminal commands.
If this does not work, you may need to look into your Python 3 installation or configuration.
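The same check can be made from inside Python, which is handy when it is unclear which interpreter a given command resolves to. The following is a minimal sketch; the function name is_supported is illustrative and not part of GX, and the range is the one stated in this guide.

```python
import sys

# Supported range as stated in this guide (inclusive on both ends).
MIN_SUPPORTED = (3, 7)
MAX_SUPPORTED = (3, 10)


def is_supported(version_info=sys.version_info):
    """Return True if the major.minor version falls in the supported range."""
    major_minor = tuple(version_info[:2])
    return MIN_SUPPORTED <= major_minor <= MAX_SUPPORTED


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    status = "supported" if is_supported() else "outside the supported range"
    print(f"Python {current} is {status} for this guide.")
```

Run it with whichever of python or python3 you plan to use for the rest of the guide, so that the version reported is the one that will actually be used.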
2. Create a Python virtual environment
As a best practice, we recommend using a virtual environment to partition your GX installation from any other Python projects that may exist on the same system. This ensures that there will not be dependency conflicts between the GX installation and other Python projects.
Once we have confirmed that Python 3 is installed locally, we can create a virtual environment with venv.
We have chosen to use venv for virtual environments in this guide because it is included with Python 3. You are not limited to using venv, and can just as easily install Great Expectations into virtual environments created with other tools, such as virtualenv.
We will create our virtual environment by running:
python -m venv my_venv
This command will create a new directory called my_venv. Our virtual environment will be located in this directory.
In order to activate the virtual environment we will run:
source my_venv/bin/activate
You can name your virtual environment anything you like. Simply replace my_venv in the examples above with the name that you would like to use.
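After activation, one way to confirm that the interpreter is running inside the virtual environment is to compare sys.prefix with sys.base_prefix. This is a small sketch that relies on how venv sets these values; the helper name in_virtualenv is illustrative, and other environment tools may behave differently.

```python
import sys


def in_virtualenv():
    """Return True when the running interpreter belongs to a virtual environment.

    Inside a venv, sys.prefix points at the environment directory while
    sys.base_prefix still points at the base Python installation.
    """
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
    print("interpreter prefix:", sys.prefix)
```

If the environment is active, the printed prefix should point into the my_venv directory (or whatever name you chose).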
3. Install GX with optional dependencies for SQL databases
To install Great Expectations with the optional dependencies needed to work with SQL databases we execute the following pip command from the terminal:
pip install 'great_expectations[sqlalchemy]'
The above pip instruction will install GX with basic SQL support through SQLAlchemy. However, certain SQL dialects require additional dependencies. Depending on the SQL database type you will be working with, you may wish to use one of the following installation commands instead:
- AWS Athena: pip install 'great_expectations[athena]'
- BigQuery: pip install 'great_expectations[bigquery]'
- MSSQL: pip install 'great_expectations[mssql]'
- PostgreSQL: pip install 'great_expectations[postgresql]'
- Redshift: pip install 'great_expectations[redshift]'
- Snowflake: pip install 'great_expectations[snowflake]'
- Trino: pip install 'great_expectations[trino]'
Great Expectations does not currently support SQLAlchemy 2.0.
If you install SQLAlchemy independently of the above pip commands, be certain to install the most recent SQLAlchemy version prior to 2.0.
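If SQLAlchemy was installed separately, the installed version can be checked against the 2.0 limit before going further. The sketch below assumes Python 3.8 or later, where importlib.metadata is part of the standard library; the helper names are illustrative and not part of GX or SQLAlchemy.

```python
from importlib import metadata  # standard library in Python 3.8+


def is_pre_2_0(version):
    """Return True if a version string such as '1.4.49' has a major version below 2."""
    major = int(version.split(".")[0])
    return major < 2


def installed_sqlalchemy_version():
    """Return the installed SQLAlchemy version string, or None if it is not installed."""
    try:
        return metadata.version("sqlalchemy")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_sqlalchemy_version()
    if found is None:
        print("SQLAlchemy is not installed in this environment.")
    elif is_pre_2_0(found):
        print(f"SQLAlchemy {found} is earlier than 2.0.")
    else:
        print(f"SQLAlchemy {found} is 2.0 or later; this guide calls for a pre-2.0 release.")
```

If the check reports 2.0 or later, a pip version specifier such as 'sqlalchemy<2.0' is one way to request an earlier release.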
4. Verify that GX has been installed correctly
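You can verify that GX installed successfully with the CLI command:
great_expectations --version
If GX was successfully installed, the output will be similar to the following (the version number depends on the release you installed):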
great_expectations, version 0.16.15
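The installation can also be checked from Python, inside the activated virtual environment. This sketch assumes the package exposes a __version__ attribute; the helper name gx_version is illustrative.

```python
def gx_version():
    """Return the installed Great Expectations version, or None if it cannot be imported."""
    try:
        import great_expectations as gx
    except ImportError:
        return None
    return gx.__version__


if __name__ == "__main__":
    version = gx_version()
    if version is None:
        print("great_expectations could not be imported; check that the virtual environment is active.")
    else:
        print(f"great_expectations, version {version}")
```

A result of None usually means that the interpreter being run is not the one the package was installed into.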
5. Set up credentials
Different SQL dialects have different requirements for connection strings and methods of configuring credentials. By default, GX allows you to define credentials as environment variables or as values in your Data Context (once you have initialized one).
There may also be third-party utilities for setting up credentials of a given SQL database type. For more information on setting up credentials for a given source database, please reference the official documentation for that SQL dialect as well as our guide on [how to set up credentials](/docs/guides/setup/configuring_data_contexts/how_to_configure_credentials).
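As an illustration of the environment-variable approach, the sketch below assembles a SQLAlchemy-style connection string of the general form dialect+driver://user:password@host:port/database from environment variables. The variable names (MY_DB_DRIVER and so on) and the helper name are hypothetical, and the exact URL format for your database may differ, so check the documentation for your dialect.

```python
import os
from urllib.parse import quote_plus


def connection_string_from_env(prefix="MY_DB", environ=os.environ):
    """Build a SQLAlchemy-style URL from <prefix>_* environment variables.

    All variable names here are hypothetical; use whatever names fit your setup.
    """
    required = ["DRIVER", "USER", "PASSWORD", "HOST", "PORT", "NAME"]
    missing = [f"{prefix}_{key}" for key in required if f"{prefix}_{key}" not in environ]
    if missing:
        raise KeyError(f"missing environment variables: {', '.join(missing)}")
    values = {key: environ[f"{prefix}_{key}"] for key in required}
    # Percent-encode the user name and password so characters like '@' or '/'
    # do not break the URL structure.
    user = quote_plus(values["USER"])
    password = quote_plus(values["PASSWORD"])
    return (
        f"{values['DRIVER']}://{user}:{password}"
        f"@{values['HOST']}:{values['PORT']}/{values['NAME']}"
    )


if __name__ == "__main__":
    example_env = {
        "MY_DB_DRIVER": "postgresql+psycopg2",
        "MY_DB_USER": "gx_user",
        "MY_DB_PASSWORD": "p@ss/word",
        "MY_DB_HOST": "localhost",
        "MY_DB_PORT": "5432",
        "MY_DB_NAME": "analytics",
    }
    print(connection_string_from_env(environ=example_env))
```

Keeping the password in the environment rather than in a script or configuration file means it does not end up in version control.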
Now that you have installed GX with the necessary dependencies for working with SQL databases, you are ready to initialize your Data Context (the primary entry point for a Great Expectations deployment, with configurations and methods for all supporting components). The Data Context will contain your configurations for GX components, as well as provide you with access to GX's Python API.
To quickly create a Data Context and dive into working with GX, please see:
To initialize a Data Context on your filesystem, please reference:
To work with a temporary, in-memory Data Context, see: