How to connect to a SQL database
In this guide we will demonstrate how to connect Great Expectations to SQL databases. GX uses SQLAlchemy to connect to SQL data, and therefore supports most SQL dialects that SQLAlchemy does. For more information on the SQL dialects supported by SQLAlchemy, please see SQLAlchemy's official documentation on dialects.
- An installation of GX set up to work with SQL
- Source data stored in a SQL database
1. Import GX and instantiate a Data Context
The code to import Great Expectations and instantiate a Data Context is:
import great_expectations as gx
context = gx.get_context()
2. Determine your connection string
GX supports a variety of different SQL source data systems. However, most SQL dialects have their own specifications for how to define a connection string. You should reference the corresponding dialect's official documentation to determine the connection string for your SQL Database.
The following are examples of connection strings for different SQL dialects:
- AWS Athena:
For purposes of this guide's examples, we will connect to a PostGreSQL database. Here is an example of our connection string, stored in the variable
connection_string with plain text credentials:
connection_string = "postgresql+psycopg2://username:my_password@localhost/test"
You can use either environment variables or a key in
config_variables.yml to safely store any passwords needed by your connection string. After defining your password in one of those ways, you can reference it in your connection string like this:
connection_string = (
In the above example
MY_PASSWORD would be the name of the environment variable or the key to the value in
config_variables.yml that corresponds to your password.
If you include a password as plain text in your connection string when you define your Datasource, GX will automatically strip it out, add it to
config_variables.yml and substitute it in the Datasource's saved configuration with a variable as was shown above.
3. Create a SQL Datasource
Creating a SQL Datasource is as simple as providing the
add_sql(...) method a
name by which to reference it in the future and the
connection_string with which to access it.
datasource = context.sources.add_sql(
Now that you have connected to a SQL database, next you will want to: