Connect GX Cloud to Microsoft SQL Server
To connect GX Cloud to data stored in Microsoft SQL Server, you can use the GX Cloud UI or the GX Cloud API.
- UI
- API
Prerequisites
- A GX Cloud account with Workspace Editor permissions or greater.
- A Microsoft SQL Server database, schema, and table or view.
- Credentials that authorize read access to Microsoft SQL Server. You can use SQL Server Authentication or Entra ID.
Connect to a Microsoft SQL Server Data Source and add a Data Asset
-
In GX Cloud, select the relevant Workspace and then click Data Assets > New Data Asset > New Data Source > Microsoft SQL Server.
-
Enter a meaningful name for the Data Source in the Data Source name field.
-
Supply your connection details.
- Host: Enter the environment where the Microsoft SQL Server engine is installed and running, for example
sql-server.example.comfor a self-hosted Microsoft SQL Server instance. - Database: Enter the name of the Microsoft SQL Server database where the data you want to validate is stored.
- Schema: Enter the name of the Microsoft SQL Server schema where the data you want to validate is stored.
- Port: Enter the port configured for your Microsoft SQL Server instance, typically
1433. - Encrypt: Select a TLS encryption protocol:
- Optional: Establish an encrypted connection if your Microsoft SQL Server instance is configured to force encryption. Otherwise, establish an unencrypted connection.
- Mandatory: Require the connection to be encrypted. Validate the server certificate unless Trust server certificate is checked. Connection will fail if your Microsoft SQL Server instance does not support TLS. If Trust server certificate is not checked, connection will fail if the certificate is not valid and publicly trusted.
- Strict: Use TDS 8.0 where encryption begins before the TLS handshake. Require the connection to be encrypted and validate the server certificate. Connection will fail if your Microsoft SQL Server instance does not support TLS or the certificate is not valid and publicly trusted.
- Trust server certificate: If you set Encrypt to Mandatory, you can select Trust server certificate to enable using an encrypted connection without a valid publicly trusted server certificate. This lets you, for example, use a self-signed certificate with an encrypted connection.
- Driver: If you are using an agent-enabled deployment of GX Cloud, enter the name of the ODBC driver your environment uses to connect to Microsoft SQL Server. Common values include the following:
ODBC Driver 18 for SQL ServerODBC Driver 17 for SQL ServerFreeTDS
- Authentication method: Select SQL Server or Entra ID Service Principal. Depending on your selection, the following credential fields will be required:
- SQL Server:
- Username: Enter the username you use to access Microsoft SQL Server.
- Password: Enter the password you use to access Microsoft SQL Server.
- Entra ID Service Principal:
- Tenant ID: Enter the unique identifier for your organization's instance of Microsoft Entra ID.
- Client ID: Enter the application ID for your new or existing app registration.
- Client secret: Enter a new secret key from your app registration.
- SQL Server:
- Host: Enter the environment where the Microsoft SQL Server engine is installed and running, for example
-
Click Connect.
-
Select one or more tables or views to import as Data Assets.
-
Click Add x Asset(s).
-
Decide which Anomaly Detection options you want to enable. By default, GX Cloud adds warning-severity Expectations to detect Schema and Volume anomalies. You can de-select recommendations you’d like to opt out of. You can choose to generate Expectations to detect Completeness anomalies.
-
Click Start monitoring or Finish.
Prerequisites
- A GX Cloud account with Workspace Editor permissions or greater.
- A Microsoft SQL Server database, schema, and table or view.
- Credentials that authorize read access to Microsoft SQL Server. You can use SQL Server Authentication or Entra ID.
- Python version 3.10 to 3.13.
- Recommended. A Python virtual environment.
Install GX Cloud
Run the following terminal command to install the GX Cloud library with support for Microsoft SQL Server dependencies:
pip install 'great_expectations[sql-server]'
Get your credentials
You'll need your user access token, organization ID, and workspace ID to set your environment variables. Don't commit your access token to your version control software.
- In GX Cloud, click Tokens.
- In the User access tokens pane, click Create user access token.
- In the Token name field, enter a name for the token that will help you quickly identify it.
- Click Create.
- Copy and then paste the user access token into a temporary file. The token can't be retrieved after you close the dialog.
- Click Close.
- Copy the value in the Organization ID field into the temporary file with your user access token.
- In the Workspace ID pane, find the relevant Workspace name, then copy the associated ID into the temporary file with your other credentials and save the file.
GX recommends deleting the temporary file after you set the environment variables.
Set your credentials as environment variables
Environment variables securely store your GX Cloud and Microsoft SQL Server credentials.
-
Save your GX Cloud and Microsoft SQL Server credentials as environment variables by entering
export ENV_VAR_NAME=env_var_valuein the terminal or adding the command to your~/.bashrcor~/.zshrcfile. For example:Terminal inputexport GX_CLOUD_ACCESS_TOKEN=<user_access_token>
export GX_CLOUD_ORGANIZATION_ID=<organization_id>
export GX_CLOUD_WORKSPACE_ID=<workspace_id>
export SQL_SERVER_USER=<username>
export SQL_SERVER_PASSWORD=<password> -
Optional. If you created a temporary file to record your credentials, delete it.
Connect a Microsoft SQL Server Data Source and add a Data Asset
- Instructions
- Sample code
-
Run the following Python code to create a Data Context object:
Pythonimport great_expectations as gx
context = gx.get_context(mode="cloud")The Data Context will detect the previously set environment variables and connect to your GX Cloud account.
-
Define the Data Source's parameters.
The following information is required when you create a Microsoft SQL Server Data Source:
name: A descriptive name used to reference the Data Source. This should be unique within your workspace.host: The environment where the Microsoft SQL Server engine is installed and running, for examplesql-server.example.comfor a self-hosted Microsoft SQL Server instance.database: The name of the Microsoft SQL Server database where the data you want to validate is stored.schema: The name of the Microsoft SQL Server schema where the data you want to validate is stored.port: The port configured for your Microsoft SQL Server instance, typically1433.encrypt: The TLS encryption protocol to use. Accepts the following:Optional: Establish an encrypted connection if your Microsoft SQL Server instance is configured to force encryption. Otherwise, establish an unencrypted connection.Mandatory: Require the connection to be encrypted. Validate the server certificate unlesstrust_server_certificateis set toTrue. Connection will fail if your Microsoft SQL Server instance does not support TLS. Iftrust_server_certificateis set toFalse, connection will fail if the certificate is not valid and publicly trusted.Strict: Use TDS 8.0 where encryption begins before the TLS handshake. Require the connection to be encrypted and validate the server certificate. Connection will fail if your Microsoft SQL Server instance does not support TLS or the certificate is not valid and publicly trusted.
trust_server_certificate: If you setencrypttoMandatory, you can settrust_server_certificatetoTrueto enable using an encrypted connection without a valid publicly trusted server certificate (default isFalse). This lets you, for example, use a self-signed certificate with an encrypted connection.driver: If you are using an agent-enabled deployment of GX Cloud, provide the name of the ODBC driver your environment uses to connect to Microsoft SQL Server. Common values include the following:ODBC Driver 18 for SQL ServerODBC Driver 17 for SQL ServerFreeTDS
authentication: AcceptsSQL ServerorEntra ID. Depending on your selection, the following credential parameters will be required:- SQL Server:
username: The username you use to access Microsoft SQL Server.password: The password you use to access Microsoft SQL Server.
- Entra ID:
tenant_id: The unique identifier for your organization's instance of Microsoft Entra ID.client_id: The application ID for your new or existing app registration.client_secret: A new secret key from your app registration.
- SQL Server:
Replace the variable values with your own and run the following Python code. In this example, the strings
"${SQL_SERVER_USER}"and"${SQL_SERVER_PASSWORD}"will be replaced with the values of the environment variables you set earlier:Pythondatasource_name = "Self-hosted SQL Server"
host = "sql-server.example.com"
database = "production"
schema = "sales"
port = 1433
encrypt = "Mandatory"
authentication = "SQL Server"
username = "${SQL_SERVER_USER}"
password = "${SQL_SERVER_PASSWORD}" -
Add a Microsoft SQL Server Data Source to your Data Context by executing the following code:
Pythondata_source = context.data_sources.add_sql_server(
name=datasource_name,
host=host,
database=database,
schema=schema,
port=port,
encrypt=encrypt,
authentication=authentication,
username=username,
password=password,
) -
Decide whether you want to validate the records in a single table or the records returned by a SQL query.
- To validate the records in a single table, you will create a Table Data Asset.
- To validate the records returned by a SQL query, you will create a Query Data Asset. Note that Query Data Assets have some limitations compared to Table Data Assets.
- Table Data Asset
- Query Data Asset
-
Define your Table Data Asset's parameters.
The following information is required when you create a Microsoft SQL Server Table Data Asset:
name: A name by which you can reference the Data Asset in the future. This should be unique within the Data Source.table_name: The name of the SQL table that the Table Data Asset will retrieve records from.
Pythondata_asset_name = "my_table_asset"
table_name = "my_table" -
Add the Data Asset to your Data Source. A new Data Asset is created and added to a Data Source simultaneously.
Pythontable_data_asset = data_source.add_table_asset(
table_name=table_name, name=data_asset_name
)
-
Define your Query Data Asset's parameters.
The following information is required when you create a Microsoft SQL Server Query Data Asset:
name: A name by which you can reference the Data Asset in the future. This should be unique within the Data Source.query: The SQL query that the Data Asset will retrieve records from.
Pythondata_asset_name = "my_query_asset"
query = "SELECT * from my_table WHERE column1 = 'value' AND column2 > 20" -
Add the Data Asset to your Data Source. A new Data Asset is created and added to a Data Source simultaneously.
Pythonquery_data_asset = data_source.add_query_asset(query=query, name=data_asset_name)
import great_expectations as gx
context = gx.get_context(mode="cloud")
# Add a Microsoft SQL Server Data Source
datasource_name = "Self-hosted SQL Server"
host = "sql-server.example.com"
database = "production"
schema = "sales"
port = 1433
encrypt = "Mandatory"
authentication = "SQL Server"
username = "${SQL_SERVER_USER}"
password = "${SQL_SERVER_PASSWORD}"
data_source = context.data_sources.add_sql_server(
name=datasource_name,
host=host,
database=database,
schema=schema,
port=port,
encrypt=encrypt,
authentication=authentication,
username=username,
password=password,
)
# Add a Table Data Asset
data_asset_name = "my_table_asset"
table_name = "my_table"
table_data_asset = data_source.add_table_asset(
table_name=table_name, name=data_asset_name
)
# Get the updated Data Source
data_source = context.data_sources.get(datasource_name)
# Add a Query Data Asset
data_asset_name = "my_query_asset"
query = "SELECT * from my_table WHERE column1 = 'value' AND column2 > 20"
query_data_asset = data_source.add_query_asset(query=query, name=data_asset_name)