Connect GX Cloud to Microsoft Fabric
To connect GX Cloud to data stored in Microsoft Fabric, you can use the GX Cloud UI or the GX Cloud API.
- UI
- API
Prerequisites
- A GX Cloud account with Workspace Editor permissions or greater.
- A Microsoft Fabric database, schema, and table or view.
- Entra ID credentials that authorize read access to Microsoft Fabric.
Connect to a Microsoft Fabric Data Source and add a Data Asset
-
In GX Cloud, select the relevant Workspace and then click Data Assets > New Data Asset > New Data Source > Microsoft Fabric.
-
Enter a meaningful name for the Data Source in the Data Source name field.
-
Supply your connection details.
- Host: Enter your Microsoft Fabric workload endpoint, for example
myworkspace.datawarehouse.fabric.microsoft.comorabc123.database.fabric.microsoft.com. - Database: Enter the name of the Microsoft Fabric database where the data you want to validate is stored.
- Schema: Enter the name of the Microsoft Fabric schema where the data you want to validate is stored.
- Port: Enter the port configured for your Microsoft Fabric instance, typically
1433. - Encrypt: Select a TLS encryption protocol:
- Optional: Establish an encrypted connection if your Microsoft Fabric instance is configured to force encryption. Otherwise, establish an unencrypted connection.
- Mandatory: Require the connection to be encrypted. Validate the server certificate unless Trust server certificate is checked. Connection will fail if your Microsoft Fabric instance does not support TLS. If Trust server certificate is not checked, connection will fail if the certificate is not valid and publicly trusted.
- Strict: Use TDS 8.0 where encryption begins before the TLS handshake. Require the connection to be encrypted and validate the server certificate. Connection will fail if your Microsoft Fabric instance does not support TLS or the certificate is not valid and publicly trusted.
- Trust server certificate: If you set Encrypt to Mandatory, you can select Trust server certificate to enable using an encrypted connection without a valid publicly trusted server certificate. This lets you, for example, use a self-signed certificate with an encrypted connection.
- Driver: If you are using an agent-enabled deployment of GX Cloud, enter the name of the ODBC driver your environment uses to connect to Microsoft Fabric. Common values include the following:
ODBC Driver 18 for SQL ServerODBC Driver 17 for SQL ServerFreeTDS
- Tenant ID: Enter the unique identifier for your organization's instance of Microsoft Entra ID.
- Client ID: Enter the application ID for your new or existing app registration.
- Client secret: Enter a new secret key from your app registration.
- Host: Enter your Microsoft Fabric workload endpoint, for example
-
Click Connect.
-
Select one or more tables or views to import as Data Assets.
-
Click Add x Asset(s).
-
Decide which Anomaly Detection options you want to enable. By default, GX Cloud adds warning-severity Expectations to detect Schema and Volume anomalies. You can de-select recommendations you’d like to opt out of. You can choose to generate Expectations to detect Completeness anomalies.
-
Click Start monitoring or Finish.
Prerequisites
- A GX Cloud account with Workspace Editor permissions or greater.
- A Microsoft Fabric database, schema, and table or view.
- Entra ID credentials that authorize read access to Microsoft Fabric.
- Python version 3.10 to 3.13.
- Recommended. A Python virtual environment.
Install GX Cloud
Run the following terminal command to install the GX Cloud library with support for Microsoft Fabric dependencies:
pip install 'great_expectations[fabric]'
Get your credentials
You'll need your user access token, organization ID, and workspace ID to set your environment variables. Don't commit your access token to your version control software.
- In GX Cloud, click Tokens.
- In the User access tokens pane, click Create user access token.
- In the Token name field, enter a name for the token that will help you quickly identify it.
- Click Create.
- Copy and then paste the user access token into a temporary file. The token can't be retrieved after you close the dialog.
- Click Close.
- Copy the value in the Organization ID field into the temporary file with your user access token.
- In the Workspace ID pane, find the relevant Workspace name, then copy the associated ID into the temporary file with your other credentials and save the file.
GX recommends deleting the temporary file after you set the environment variables.
Set your credentials as environment variables
Environment variables securely store your GX Cloud and Microsoft Fabric credentials.
-
Save your GX Cloud and Microsoft Fabric credentials as environment variables by entering
export ENV_VAR_NAME=env_var_valuein the terminal or adding the command to your~/.bashrcor~/.zshrcfile. For example:Terminal inputexport GX_CLOUD_ACCESS_TOKEN=<user_access_token>
export GX_CLOUD_ORGANIZATION_ID=<organization_id>
export GX_CLOUD_WORKSPACE_ID=<workspace_id>
export ENTRA_ID_TENANT=<tenant_id>
export ENTRA_ID_CLIENT_ID=<client_id>
export ENTRA_ID_CLIENT_SECRET=<client_secret> -
Optional. If you created a temporary file to record your credentials, delete it.
Connect a Microsoft Fabric Data Source and add a Data Asset
- Instructions
- Sample code
-
Run the following Python code to create a Data Context object:
Pythonimport great_expectations as gx
context = gx.get_context(mode="cloud")The Data Context will detect the previously set environment variables and connect to your GX Cloud account.
-
Define the Data Source's parameters.
The following information is required when you create a Microsoft Fabric Data Source:
name: A descriptive name used to reference the Data Source. This should be unique within your workspace.host: Your Microsoft Fabric workload endpoint, for examplemyworkspace.datawarehouse.fabric.microsoft.comorabc123.database.fabric.microsoft.com.database: The name of the Microsoft Fabric database where the data you want to validate is stored.schema: The name of the Microsoft Fabric schema where the data you want to validate is stored.port: The port configured for your Microsoft Fabric instance, typically1433.encrypt: The TLS encryption protocol to use. Accepts the following:Optional: Establish an encrypted connection if your Microsoft Fabric instance is configured to force encryption. Otherwise, establish an unencrypted connection.Mandatory: Require the connection to be encrypted. Validate the server certificate unlesstrust_server_certificateis set toTrue. Connection will fail if your Microsoft Fabric instance does not support TLS. Iftrust_server_certificateis set toFalse, connection will fail if the certificate is not valid and publicly trusted.Strict: Use TDS 8.0 where encryption begins before the TLS handshake. Require the connection to be encrypted and validate the server certificate. Connection will fail if your Microsoft Fabric instance does not support TLS or the certificate is not valid and publicly trusted.
trust_server_certificate: If you setencrypttoMandatory, you can settrust_server_certificatetoTrueto enable using an encrypted connection without a valid publicly trusted server certificate (default isFalse). This lets you, for example, use a self-signed certificate with an encrypted connection.driver: If you are using an agent-enabled deployment of GX Cloud, provide the name of the ODBC driver your environment uses to connect to Microsoft Fabric. Common values include the following:ODBC Driver 18 for SQL ServerODBC Driver 17 for SQL ServerFreeTDS
tenant_id: The unique identifier for your organization's instance of Microsoft Entra ID.client_id: The application ID for your new or existing app registration.client_secret: A new secret key from your app registration.
Replace the variable values with your own and run the following Python code. In this example, the
"${ENTRA_ID_*}"strings will be replaced with the values of the environment variables you set earlier:Pythondatasource_name = "Fabric"
host = "myworkspace.datawarehouse.fabric.microsoft.com"
database = "production"
schema = "sales"
port = 1433
encrypt = "Mandatory"
tenant_id = "${ENTRA_ID_TENANT}"
client_id = "${ENTRA_ID_CLIENT_ID}"
client_secret = "${ENTRA_ID_CLIENT_SECRET}" -
Add a Microsoft Fabric Data Source to your Data Context by executing the following code:
Pythondata_source = context.data_sources.add_fabric(
name=datasource_name,
host=host,
database=database,
schema=schema,
port=port,
encrypt=encrypt,
tenant_id=tenant_id,
client_id=client_id,
client_secret=client_secret,
) -
Decide whether you want to validate the records in a single table or the records returned by a SQL query.
- To validate the records in a single table, you will create a Table Data Asset.
- To validate the records returned by a SQL query, you will create a Query Data Asset. Note that Query Data Assets have some limitations compared to Table Data Assets.
- Table Data Asset
- Query Data Asset
-
Define your Table Data Asset's parameters.
The following information is required when you create a Microsoft Fabric Table Data Asset:
name: A name by which you can reference the Data Asset in the future. This should be unique within the Data Source.table_name: The name of the SQL table that the Table Data Asset will retrieve records from.
Pythondata_asset_name = "my_table_asset"
table_name = "my_table" -
Add the Data Asset to your Data Source. A new Data Asset is created and added to a Data Source simultaneously.
Pythontable_data_asset = data_source.add_table_asset(
table_name=table_name, name=data_asset_name
)
-
Define your Query Data Asset's parameters.
The following information is required when you create a Microsoft Fabric Query Data Asset:
name: A name by which you can reference the Data Asset in the future. This should be unique within the Data Source.query: The SQL query that the Data Asset will retrieve records from.
Pythondata_asset_name = "my_query_asset"
query = "SELECT * from my_table WHERE column1 = 'value' AND column2 > 20" -
Add the Data Asset to your Data Source. A new Data Asset is created and added to a Data Source simultaneously.
Pythonquery_data_asset = data_source.add_query_asset(query=query, name=data_asset_name)
import great_expectations as gx
context = gx.get_context(mode="cloud")
# Add a Microsoft Fabric Data Source
datasource_name = "Fabric"
host = "myworkspace.datawarehouse.fabric.microsoft.com"
database = "production"
schema = "sales"
port = 1433
encrypt = "Mandatory"
tenant_id = "${ENTRA_ID_TENANT}"
client_id = "${ENTRA_ID_CLIENT_ID}"
client_secret = "${ENTRA_ID_CLIENT_SECRET}"
data_source = context.data_sources.add_fabric(
name=datasource_name,
host=host,
database=database,
schema=schema,
port=port,
encrypt=encrypt,
tenant_id=tenant_id,
client_id=client_id,
client_secret=client_secret,
)
# Add a Table Data Asset
data_asset_name = "my_table_asset"
table_name = "my_table"
table_data_asset = data_source.add_table_asset(
table_name=table_name, name=data_asset_name
)
# Get the updated Data Source
data_source = context.data_sources.get(datasource_name)
# Add a Query Data Asset
data_asset_name = "my_query_asset"
query = "SELECT * from my_table WHERE column1 = 'value' AND column2 > 20"
query_data_asset = data_source.add_query_asset(query=query, name=data_asset_name)