Connect GX Cloud to Redshift
Prerequisites
-
You have a GX Cloud account with Workspace Editor permissions or greater.
-
A Redshift database, schema, and table or view.
-
If you are using a fully-hosted deployment of GX Cloud, your Redshift cluster or workgroup must be publicly accessible.
-
A Redshift user with the following permissions:
-
USAGE privileges on the schema.
-
SELECT privileges on the table or view.
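The two privileges listed above could be granted with standard Redshift GRANT statements. The following is a minimal sketch that only generates the SQL text; the user name `gx_user`, the schema `analytics`, and the relation names are hypothetical placeholders, and the helper functions are illustrative rather than part of GX Cloud. Have a Redshift administrator review the statements before running them.

```python
def quote_ident(name: str) -> str:
    """Wrap an identifier in double quotes, doubling any embedded quotes."""
    return '"' + name.replace('"', '""') + '"'


def grant_statements(user: str, schema: str, relations: list[str]) -> list[str]:
    """Return GRANT statements giving `user` read access to `relations` in `schema`."""
    # USAGE on the schema lets the user reference objects inside it.
    statements = [
        f"GRANT USAGE ON SCHEMA {quote_ident(schema)} TO {quote_ident(user)};"
    ]
    # SELECT on each table or view lets the user read its rows.
    for relation in relations:
        statements.append(
            f"GRANT SELECT ON {quote_ident(schema)}.{quote_ident(relation)} "
            f"TO {quote_ident(user)};"
        )
    return statements


# Hypothetical names, used only for illustration.
for statement in grant_statements("gx_user", "analytics", ["orders", "orders_view"]):
    print(statement)
```

Generating the statements instead of executing them keeps the sketch free of any database driver dependency and lets you inspect the SQL first.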
-
Connect to a Redshift Data Source and add a Data Asset
-
In GX Cloud, select the relevant Workspace and then click Data Assets > New Data Asset > New Data Source > Redshift.
-
Enter a meaningful name for the Data Source in the Data Source name field.
-
Choose whether to enter your connection details as separate Input parameters or as a consolidated Connection string.
-
Supply your connection details using the method you chose in the previous step. If you created a dedicated Redshift user for your GX Cloud connection, use that user's credentials in your connection details.
-
If you chose Input parameters, complete the following fields:
-
Username: Enter the username you use to access Redshift.
-
Password: Enter the password you use to access Redshift.
-
Host: Enter the host of your Redshift database. The location of this information in Redshift depends on whether you are using a provisioned cluster or Redshift serverless.
- If you're using a provisioned cluster, go to the Provisioned clusters dashboard, select your Cluster, and find the Endpoint. Copy the endpoint up to, but not including, the colon (:). The host has a format of cluster-name.abc123.us-east-2.redshift.amazonaws.com.
- If you're using Redshift serverless, go to the Serverless dashboard, select your Workgroup, and find the Endpoint. Copy the endpoint up to, but not including, the colon (:). The host has a format of workgroup-name.123.us-east-2.redshift-serverless.amazonaws.com.
-
Port: Enter the port of your Redshift database. The location of this information in Redshift depends on whether you are using a provisioned cluster or Redshift serverless.
- If you're using a provisioned cluster, go to the Provisioned clusters dashboard, select your Cluster, and find the Endpoint. Copy the number after the colon (:). This is usually the default of 5439.
- If you're using Redshift serverless, go to the Serverless dashboard, select your Workgroup, and find the Endpoint. Copy the number after the colon (:). This is usually the default of 5439.
-
Database: Enter the name of the Redshift database where the data you want to validate is stored.
-
Schema: Enter the name of the Redshift schema where the data you want to validate is stored.
-
SSL mode: Select how to handle encryption for client connections and server certificate verification. We recommend selecting require since GX Cloud supports SSL connections. See Redshift's SSL docs for more information on the available options.
-
If you chose Connection string, enter it with a format of:
redshift+psycopg2://<USER>:<PASSWORD>@<HOST>:<PORT>/<DATABASE>?sslmode=<SSLMODE>&options=-csearch_path%3D<SCHEMA>
For guidance on replacing each placeholder in the connection string, see the input parameter definitions above.
-
Click Connect.
-
Select one or more tables or views to import as Data Assets.
-
Click Add x Asset(s).
-
Decide which Anomaly Detection options you want to enable. By default, GX Cloud adds warning-severity Expectations to detect Schema and Volume anomalies. Deselect any recommendations you want to opt out of. You can also choose to generate Expectations to detect Completeness anomalies.
-
Click Start monitoring or Finish.
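If you chose the Connection string method in the steps above, the string can be assembled from the same values as the input parameters. The following is a minimal sketch, not an official GX Cloud utility: the function names and all sample values are hypothetical, and it assumes the endpoint copied from the Redshift console has the form host:port, optionally followed by /database. Percent-encoding the user, password, database, and schema is an assumption based on common URL conventions, intended to keep characters such as @ or / from breaking the string.

```python
from urllib.parse import quote

DEFAULT_PORT = 5439  # the usual Redshift default noted in the Port step


def split_endpoint(endpoint: str) -> tuple[str, int]:
    """Split 'host:port' or 'host:port/database' into (host, port)."""
    host, _, rest = endpoint.partition(":")
    # Keep only the digits before any '/database' suffix.
    port_text = rest.split("/", 1)[0]
    port = int(port_text) if port_text else DEFAULT_PORT
    return host, port


def build_connection_string(
    user: str,
    password: str,
    host: str,
    port: int,
    database: str,
    schema: str,
    sslmode: str = "require",
) -> str:
    """Fill in the placeholders of the documented connection string format."""
    def enc(value: str) -> str:
        # Percent-encode every reserved character, including '/' and '@'.
        return quote(value, safe="")

    return (
        f"redshift+psycopg2://{enc(user)}:{enc(password)}"
        f"@{host}:{port}/{enc(database)}"
        f"?sslmode={sslmode}&options=-csearch_path%3D{enc(schema)}"
    )


# Hypothetical values, used only for illustration.
host, port = split_endpoint(
    "cluster-name.abc123.us-east-2.redshift.amazonaws.com:5439/dev"
)
print(build_connection_string("gx_user", "p@ss/word", host, port, "dev", "public"))
```

The `sslmode` default of `require` mirrors the recommendation in the SSL mode step; pass a different value if your setup calls for one of the other options.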