Instantiate a Data Context on an EMR Spark cluster
Use the information provided here to instantiate a Data Context on an EMR Spark cluster without a full configuration directory.
Prerequisites
Install Great Expectations on your EMR Spark cluster
- Copy this code snippet into a cell in your EMR Spark notebook and then run it:
Python
sc.install_pypi_package("great_expectations")
Configure a Data Context in code
- Create an in-code Data Context. See Instantiate an Ephemeral Data Context.
- Copy the Python code at the end of How to instantiate an Ephemeral Data Context into a cell in your EMR Spark notebook, or use the other examples to customize your configuration. The code instantiates and configures a Data Context for an EMR Spark cluster.
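For orientation, the following is a minimal sketch of what an in-code (ephemeral) Data Context can look like. It assumes a Great Expectations release that exposes `EphemeralDataContext`, `DataContextConfig`, and `InMemoryStoreBackendDefaults` at the import paths shown; these paths may differ in your installed version, so treat the referenced guide as the authoritative source. The helper name `make_ephemeral_context` is hypothetical and used only for illustration.

```python
# Sketch only: builds an in-memory (ephemeral) Data Context.
# The import paths below are an assumption about the installed release.
try:
    from great_expectations.data_context import EphemeralDataContext
    from great_expectations.data_context.types.base import (
        DataContextConfig,
        InMemoryStoreBackendDefaults,
    )
except ImportError:
    # The package is not available yet (the install step has not run).
    EphemeralDataContext = None


def make_ephemeral_context():
    """Hypothetical helper: return an in-memory Data Context, or None if
    great_expectations cannot be imported."""
    if EphemeralDataContext is None:
        print("great_expectations is not installed; run the install step first.")
        return None
    # Keep all stores in memory so no configuration directory is needed.
    project_config = DataContextConfig(
        store_backend_defaults=InMemoryStoreBackendDefaults()
    )
    return EphemeralDataContext(project_config=project_config)


context = make_ephemeral_context()
```

Because every store is held in memory, nothing is written to the cluster's file system, and the configuration is lost when the notebook session ends.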
Test your configuration
- Execute the cell with the snippet you copied in the previous step.
- Copy the following command into a new cell in your EMR Spark notebook, then run it to verify that no error is returned:
Python
context.list_datasources()
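If you prefer an explicit pass or fail result over inspecting the cell output, the check above can be wrapped as in the sketch below. The function name `verify_context` is hypothetical; the only assumption about the Data Context is that it provides the `list_datasources()` method used in the command above.

```python
def verify_context(context):
    """Hypothetical helper: return True if context.list_datasources()
    runs without raising, and False otherwise."""
    try:
        datasources = context.list_datasources()
    except Exception as exc:
        # Any exception here means the Data Context is not usable as configured.
        print(f"Data Context check failed: {exc!r}")
        return False
    print(f"Data Context OK; {len(datasources)} Datasource(s) configured.")
    return True
```

In the notebook you would call `verify_context(context)` after the cell that creates `context`. A newly created Data Context with no Datasources added would typically report zero Datasources.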