How to convert an Ephemeral Data Context to a Filesystem Data Context
An Ephemeral Data Context is a temporary, in-memory Data Context that will not persist beyond the current Python session. However, if you decide to save the contents of an Ephemeral Data Context for future use, you can do so by converting it to a Filesystem Data Context.
- A working installation of Great Expectations
- An Ephemeral Data Context instance
If you still need to create a Data Context...
The get_context() method will return an Ephemeral Data Context if your system is not set up to work with GX Cloud and a Filesystem Data Context cannot be found.
You can also instantiate an Ephemeral Data Context explicitly, for those occasions when your system is set up to work with GX Cloud or you already have a previously initialized Filesystem Data Context.
If you aren't certain that your Data Context is Ephemeral...
You can check whether you are working with an Ephemeral Data Context with the following code (in this example, we are assuming your Data Context is stored in the variable context):
from great_expectations.data_context import EphemeralDataContext

# `context` is assumed to be an existing Data Context instance.
if isinstance(context, EphemeralDataContext):
    print("It's Ephemeral!")
1. Verify that your current working directory does not already contain a GX Filesystem Data Context
The method for converting an Ephemeral Data Context to a Filesystem Data Context initializes the new Filesystem Data Context in the current working directory of the Python process that is being executed. If a Filesystem Data Context already exists at that location, the process will fail.
You can determine if your current working directory already has a Filesystem Data Context by looking for a great_expectations.yml file. The presence of that file indicates that a Filesystem Data Context has already been initialized in the corresponding directory.
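The check described above can be sketched with the standard library alone. This is a minimal illustration, not part of GX: the helper name find_context_configs is hypothetical, and looking one folder level below the working directory is an assumption, made because the configuration file may sit in a project subfolder rather than directly in the working directory.

```python
from pathlib import Path

CONFIG_NAME = "great_expectations.yml"


def find_context_configs(directory=None):
    """Return paths to any great_expectations.yml in `directory` or one level below it.

    Hypothetical helper: the one-level search is an assumption, not GX behavior.
    """
    root = Path(directory) if directory is not None else Path.cwd()
    # Candidate locations: the directory itself, then each immediate subfolder.
    candidates = [root / CONFIG_NAME]
    candidates.extend(
        child / CONFIG_NAME for child in sorted(root.iterdir()) if child.is_dir()
    )
    return [path for path in candidates if path.is_file()]


# Check the current working directory of this Python process.
found = find_context_configs()
if found:
    print("Existing Filesystem Data Context config:", ", ".join(str(p) for p in found))
else:
    print("No great_expectations.yml found in or just below the working directory.")
```

An empty result suggests the conversion in the next step has a clean location to write to. A non-empty result means you should change the working directory or move the existing project before converting.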
2. Convert the Ephemeral Data Context into a Filesystem Data Context
Converting an Ephemeral Data Context into a Filesystem Data Context can be done with one line of code:
context = context.convert_to_file_context()
The convert_to_file_context() method does not change the Ephemeral Data Context itself. Rather, it initializes a new Filesystem Data Context with the contents of the Ephemeral Data Context and then returns an instance of the new Filesystem Data Context. If you do not replace the Ephemeral Data Context instance with the Filesystem Data Context instance, you can continue using the Ephemeral Data Context.
If you do this, it is important to note that changes to the Ephemeral Data Context will not be reflected in the Filesystem Data Context. Moreover, convert_to_file_context() does not support merge operations. This means you will not be able to save any additional changes you make to the content of the Ephemeral Data Context after the conversion. Neither will you be able to use convert_to_file_context() to replace the Filesystem Data Context you previously created: convert_to_file_context() will fail if a Filesystem Data Context already exists in the current working directory.
For these reasons, it is strongly advised that once you have converted your Ephemeral Data Context to a Filesystem Data Context you cease working with the Ephemeral Data Context instance and begin working with the Filesystem Data Context instance instead.
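The advice above can be folded into a small guard that refuses to convert when a configuration file is already present and otherwise hands back the new instance for rebinding. This is a sketch under assumptions: convert_if_safe is a hypothetical helper name, the subfolder check is an assumption about where great_expectations.yml may live, and the only GX call used is convert_to_file_context(), as described in this guide.

```python
from pathlib import Path


def convert_if_safe(context):
    """Convert `context` to a Filesystem Data Context unless a config already exists.

    Hypothetical helper: only convert_to_file_context() comes from GX.
    """
    cwd = Path.cwd()
    # Look in the working directory and one level below it (an assumption).
    existing = list(cwd.glob("great_expectations.yml"))
    existing += list(cwd.glob("*/great_expectations.yml"))
    if existing:
        raise FileExistsError(
            f"A Filesystem Data Context config already exists: {existing[0]}"
        )
    # Returns the new Filesystem Data Context; the original is left unchanged.
    return context.convert_to_file_context()


# Usage, assuming `context` is an existing Ephemeral Data Context:
# context = convert_if_safe(context)
```

Rebinding the same variable name, as in the usage comment, drops the reference to the Ephemeral Data Context so that later changes go to the Filesystem Data Context.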
Customizing configurations in a Data Context
While some source data systems provide their own means of configuring credentials through environment variables, you can also configure GX to populate credentials from either a YAML file or a secret manager. For more information, please see:
Configuring Expectation Stores
Configuring Validation Results Stores
Configuring Metric Stores
Configuring Data Docs
Connecting GX to source data systems
Connecting GX to filesystem source data
- How to quickly connect to a single file using Pandas
- How to connect to one or more files using Pandas
- How to connect to one or more files using Spark
Google Cloud Storage
Azure Blob Storage
- How to connect to data on Azure Blob Storage using Pandas
- How to connect to data on Azure Blob Storage using Spark
Amazon Web Services
Connecting GX to in-memory source data
Connecting GX to SQL source data
General SQL Datasources
Specific SQL dialects