Version: 1.8.0

Configure Data Docs

Data Docs translate Expectations, Validation Results, and other metadata into human-readable documentation that is saved as static web pages. Automatically compiling your data documentation from your data tests in the form of Data Docs keeps your documentation current. This guide covers how to configure Data Docs.

Prerequisites:

Procedure

  1. Define a configuration dictionary for your new Data Docs site.

    GX writes Data Docs sites to the directory specified by the base_directory key of the configuration dictionary. Configuring other keys of the dictionary is not supported, and they may be removed in a future release.

    A local or networked filesystem Data Docs site requires the following store_backend information:

    • base_directory: A path to the folder where the static sites should be created. This can be an absolute path, or a path relative to the root folder of the Data Context.
    • class_name: This value must be TupleFilesystemStoreBackend, and is not user-configurable.
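The absolute-versus-relative rule for base_directory can be sketched with the standard library alone. This is only an illustration of the rule as described above, not GX's own resolution code: the helper name resolve_base_directory and the example folders are hypothetical, and PurePosixPath is used so the result is the same on every platform.

```python
from pathlib import PurePosixPath


def resolve_base_directory(context_root: str, base_directory: str) -> PurePosixPath:
    """Return the folder a filesystem Data Docs site would be written to.

    Assumption: a relative base_directory is interpreted against the root
    folder of the Data Context, and an absolute one is used unchanged.
    """
    base = PurePosixPath(base_directory)
    if base.is_absolute():
        return base
    return PurePosixPath(context_root) / base


# Hypothetical project layout, for illustration only.
relative_site = resolve_base_directory(
    "/projects/demo/gx", "uncommitted/data_docs/local_site/"
)
absolute_site = resolve_base_directory("/projects/demo/gx", "/srv/shared/data_docs/")
print(relative_site)
print(absolute_site)
```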

    To define a Data Docs site configuration for a local or networked filesystem environment, update the value of base_directory in the following code and execute it:

    Python
    # Default path, relative to the root folder of the Data Context; change as required.
    base_directory = "uncommitted/data_docs/local_site/"
    site_config = {
        "class_name": "SiteBuilder",
        "site_index_builder": {"class_name": "DefaultSiteIndexBuilder"},
        "store_backend": {
            "class_name": "TupleFilesystemStoreBackend",
            "base_directory": base_directory,
        },
    }
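Before handing a configuration to the Data Context, it can help to confirm that it carries the two store_backend values listed in this step. The check below is a minimal sketch under that assumption; check_filesystem_site_config is a hypothetical helper, not part of the GX API, and it looks only at the keys this guide names.

```python
REQUIRED_BACKEND_CLASS = "TupleFilesystemStoreBackend"


def check_filesystem_site_config(site_config: dict) -> list:
    """Return a list of problems found in a filesystem site_config (empty if none)."""
    backend = site_config.get("store_backend")
    if not isinstance(backend, dict):
        return ["store_backend is missing or is not a dictionary"]

    problems = []
    if backend.get("class_name") != REQUIRED_BACKEND_CLASS:
        problems.append(f"store_backend.class_name must be {REQUIRED_BACKEND_CLASS}")
    if not backend.get("base_directory"):
        problems.append("store_backend.base_directory must be a non-empty path")
    return problems


# Same shape as the configuration defined in this step.
example_config = {
    "class_name": "SiteBuilder",
    "site_index_builder": {"class_name": "DefaultSiteIndexBuilder"},
    "store_backend": {
        "class_name": "TupleFilesystemStoreBackend",
        "base_directory": "uncommitted/data_docs/local_site/",
    },
}
print(check_filesystem_site_config(example_config))
```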
  2. Add your configuration to your Data Context.

    Every Data Docs site has a unique name within a Data Context. Once you have defined your Data Docs site configuration, add it to the Data Context: update the value of site_name in the following code to something more descriptive, and then execute it:

    Python
    # Assumes `context` is an existing Data Context.
    site_name = "my_data_docs_site"
    context.add_data_docs_site(site_name=site_name, site_config=site_config)
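Because site names must be unique within a Data Context, a script that may run more than once might check for an existing site before adding one. The following is a sketch rather than a prescribed pattern: add_site_if_missing is a hypothetical helper, and it assumes the Data Context provides list_data_docs_sites() and that the returned mapping is keyed by site name.

```python
def add_site_if_missing(context, site_name: str, site_config: dict) -> bool:
    """Add the site only when no site with that name is registered yet.

    Returns True if the site was added and False if the name was already taken.
    Assumption: context.list_data_docs_sites() returns a mapping keyed by site name.
    """
    existing_sites = context.list_data_docs_sites()
    if site_name in existing_sites:
        return False
    context.add_data_docs_site(site_name=site_name, site_config=site_config)
    return True
```

Calling add_site_if_missing(context, site_name, site_config) a second time with the same name would then leave the existing site untouched.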
  3. Optional. Build your Data Docs sites manually.

    You can manually build a Data Docs site by executing the following code:

    Python
    # site_names takes a list of site names.
    context.build_data_docs(site_names=[site_name])
  4. Optional. Automate Data Docs site updates with Checkpoint Actions.

    You can automate the creation and update of Data Docs sites by including the UpdateDataDocsAction in your Checkpoints. This Action will automatically trigger a Data Docs site build whenever the Checkpoint it is included in completes its run() method.

    Python
    import great_expectations as gx

    checkpoint_name = "my_checkpoint"
    validation_definition_name = "my_validation_definition"
    validation_definition = context.validation_definitions.get(validation_definition_name)

    # Rebuild the named Data Docs site each time the Checkpoint runs.
    actions = [
        gx.checkpoint.actions.UpdateDataDocsAction(
            name="update_my_site", site_names=[site_name]
        )
    ]
    checkpoint = context.checkpoints.add(
        gx.Checkpoint(
            name=checkpoint_name,
            validation_definitions=[validation_definition],
            actions=actions,
        )
    )

    result = checkpoint.run()
  5. Optional. View your Data Docs.

    Once your Data Docs have been created, you can view them with:

    Python
    context.open_data_docs()
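On a machine without a browser, it may be useful to confirm that a site was actually built before trying to open it. The sketch below uses only the standard library and assumes that a built filesystem site places an index.html file at the top of its base_directory; find_site_index is a hypothetical helper, not a GX function.

```python
from pathlib import Path
from typing import Optional


def find_site_index(site_directory: str) -> Optional[Path]:
    """Return the path of index.html in a built site folder, or None if absent.

    Assumption: the site's landing page is index.html at the top of the folder.
    """
    index_file = Path(site_directory) / "index.html"
    return index_file if index_file.is_file() else None
```

For example, find_site_index("uncommitted/data_docs/local_site/") would return None until the site has been built at least once, provided the script runs from the root folder of the Data Context.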