Version: 1.0.2

SparkDatasource

class great_expectations.datasource.fluent.SparkDatasource(*, type: Literal['spark'] = 'spark', name: str, id: Optional[uuid.UUID] = None, assets: List[great_expectations.datasource.fluent.spark_datasource.DataFrameAsset] = [], spark_config: Optional[Dict[pydantic.v1.types.StrictStr, Union[pydantic.v1.types.StrictStr, pydantic.v1.types.StrictInt, pydantic.v1.types.StrictFloat, pydantic.v1.types.StrictBool]]] = None, force_reuse_spark_context: bool = True, persist: bool = True)

add_dataframe_asset(name: str, batch_metadata: Optional[BatchMetadata] = None) → DataFrameAsset

Adds a DataFrame data asset to this SparkDatasource object.

Parameters:
  • name – The name of the DataFrame asset. This can be any arbitrary string.

  • dataframe – The Spark DataFrame containing the data for this DataFrame data asset. Although listed here, dataframe is not among the arguments in the method signature above.

  • batch_metadata – An arbitrary user-defined dictionary with string keys that is inherited by any batches created from the asset.

Returns:

The DataFrameAsset that has been added to this datasource.
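
As an illustration of the call described above, here is a minimal sketch that passes only the two arguments shown in the method signature (name and batch_metadata). The helper add_asset_with_metadata, the asset and datasource names, and the metadata values are hypothetical. The example() function assumes a context created with gx.get_context() and a datasource created with context.data_sources.add_spark(), and it needs great_expectations and pyspark installed; it is not called automatically.

```python
from typing import Any, Dict, Optional


def add_asset_with_metadata(
    datasource: Any,
    asset_name: str,
    batch_metadata: Optional[Dict[str, Any]] = None,
) -> Any:
    """Hypothetical helper: register a DataFrame asset on a SparkDatasource.

    Only the arguments from the documented signature are passed:
    name and batch_metadata.
    """
    return datasource.add_dataframe_asset(
        name=asset_name,
        batch_metadata=batch_metadata,
    )


def example() -> Any:
    """Sketch of real usage; requires great_expectations and pyspark."""
    import great_expectations as gx  # imported lazily so the helper stays standalone

    context = gx.get_context()
    # Assumption: the datasource is registered through the context's fluent API.
    datasource = context.data_sources.add_spark(name="my_spark_datasource")
    # The metadata dictionary uses string keys, as the parameter description requires.
    return add_asset_with_metadata(
        datasource,
        "my_dataframe_asset",
        batch_metadata={"pipeline": "nightly"},
    )
```

The returned object is the DataFrameAsset that was added, so it can be kept in a variable for later steps.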