Databricks Pipes Client

class ascii_library.orchestration.pipes.databricks._PipesDatabricksClient

Bases: _PipesBaseCloudClient

Pipes client for Databricks.

  • Parameters:
    • client (WorkspaceClient) – A Databricks WorkspaceClient object.
    • tagging_client (ResourceGroupsTaggingAPIClient) – A Boto3 client for resource tagging.
    • context_injector (Optional[PipesContextInjector]) – A context injector to use to inject context into the Databricks job. Defaults to a configured PipesDbfsContextInjector.
    • message_reader (Optional[PipesMessageReader]) – A message reader to use to read messages from the Databricks job. Defaults to a configured PipesDbfsMessageReader.
    • forward_termination (bool) – If True, the Databricks job will be canceled if the orchestration process is interrupted. Defaults to True.
    • kwargs (Any) – Additional keyword arguments.

env : Mapping[str, str] | None = None – An optional dict of environment variables to pass to the subprocess.

__init__(client, tagging_client, context_injector=None, message_reader=None, forward_termination=True, **kwargs)

  • Parameters:
    • client (WorkspaceClient)
    • tagging_client (ResourceGroupsTaggingAPIClient)
    • context_injector (PipesContextInjector | None)
    • message_reader (PipesMessageReader | None)
    • forward_termination (bool)
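
A minimal construction sketch (an illustration, not taken from the source docs), assuming credentials for both services are already configured in the environment; the variable name pipes_client is a placeholder:

    import boto3
    from databricks.sdk import WorkspaceClient

    from ascii_library.orchestration.pipes.databricks import _PipesDatabricksClient

    # WorkspaceClient picks up DATABRICKS_HOST/DATABRICKS_TOKEN; boto3 uses the
    # standard AWS credential chain. "resourcegroupstaggingapi" is the boto3
    # service name for the Resource Groups Tagging API.
    pipes_client = _PipesDatabricksClient(
        client=WorkspaceClient(),
        tagging_client=boto3.client("resourcegroupstaggingapi"),
        # context_injector and message_reader fall back to the DBFS-based defaults.
        forward_termination=True,
    )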

get_default_message_reader(task)

  • Parameters: task (SubmitTask)
  • Return type: PipesDbfsMessageReader

run(env, context, extras, task, submit_args, local_file_path, dbfs_path, libraries_to_build_and_upload=None)

Synchronously execute a Databricks job with the pipes protocol.

  • Parameters:
    • env (Optional[Mapping[str, str]]) – An optional dict of environment variables to pass.
    • context (OpExecutionContext) – The context from the executing op or asset.
    • extras (Optional[PipesExtras]) – An optional dict of extra parameters to pass to the subprocess.
    • task (databricks.sdk.service.jobs.SubmitTask) – Specification of the Databricks task to run.
    • submit_args (Optional[Mapping[str, str]]) – Additional keyword arguments that will be forwarded as-is to WorkspaceClient.jobs.submit.
    • local_file_path (str) – The local path to the script to be executed.
    • dbfs_path (str) – The corresponding path on DBFS where the script will be uploaded.
    • libraries_to_build_and_upload (Optional[List[str]]) – A list of local Python packages to build and upload as libraries.
  • Returns: Wrapper containing results reported by the external process.
  • Return type: PipesClientCompletedInvocation
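
A hedged end-to-end sketch of calling run from a Dagster asset (not taken from the source docs). It assumes the client is wired up as a Dagster resource under the illustrative name pipes_databricks; the cluster spec, script paths, extras, and package name are all placeholders:

    from dagster import AssetExecutionContext, asset
    from databricks.sdk.service import compute, jobs

    from ascii_library.orchestration.pipes.databricks import _PipesDatabricksClient

    @asset
    def external_databricks_asset(
        context: AssetExecutionContext, pipes_databricks: _PipesDatabricksClient
    ):
        # Hypothetical one-off job cluster that runs the uploaded script.
        task = jobs.SubmitTask(
            task_key="pipes-example",
            new_cluster=compute.ClusterSpec(
                spark_version="15.4.x-scala2.12",
                node_type_id="i3.xlarge",
                num_workers=1,
            ),
            spark_python_task=jobs.SparkPythonTask(
                python_file="dbfs:/pipes/scripts/external_script.py",
            ),
        )
        # Paths and extras are illustrative; see the parameter descriptions above.
        return pipes_databricks.run(
            env={"EXAMPLE_ENV_VAR": "value"},
            context=context,
            extras={"multiplier": 2},
            task=task,
            submit_args={"run_name": "pipes-example"},
            local_file_path="external_script.py",
            dbfs_path="dbfs:/pipes/scripts/external_script.py",
            libraries_to_build_and_upload=["my_local_package"],
        ).get_materialize_result()

In a typical deployment the client would be provided to the asset through Dagster's resource system rather than constructed inline; the resource name used here is only an assumption for the example.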