Spark Pipes Resource
class ascii_library.orchestration.pipes.spark_pipes.SparkPipesResource
Bases: ConfigurableResource
Generic configurable spark-pipes resource.
Executes jobs in one of several modes: locally for quick development,
or on scalable cloud backends such as Databricks and EMR.
Pipelines may optionally apply sampling to speed up end-to-end runs.
Databricks authentication (environment variables)
DATABRICKS_HOST
DATABRICKS_CLIENT_ID
DATABRICKS_CLIENT_SECRET
EMR credentials (environment variables)
ASCII_AWS_ACCESS_KEY_ID
ASCII_AWS_SECRET_ACCESS_KEY
- Variables:
- engine (Engine) – The default execution engine to use.
- execution_mode (ExecutionMode) – The execution mode for the pipeline (e.g. debug, prod).
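The environment variables listed above can be checked before a run is launched, so that missing credentials surface early. The following is a minimal sketch rather than part of SparkPipesResource: the helper name `missing_env_vars`, the backend keys `"databricks"` and `"emr"`, and the choice to treat any other backend as needing no variables are all assumptions made for illustration. Only the variable names come from this page.

```python
import os

# Variable names are copied from the lists above. The dictionary keys and
# the grouping by backend are assumptions made for this sketch.
REQUIRED_ENV_VARS = {
    "databricks": (
        "DATABRICKS_HOST",
        "DATABRICKS_CLIENT_ID",
        "DATABRICKS_CLIENT_SECRET",
    ),
    "emr": (
        "ASCII_AWS_ACCESS_KEY_ID",
        "ASCII_AWS_SECRET_ACCESS_KEY",
    ),
}


def missing_env_vars(backend, environ=None):
    """Return the required variables for `backend` that are unset or empty.

    Hypothetical helper: backends without an entry in REQUIRED_ENV_VARS
    are assumed to need no variables.
    """
    env = os.environ if environ is None else environ
    required = REQUIRED_ENV_VARS.get(backend, ())
    return [name for name in required if not env.get(name)]


if __name__ == "__main__":
    for backend in REQUIRED_ENV_VARS:
        missing = missing_env_vars(backend)
        status = "ok" if not missing else "missing: " + ", ".join(missing)
        print(f"{backend}: {status}")
```

Passing a plain dictionary as `environ` lets the check be exercised in tests without touching the real process environment.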