
Stack CLI (legacy)

important

This documentation has been retired and might not be updated.

This information applies to legacy Databricks CLI versions 0.18 and below. Databricks recommends that you use Databricks CLI version 0.205 or above instead. See What is the Databricks CLI?. To find your version of the Databricks CLI, run databricks -v.

To migrate from Databricks CLI version 0.18 or below to Databricks CLI version 0.205 or above, see Databricks CLI migration.

Databricks CLI versions 0.205 and above do not support the stack CLI. Databricks recommends that you use the Databricks Terraform provider instead.

note

The stack CLI requires Databricks CLI 0.8.3 or above.

The stack CLI provides a way to manage a stack of Databricks resources, such as jobs, notebooks, and DBFS files. You can store notebooks and DBFS files locally and create a stack configuration JSON template that defines mappings from your local files to paths in your Databricks workspace, along with configurations of jobs that run the notebooks.

Use the stack CLI with the stack configuration JSON template to deploy and manage your stack.

You run Databricks stack CLI subcommands by appending them to databricks stack.

Bash
databricks stack --help
Usage: databricks stack [OPTIONS] COMMAND [ARGS]...

  [Beta] Utility to deploy and download Databricks resource stacks.

Options:
  -v, --version   [VERSION]
  --debug         Debug Mode. Shows full stack trace on error.
  --profile TEXT  CLI connection profile to use. The default profile is
                  "DEFAULT".
  -h, --help      Show this message and exit.

Commands:
  deploy    Deploy a stack of resources given a JSON configuration of the stack
    Usage: databricks stack deploy [OPTIONS] CONFIG_PATH
    Options:
      -o, --overwrite  Include to overwrite existing workspace notebooks and DBFS
                       files [default: False]
  download  Download workspace notebooks of a stack to the local filesystem
            given a JSON stack configuration template.
    Usage: databricks stack download [OPTIONS] CONFIG_PATH
    Options:
      -o, --overwrite  Include to overwrite existing workspace notebooks in the
                       local filesystem [default: False]

Deploy a stack to a workspace

This subcommand deploys a stack. See Stack setup to learn how to set up a stack.

Bash
databricks stack deploy ./config.json

Stack configuration JSON template gives an example of config.json.

Download stack notebook changes

This subcommand downloads the notebooks of a stack.

Bash
databricks stack download ./config.json

Examples

Stack setup

File structure of an example stack

Bash
tree
.
├── notebooks
│   ├── common
│   │   └── notebook.scala
│   └── config
│       ├── environment.scala
│       └── setup.sql
├── lib
│   └── library.jar
└── config.json

This example stack contains a main notebook in notebooks/common/notebook.scala along with configuration notebooks in the notebooks/config folder. There is a JAR library dependency of the stack in lib/library.jar. config.json is the stack configuration JSON template of the stack. This is what is passed into the stack CLI for deployment of the stack.

Stack configuration JSON template

The stack configuration template describes the stack configuration.

Bash
cat config.json
JSON
{
  "name": "example-stack",
  "resources": [
    {
      "id": "example-workspace-notebook",
      "service": "workspace",
      "properties": {
        "source_path": "notebooks/common/notebook.scala",
        "path": "/Users/example@example.com/dev/notebook",
        "object_type": "NOTEBOOK"
      }
    },
    {
      "id": "example-workspace-config-dir",
      "service": "workspace",
      "properties": {
        "source_path": "notebooks/config",
        "path": "/Users/example@example.com/dev/config",
        "object_type": "DIRECTORY"
      }
    },
    {
      "id": "example-dbfs-library",
      "service": "dbfs",
      "properties": {
        "source_path": "lib/library.jar",
        "path": "dbfs:/tmp/lib/library.jar",
        "is_dir": false
      }
    },
    {
      "id": "example-job",
      "service": "jobs",
      "properties": {
        "name": "Example Stack CLI Job",
        "new_cluster": {
          "spark_version": "7.3.x-scala2.12",
          "node_type_id": "i3.xlarge",
          "aws_attributes": {
            "availability": "SPOT"
          },
          "num_workers": 3
        },
        "timeout_seconds": 7200,
        "max_retries": 1,
        "notebook_task": {
          "notebook_path": "/Users/example@example.com/dev/notebook"
        }
      }
    }
  ]
}

Each job, workspace notebook, workspace directory, DBFS file, or DBFS directory is defined as a ResourceConfig. Each ResourceConfig that represents a workspace or DBFS asset contains a mapping from the file or directory where it exists locally (source_path) to where it would exist in the workspace or DBFS (path).

Stack configuration template schema outlines the schema for the stack configuration template.
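
To make the source_path-to-path mapping concrete, here is a minimal Python sketch that lists the mappings in a stack configuration. The helper name list_mappings and the directory /home/me/stack are hypothetical and not part of the stack CLI. The sketch assumes that a relative source_path is resolved against the directory that holds the stack configuration template.

```python
import json
import os


def list_mappings(config, config_dir):
    """Return (id, local_path, remote_path) for each workspace or DBFS resource.

    Hypothetical helper for illustration; not part of the stack CLI.
    """
    mappings = []
    for resource in config["resources"]:
        if resource["service"] not in ("workspace", "dbfs"):
            continue  # jobs have no local source file to map
        props = resource["properties"]
        source = props["source_path"]
        if not os.path.isabs(source):
            # Assumption: relative paths are resolved against the directory
            # that holds the stack configuration template.
            source = os.path.normpath(os.path.join(config_dir, source))
        mappings.append((resource["id"], source, props["path"]))
    return mappings


example = json.loads("""
{
  "name": "example-stack",
  "resources": [
    {"id": "example-workspace-notebook", "service": "workspace",
     "properties": {"source_path": "notebooks/common/notebook.scala",
                    "path": "/Users/example@example.com/dev/notebook",
                    "object_type": "NOTEBOOK"}},
    {"id": "example-job", "service": "jobs",
     "properties": {"name": "Example Stack CLI Job"}}
  ]
}
""")

for resource_id, local_path, remote_path in list_mappings(example, "/home/me/stack"):
    print(f"{resource_id}: {local_path} -> {remote_path}")
```

The jobs resource is skipped because it carries no source_path; only workspace and DBFS resources map a local file or directory to a remote path.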

Deploy a stack

You deploy a stack using the databricks stack deploy <configuration-file> command.

Bash
databricks stack deploy ./config.json

During stack deployment, the DBFS and workspace assets are uploaded to your Databricks workspace and jobs are created.

At stack deploy time, a StackStatus JSON file for the deployment is saved in the same directory as the stack configuration template. Its name is the template's name with deployed inserted immediately before the .json extension (for example, ./config.deployed.json). The stack CLI uses this file to keep track of resources previously deployed to your workspace.

Stack status schema outlines the schema of a stack status file.
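
The naming rule for the status file can be sketched in a few lines of Python. The function status_file_path is a hypothetical helper written for illustration; it is not how the stack CLI is necessarily implemented, and it only mirrors the rule stated above.

```python
import os


def status_file_path(config_path):
    """Derive the stack status file path from the configuration file path.

    Hypothetical helper: inserts "deployed" immediately before the
    file extension, following the naming rule described in the text.
    """
    root, ext = os.path.splitext(config_path)
    return f"{root}.deployed{ext}"


print(status_file_path("./config.json"))  # ./config.deployed.json
```

A helper like this can be useful in scripts that need to locate the status file, for example to delete it after a deployment error as recommended in this article.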

important

Do not attempt to edit or move the stack status file. If you get any errors regarding the stack status file, delete the file and try the deployment again.

Bash
cat ./config.deployed.json
JSON
{
  "cli_version": "0.8.3",
  "deployed_output": [
    {
      "id": "example-workspace-notebook",
      "databricks_id": {
        "path": "/Users/example@example.com/dev/notebook"
      },
      "service": "workspace"
    },
    {
      "id": "example-workspace-config-dir",
      "databricks_id": {
        "path": "/Users/example@example.com/dev/config"
      },
      "service": "workspace"
    },
    {
      "id": "example-dbfs-library",
      "databricks_id": {
        "path": "dbfs:/tmp/lib/library.jar"
      },
      "service": "dbfs"
    },
    {
      "id": "example-job",
      "databricks_id": {
        "job_id": 123456
      },
      "service": "jobs"
    }
  ],
  "name": "example-stack"
}

Data structures

Stack configuration template schema

StackConfig

These are the outer fields of a stack configuration template. All fields are required.

| Field Name | Type | Description |
| --- | --- | --- |
| name | STRING | The name of the stack. |
| resources | List of ResourceConfig | An asset in Databricks. Resources are related to three services (REST API namespaces): workspace, jobs, and dbfs. |

ResourceConfig

The fields for each ResourceConfig. All fields are required.

| Field Name | Type | Description |
| --- | --- | --- |
| id | STRING | A unique ID for the resource. Uniqueness of ResourceConfig is enforced. |
| service | ResourceService | The REST API service that the resource operates on. One of: jobs, workspace, or dbfs. |
| properties | ResourceProperties | The fields in this structure differ depending on the ResourceConfig service. |
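
As an illustration of the StackConfig and ResourceConfig rules, the following Python sketch checks a configuration for missing required fields, unknown services, and duplicate resource IDs. The function validate_stack_config and its message wording are hypothetical; the stack CLI's own validation may behave differently.

```python
def validate_stack_config(config):
    """Return a list of problems found in a stack configuration dict.

    Hypothetical checker based on the schema tables in this article;
    the stack CLI's own validation may differ.
    """
    problems = []
    for field in ("name", "resources"):
        if field not in config:
            problems.append(f"missing top-level field: {field}")
    seen_ids = set()
    for index, resource in enumerate(config.get("resources", [])):
        for field in ("id", "service", "properties"):
            if field not in resource:
                problems.append(f"resource {index}: missing field: {field}")
        # A missing service is already reported above, so only flag unknown values.
        if resource.get("service") not in (None, "jobs", "workspace", "dbfs"):
            problems.append(f"resource {index}: unknown service: {resource['service']}")
        resource_id = resource.get("id")
        if resource_id in seen_ids:
            problems.append(f"resource {index}: duplicate id: {resource_id}")
        if resource_id is not None:
            seen_ids.add(resource_id)
    return problems


config = {
    "name": "example-stack",
    "resources": [
        {"id": "a", "service": "workspace", "properties": {}},
        {"id": "a", "service": "queue", "properties": {}},
    ],
}
for problem in validate_stack_config(config):
    print(problem)
```

Returning a list of problems rather than raising on the first one lets a script report every issue in a template in a single pass.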

ResourceProperties

The properties of a resource by ResourceService. The fields are classified as those used or not used in a Databricks REST API. All the fields listed are required.

| service | Fields from the REST API used in the Stack CLI | Fields used only in the Stack CLI |
| --- | --- | --- |
| workspace | path: STRING - Remote workspace path of the notebook or directory (for example, /Users/example@example.com/notebook). object_type: Workspace API - Notebook object type. Can only be NOTEBOOK or DIRECTORY. | source_path: STRING - Local source path of workspace notebooks or directories. A relative path to the stack configuration template file or an absolute path in your filesystem. |
| jobs | Any field in the settings or new_settings structure. The only field not required in the settings or new_settings structure but required for the stack CLI is name: STRING - Name of the job to be deployed. To avoid creating too many duplicate jobs, the Stack CLI enforces unique names in stack deployed jobs. | None. |
| dbfs | path: STRING - Matching remote DBFS path. Must start with dbfs:/ (for example, dbfs:/this/is/a/sample/path). is_dir: BOOL - Whether a DBFS path is a directory or a file. | source_path: STRING - Local source path of DBFS files or directories. A relative path to the stack config template file or an absolute path in your filesystem. |
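
The per-service rules in the table above can be expressed as a small Python check. This is a sketch under stated assumptions: check_properties and REQUIRED_PROPERTIES are hypothetical names, and for jobs only the name field is checked because the remaining fields come from the Jobs API settings structure, which is not reproduced here.

```python
# Properties this article lists as required for each service.
# For jobs, only "name" is checked; other fields belong to the Jobs API.
REQUIRED_PROPERTIES = {
    "workspace": ("source_path", "path", "object_type"),
    "dbfs": ("source_path", "path", "is_dir"),
    "jobs": ("name",),
}


def check_properties(service, properties):
    """Return problems for one resource's properties (hypothetical helper)."""
    problems = [
        f"missing property: {name}"
        for name in REQUIRED_PROPERTIES[service]
        if name not in properties
    ]
    if service == "workspace" and "object_type" in properties:
        if properties["object_type"] not in ("NOTEBOOK", "DIRECTORY"):
            problems.append("object_type must be NOTEBOOK or DIRECTORY")
    if service == "dbfs":
        if "path" in properties and not properties["path"].startswith("dbfs:/"):
            problems.append("path must start with dbfs:/")
        if "is_dir" in properties and not isinstance(properties["is_dir"], bool):
            problems.append("is_dir must be a boolean")
    return problems


print(check_properties("dbfs", {"source_path": "lib/library.jar",
                                "path": "/tmp/lib/library.jar",
                                "is_dir": False}))
```

In this usage example the DBFS path lacks the dbfs:/ prefix, so the check reports it.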

ResourceService

Each resource belongs to a specific service that aligns with the Databricks REST API. These are the services that are supported by the Stack CLI.

| Service | Description |
| --- | --- |
| workspace | A workspace notebook or directory. |
| jobs | A Databricks job. |
| dbfs | A DBFS file or directory. |

Stack status schema

StackStatus

A stack status file is created after a stack is deployed using the CLI. The top-level fields are:

| Field Name | Type | Description |
| --- | --- | --- |
| name | STRING | The name of the stack. This field is the same field as in StackConfig. |
| cli_version | STRING | The version of the Databricks CLI used to deploy the stack. |
| deployed_resources | List of ResourceStatus | The status of each deployed resource. For each resource defined in StackConfig, a corresponding ResourceStatus is generated here. |

ResourceStatus

| Field Name | Type | Description |
| --- | --- | --- |
| id | STRING | A stack-unique ID for the resource. |
| service | ResourceService | The REST API service that the resource operates on. One of: jobs, workspace, or dbfs. |
| databricks_id | DatabricksId | The physical ID of the deployed resource. The actual schema depends on the type (service) of the resource. |

DatabricksId

A JSON object whose field depends on the service.

| Service | Field in JSON | Type | Description |
| --- | --- | --- | --- |
| workspace | path | STRING | The absolute path of the notebook or directory in a Databricks workspace. Naming is consistent with the Workspace API. |
| jobs | job_id | STRING | The job ID as shown in a Databricks workspace. This can be used to update jobs already deployed. |
| dbfs | path | STRING | The absolute path of the file or directory in DBFS. Naming is consistent with the DBFS API. |
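
To show how the ResourceStatus and DatabricksId structures fit together, the following Python sketch indexes a stack status file by resource ID. The helper deployed_ids is hypothetical. It looks for the resource list under both deployed_output and deployed_resources, because the example status file in this article uses the former while the StackStatus table lists the latter; which key a given CLI version writes is an assumption to verify against your own status file.

```python
import json


def deployed_ids(status):
    """Map each resource id to its (service, databricks_id) pair.

    Hypothetical helper. The resource list is looked up under both key
    names that appear in this article ("deployed_output" in the example
    file, "deployed_resources" in the schema table).
    """
    entries = status.get("deployed_output", status.get("deployed_resources", []))
    return {entry["id"]: (entry["service"], entry["databricks_id"]) for entry in entries}


status = json.loads("""
{
  "cli_version": "0.8.3",
  "deployed_output": [
    {"id": "example-dbfs-library",
     "databricks_id": {"path": "dbfs:/tmp/lib/library.jar"},
     "service": "dbfs"},
    {"id": "example-job",
     "databricks_id": {"job_id": 123456},
     "service": "jobs"}
  ],
  "name": "example-stack"
}
""")

ids = deployed_ids(status)
service, databricks_id = ids["example-job"]
print(service, databricks_id)
```

The shape of databricks_id depends on the service: a path for workspace and dbfs resources, and a job_id for jobs.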