Authenticate access to Databricks using OAuth token federation
This article guides you through configuring OAuth federation to access Databricks account and workspace resources using tokens from your identity provider.
What is Databricks OAuth token federation?
Databricks OAuth token federation enables you to securely access Databricks APIs using tokens from your identity provider (IdP). OAuth token federation eliminates the need to manage Databricks secrets such as personal access tokens and Databricks OAuth client secrets.
Using Databricks OAuth token federation, users and service principals exchange JSON Web Tokens (JWTs) from your identity provider for Databricks OAuth tokens, which can then be used to access Databricks APIs.
Databricks supports two types of token federation:
- Account-wide token federation enables all users and service principals in your Databricks account to access Databricks APIs using tokens from your identity provider. Account-wide token federation allows you to centralize the management of token issuance policies in your identity provider, and is typically used in combination with SCIM, so users in your identity provider are synchronized into your Databricks account.
- Workload identity federation allows your automated workloads running outside of Databricks to access Databricks APIs without the need for Databricks secrets. With workload identity federation, your application (workload) authenticates to Databricks as a Databricks service principal using tokens issued by the workload runtime.
Account-wide token federation
Account admins can configure OAuth token federation in the Databricks account using an account federation policy. An account federation policy enables all users and service principals in your Databricks account to access Databricks APIs using tokens from your identity provider. An account federation policy specifies:
- The identity provider or issuer from which Databricks will accept tokens.
- The criteria for mapping a token to the corresponding Databricks user or service principal.
To configure an account federation policy, provide the following:
- The required token issuer, specified in the iss claim of your tokens. The issuer is an HTTPS URL that identifies your identity provider.
- The allowed token audiences, specified in the aud claim of your tokens. This identifier represents the recipient of the token. As long as the audience in the token matches at least one audience in the policy, the token is considered a match. If unspecified, the default value is your Databricks account ID.
- The subject claim. This indicates which token claim contains the Databricks username of the user the token was issued for. If unspecified, the default value is sub.
- Optionally, the public keys used to validate the signature of your tokens, in JSON Web Key Set (JWKS) format. If unspecified (recommended), Databricks automatically fetches the public keys from your issuer’s well-known endpoint. Databricks strongly recommends relying on your issuer’s well-known endpoint for discovering public keys.
Note: If you do not specify a JWKS in your federation policy (recommended), your identity provider must serve OpenID Provider Metadata at {issuer-url}/.well-known/openid-configuration. The OpenID Provider Metadata must include a jwks_uri that specifies the location of the public keys used to verify token signatures.
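To illustrate the discovery path described in the note, the following Python sketch derives the metadata URL from an issuer and reads jwks_uri from a metadata document that has already been fetched. The helper names are hypothetical, and the sample metadata (including its jwks path) is invented for illustration. This is not Databricks' implementation.

```python
import json


def openid_configuration_url(issuer: str) -> str:
    """Return the OpenID Provider Metadata URL for an issuer (hypothetical helper)."""
    # Avoid a double slash when the issuer is written with a trailing "/".
    return issuer.rstrip("/") + "/.well-known/openid-configuration"


def jwks_uri_from_metadata(metadata_json: str) -> str:
    """Extract jwks_uri from a metadata document, failing loudly if it is absent."""
    metadata = json.loads(metadata_json)
    jwks_uri = metadata.get("jwks_uri")
    if not jwks_uri:
        raise ValueError("OpenID Provider Metadata does not include jwks_uri")
    return jwks_uri


# Illustrative metadata only; a real document is served by your identity provider.
sample_metadata = json.dumps({
    "issuer": "https://idp.mycompany.com/oidc",
    "jwks_uri": "https://idp.mycompany.com/oidc/jwks",
})

print(openid_configuration_url("https://idp.mycompany.com/oidc"))
print(jwks_uri_from_metadata(sample_metadata))
```

If the second call raises ValueError for your provider's metadata, the policy would likely need an explicit JWKS instead of relying on discovery.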
The following is an example account federation policy:
issuer: "https://idp.mycompany.com/oidc"
audiences: ["databricks"]
subject_claim: "sub"
The following example JWT body matches the above policy and can be used to authenticate to Databricks as user username@mycompany.com:
{
"iss": "https://idp.mycompany.com/oidc",
"aud": "databricks",
"sub": "username@mycompany.com"
}
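The matching rules above can be sketched in Python as a rough mental model. The function name and its parameters are hypothetical, and the sketch covers only the issuer, audience, and subject-claim checks described in this section. Signature verification and any other validation that Databricks performs are deliberately left out.

```python
from typing import Any, Dict, List, Optional


def matches_account_policy(
    claims: Dict[str, Any],
    issuer: str,
    account_id: str,
    audiences: Optional[List[str]] = None,
    subject_claim: str = "sub",
) -> Optional[str]:
    """Return the mapped Databricks username if the claims match, else None."""
    if claims.get("iss") != issuer:
        return None
    # If the policy lists no audiences, the default is the Databricks account ID.
    allowed = set(audiences) if audiences else {account_id}
    # The aud claim may be a single string or a list of strings.
    token_aud = claims.get("aud", [])
    if isinstance(token_aud, str):
        token_aud = [token_aud]
    # One shared audience is enough for the token to match.
    if not allowed.intersection(token_aud):
        return None
    username = claims.get(subject_claim)
    return username if isinstance(username, str) and username else None


example_claims = {
    "iss": "https://idp.mycompany.com/oidc",
    "aud": "databricks",
    "sub": "username@mycompany.com",
}
print(matches_account_policy(
    example_claims,
    issuer="https://idp.mycompany.com/oidc",
    account_id="<account-id>",  # placeholder, unused here because audiences is set
    audiences=["databricks"],
))
```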
Configure an account federation policy
Account admins can configure an account federation policy using the Databricks CLI (version 0.239.0 and above) or the Databricks API. You can specify up to five account federation policies in your Databricks account.
Using the Databricks CLI:
- Install or update to the newest version of the Databricks CLI.
- As an account admin, authenticate to your Databricks account using the CLI, specifying the ACCOUNT_CONSOLE_URL (for example, https://accounts.cloud.databricks.com) and your Databricks ACCOUNT_ID:
databricks auth login --host ${ACCOUNT_CONSOLE_URL} --account-id ${ACCOUNT_ID}
- Create the account federation policy. For example:
databricks account federation-policy create --json \
'{
"oidc_policy": {
"issuer": "https://idp.mycompany.com/oidc",
"audiences": [
"databricks"
],
"subject_claim": "sub"
}
}'
The following is an example Databricks REST API call to create an account federation policy:
curl --request POST \
--header "Authorization: Bearer $TOKEN" \
"https://accounts.cloud.databricks.com/api/2.0/accounts/${ACCOUNT_ID}/federationPolicies" \
--data '{
"oidc_policy": {
"issuer": "https://idp.mycompany.com/oidc",
"audiences": [
"databricks"
],
"subject_claim": "sub"
}
}'
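For scripts that do not shell out to curl, the same request can be assembled with the Python standard library. This is a minimal sketch: the helper name is hypothetical, the Content-Type header is an addition that does not appear in the curl example, and the request is built but not sent.

```python
import json
import urllib.request


def build_create_policy_request(
    account_id: str, token: str, oidc_policy: dict
) -> urllib.request.Request:
    """Build (but do not send) the POST request shown in the curl example."""
    url = (
        "https://accounts.cloud.databricks.com"
        f"/api/2.0/accounts/{account_id}/federationPolicies"
    )
    body = json.dumps({"oidc_policy": oidc_policy}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            # Assumption: declare the JSON body explicitly.
            "Content-Type": "application/json",
        },
    )


request = build_create_policy_request(
    account_id="<account-id>",  # placeholder
    token="<token>",            # placeholder
    oidc_policy={
        "issuer": "https://idp.mycompany.com/oidc",
        "audiences": ["databricks"],
        "subject_claim": "sub",
    },
)
print(request.get_method(), request.full_url)
# To send it: urllib.request.urlopen(request)
```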
You might need to configure your identity provider to generate tokens for your users to exchange with Databricks. See the documentation for your identity provider for instructions.
Example account federation policies
Federation policy | Example matching token |
---|---|
issuer: "https://idp.mycompany.com/oidc" audiences: ["2ff814a6-3304-4ab8-85cb-cd0e6f879c1d"] | { "iss": "https://idp.mycompany.com/oidc", "aud": "2ff814a6-3304-4ab8-85cb-cd0e6f879c1d", "sub": "username@mycompany.com" } |
issuer: "https://idp.mycompany.com/oidc" audiences: ["2ff814a6-3304-4ab8-85cb-cd0e6f879c1d"] subject_claim: "preferred_username" | { "iss": "https://idp.mycompany.com/oidc", "aud": ["2ff814a6-3304-4ab8-85cb-cd0e6f879c1d", "other-audience"], "preferred_username": "username@mycompany.com", "sub": "some-other-ignored-value" } |
issuer: "https://idp.mycompany.com/oidc" audiences: ["2ff814a6-3304-4ab8-85cb-cd0e6f879c1d"] jwks_json: {"keys":[{"kty":"RSA","e":"AQAB","use":"sig", "kid":"<key-id>","alg":"RS256", "n":"uPUViFv..."}]} | { "iss": "https://idp.mycompany.com/oidc", "aud": "2ff814a6-3304-4ab8-85cb-cd0e6f879c1d", "sub": "username@mycompany.com" } (signature verified using public key in policy) |
Workload identity federation
Workload identity federation allows your automated workloads running outside of Databricks to access Databricks APIs without the need for Databricks secrets. Account admins can configure workload identity federation using a service principal federation policy.
A service principal federation policy is associated with a service principal in your Databricks account, and specifies:
- The identity provider (or issuer) from which the service principal can authenticate.
- The workload identity (or subject) that is permitted to authenticate as the Databricks service principal.
To configure a service principal federation policy, provide the following:
- The required token issuer, specified in the iss claim of workload identity tokens. The issuer is an HTTPS URL that identifies the workload identity provider.
- The required token subject, specified in the sub claim of workload identity tokens. The subject uniquely identifies the workload in the workload runtime environment.
- The allowed token audiences, specified in the aud claim of workload identity tokens. The audience represents the recipient of the token. As long as the audience in the token matches at least one audience in the policy, the token is considered a match. If unspecified, the default value is your Databricks account ID.
- Optionally, the public keys used to validate the signature of the workload identity tokens, in JSON Web Key Set (JWKS) format. If unspecified (recommended), Databricks automatically fetches the public keys from the issuer’s well-known endpoint. Databricks strongly recommends relying on the issuer’s well-known endpoint for discovering public keys.
- Optionally, the subject claim. This indicates which token claim contains the workload identity (or subject) of the token. If it is unspecified, the default value is sub. Databricks strongly recommends using the default sub claim for workload identity federation. A claim other than sub should only be used in cases where the sub claim is not an appropriate or stable subject identifier, which is uncommon. See Example service principal federation policies below for details.
Note: If you do not specify a JWKS in your federation policy (recommended), your identity provider must serve OpenID Provider Metadata at {issuer-url}/.well-known/openid-configuration. The OpenID Provider Metadata must include a jwks_uri that specifies the location of the public keys used to verify token signatures.
The following is an example service principal federation policy for a GitHub Actions workload:
issuer: "https://token.actions.githubusercontent.com"
audiences: ["https://github.com/my-github-org"]
subject: "repo:my-github-org/my-repo:environment:prod"
The following example JWT body matches the above policy and can be used to authenticate to Databricks:
{
"iss": "https://token.actions.githubusercontent.com",
"aud": "https://github.com/my-github-org",
"sub": "repo:my-github-org/my-repo:environment:prod"
}
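Unlike an account policy, a service principal policy pins one exact subject. The following Python sketch illustrates that difference. The function name and parameters are hypothetical, and signature verification and any other checks Databricks performs are omitted.

```python
from typing import Any, Dict, List, Optional


def matches_service_principal_policy(
    claims: Dict[str, Any],
    issuer: str,
    subject: str,
    account_id: str,
    audiences: Optional[List[str]] = None,
    subject_claim: str = "sub",
) -> bool:
    """Return True if the workload token claims satisfy the policy fields."""
    if claims.get("iss") != issuer:
        return False
    # If the policy lists no audiences, the default is the Databricks account ID.
    allowed = set(audiences) if audiences else {account_id}
    token_aud = claims.get("aud", [])
    if isinstance(token_aud, str):
        token_aud = [token_aud]
    if not allowed.intersection(token_aud):
        return False
    # The claim named by subject_claim must equal the policy subject exactly.
    return claims.get(subject_claim) == subject


policy = {
    "issuer": "https://token.actions.githubusercontent.com",
    "audiences": ["https://github.com/my-github-org"],
    "subject": "repo:my-github-org/my-repo:environment:prod",
}
prod_token = {
    "iss": "https://token.actions.githubusercontent.com",
    "aud": "https://github.com/my-github-org",
    "sub": "repo:my-github-org/my-repo:environment:prod",
}
# A token from a different environment of the same repository (illustrative).
staging_token = dict(prod_token, sub="repo:my-github-org/my-repo:environment:staging")

print(matches_service_principal_policy(prod_token, account_id="<account-id>", **policy))
print(matches_service_principal_policy(staging_token, account_id="<account-id>", **policy))
```

The first call prints True and the second prints False, because only the subject differs between the two tokens.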
Configure a service principal federation policy
Account admins can configure a service principal federation policy using the Databricks CLI (version 0.239.0 and above) or the Databricks API. You can create up to five service principal federation policies per Databricks service principal.
Using the Databricks CLI:
- Install or update to the newest version of the Databricks CLI.
- As an account admin, authenticate to your Databricks account using the CLI, specifying the ACCOUNT_CONSOLE_URL (for example, https://accounts.cloud.databricks.com) and your Databricks ACCOUNT_ID:
databricks auth login --host ${ACCOUNT_CONSOLE_URL} --account-id ${ACCOUNT_ID}
- Create the service principal federation policy, specifying the service principal application ID (for example, 3659993829438643). The following is an example for a GitHub Actions workload:
databricks account service-principal-federation-policy create ${SERVICE_PRINCIPAL_ID} --json \
'{
"oidc_policy": {
"issuer": "https://token.actions.githubusercontent.com",
"audiences": [
"https://github.com/my-github-org"
],
"subject": "repo:my-github-org/my-repo:environment:prod"
}
}'
The following is an example Databricks REST API call to create a policy for a GitHub Actions workload:
curl --request POST \
--header "Authorization: Bearer $TOKEN" \
"https://accounts.cloud.databricks.com/api/2.0/accounts/${ACCOUNT_ID}/servicePrincipals/${SERVICE_PRINCIPAL_ID}/federationPolicies" \
--data '{
"oidc_policy": {
"issuer": "https://token.actions.githubusercontent.com",
"audiences": [
"https://github.com/my-github-org"
],
"subject": "repo:my-github-org/my-repo:environment:prod"
}
}'
Example Databricks account and service principal federation policies
Tool | Federation policy | Example matching token |
---|---|---|
GitHub Actions | issuer: "https://token.actions.githubusercontent.com" audiences: ["https://github.com/<github-org>"] subject: "repo:<github-org>/<repo>:environment:prod" | { "iss": "https://token.actions.githubusercontent.com", "aud": "https://github.com/<github-org>", "sub": "repo:<github-org>/<repo>:environment:prod" } |
Kubernetes | issuer: "https://kubernetes.default.svc" audiences: ["https://kubernetes.default.svc"] subject: "system:serviceaccount:namespace:podname" jwks_json: {"keys":[{"kty":"RSA","e":"AQAB","use":"sig", "kid":"<key-id>","alg":"RS256","n":"uPUViFv..."}]} | { "iss": "https://kubernetes.default.svc", "aud": ["https://kubernetes.default.svc"], "sub": "system:serviceaccount:namespace:podname" } |
Azure DevOps | issuer: "https://vstoken.dev.azure.com/<org_id>" audiences: ["api://AzureADTokenExchange"] subject: "sc://my-org/my-project/my-connection" | { "iss": "https://vstoken.dev.azure.com/<org_id>", "aud": "api://AzureADTokenExchange", "sub": "sc://my-org/my-project/my-connection" } |
GitLab | issuer: "https://gitlab.example.com" audiences: ["https://gitlab.example.com"] subject: "project_path:my-group/my-project:..." | { "iss": "https://gitlab.example.com", "aud": "https://gitlab.example.com", "sub": "project_path:my-group/my-project:..." } |
CircleCI | issuer: "https://oidc.circleci.com/org/<org_id>" audiences: ["<org_id>"] subject: "7cc1d11b-46c8-4eb2-9482-4c56a910c7ce" subject_claim: "oidc.circleci.com/project-id" | { "iss": "https://oidc.circleci.com/org/<org_id>", "aud": "<org_id>", "oidc.circleci.com/project-id": "7cc1d11b-46c8-4eb2-9482-4c56a910c7ce" } |
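When writing a policy for a tool in the table above, it can help to look at the claims your workload runtime actually emits. The following Python sketch decodes the payload of a compact JWT without verifying its signature, so it is suitable for inspection only and never for authentication. The helper names and the sample token are illustrative.

```python
import base64
import json


def decode_jwt_claims_unverified(token: str) -> dict:
    """Decode the payload of a compact JWT WITHOUT verifying the signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected a compact JWT with three dot-separated parts")
    payload = parts[1]
    # base64url encoding strips '=' padding; restore it before decoding.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))


def _b64url(data: dict) -> str:
    """Encode a dict as unpadded base64url JSON (used to build the sample token)."""
    raw = json.dumps(data).encode("utf-8")
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


# A sample token with a placeholder signature, for demonstration only.
sample_token = ".".join([
    _b64url({"alg": "RS256", "typ": "JWT"}),
    _b64url({
        "iss": "https://token.actions.githubusercontent.com",
        "aud": "https://github.com/my-github-org",
        "sub": "repo:my-github-org/my-repo:environment:prod",
    }),
    "signature-placeholder",
])

claims = decode_jwt_claims_unverified(sample_token)
for name in ("iss", "aud", "sub"):
    print(f"{name}: {claims[name]}")
```

The printed iss, aud, and sub values are the ones to copy into the issuer, audiences, and subject fields of the policy.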
After you have configured a federation policy for your account, you can use a JWT from your identity provider to access the Databricks API. To do so, first exchange the JWT from your identity provider for a Databricks OAuth token, and then send the Databricks OAuth token as a Bearer token in the Authorization header of the API call. Tokens must be valid JWTs that are signed using the RS256 or ES256 algorithm.
For guidance on this process, see Use an identity provider token to authenticate to Databricks.
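As a rough sketch of the two steps, the following Python code builds a token exchange request and the header for a later API call. Several details are assumptions not stated in this article: the /oidc/v1/token path, the all-apis scope, and the form parameters, which follow the OAuth 2.0 token exchange grant (RFC 8693). The host name and helper names are hypothetical. Confirm the exact request shape in the linked guide before relying on it.

```python
import urllib.parse
import urllib.request


def build_token_exchange_request(
    databricks_host: str, idp_jwt: str
) -> urllib.request.Request:
    """Build (but do not send) a token exchange request for an IdP-issued JWT."""
    form = {
        # Parameter names and URNs follow RFC 8693; the scope is an assumption.
        "grant_type": "urn:ietf:params:oauth:grant-type:token-exchange",
        "subject_token": idp_jwt,
        "subject_token_type": "urn:ietf:params:oauth:token-type:jwt",
        "scope": "all-apis",
    }
    return urllib.request.Request(
        # Assumed endpoint path; verify against the linked guide.
        f"{databricks_host.rstrip('/')}/oidc/v1/token",
        data=urllib.parse.urlencode(form).encode("ascii"),
        method="POST",
    )


def bearer_headers(databricks_oauth_token: str) -> dict:
    """Headers for an API call that uses the exchanged Databricks OAuth token."""
    return {"Authorization": f"Bearer {databricks_oauth_token}"}


exchange = build_token_exchange_request(
    "https://my-workspace.cloud.databricks.com",  # hypothetical host
    "<jwt-from-your-identity-provider>",          # placeholder
)
print(exchange.get_method(), exchange.full_url)
print(bearer_headers("<databricks-oauth-token>"))
```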