
Use Unity Catalog service credentials to connect to external cloud services

This article describes how to use a service credential in Unity Catalog to connect to external cloud services. A service credential object in Unity Catalog encapsulates a long-term cloud credential that provides access to an external cloud service that users need to connect to from Databricks.

Before you begin

Before you can use a service credential to connect to an external cloud service, you must have:

  • A Databricks workspace that is enabled for Unity Catalog.

  • A compute resource running Databricks Runtime 16.2 or above.

    SQL warehouses are not supported.

    The Public Preview version of service credentials is available on Databricks Runtime 15.4 LTS and above, with Python support but no Scala support.

  • A service credential created in your Unity Catalog metastore that gives access to the cloud service.

  • The ACCESS privilege on the service credential or ownership of the service credential.
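If you own the service credential, you can grant the ACCESS privilege to other principals. A sketch of the SQL statement, with placeholder credential and principal names:

```sql
GRANT ACCESS ON SERVICE CREDENTIAL `your-service-credential` TO `user@example.com`;
```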

Use a service credential in your code

This section provides examples of using service credentials in a notebook. Replace placeholder values. These examples don’t necessarily show the installation of required libraries, which depend on the client service you want to access.
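For example, boto3 ships preinstalled on many Databricks Runtime versions; if the library you need is missing from your cluster, you can install it at the top of the notebook (a notebook magic command, not standalone Python):

```
%pip install boto3
```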

Python example: Configure a boto3 session to use a specific service credential

Python
import boto3

boto3_session = boto3.Session(
    botocore_session=dbutils.credentials.getServiceCredentialsProvider('your-service-credential'),
    region_name='your-aws-region'
)
sm = boto3_session.client('secretsmanager')
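Once the session is authenticated, you use the client as you normally would. The helper below is a hypothetical sketch (the function name and secret ID are placeholders; `get_secret_value` and the `SecretString` response field are the standard boto3 Secrets Manager API):

```python
def read_secret(sm_client, secret_id: str) -> str:
    """Fetch a secret string from AWS Secrets Manager using an authenticated client."""
    response = sm_client.get_secret_value(SecretId=secret_id)
    return response["SecretString"]

# Example usage with the client created above:
# db_password = read_secret(sm, 'your-secret-name')
```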

Scala example: Configure an AWS Java SDK session to use a specific service credential

This example uses a service credential to provide access to Amazon S3 using the AWS SDK for Java.

Scala
import com.amazonaws.auth.AWSCredentialsProvider
import com.amazonaws.services.s3.AmazonS3ClientBuilder
import com.amazonaws.services.s3.model.ListObjectsV2Request

import com.databricks.dbutils_v1.DBUtilsHolder
val dbutils = DBUtilsHolder.dbutils

// Obtain the AWS credentials provider. The asInstanceOf cast prevents a type mismatch
val awsCredentialsProvider = dbutils.credentials.getServiceCredentialsProvider("your-service-credential").asInstanceOf[AWSCredentialsProvider]

// Create an S3 client using the credentials provider
val s3Client = AmazonS3ClientBuilder.standard()
  .withCredentials(awsCredentialsProvider)
  .withRegion("us-east-1") // Specify your AWS region
  .build()

// List objects in an S3 bucket
val bucketName = "your-bucket"
val request = new ListObjectsV2Request().withBucketName(bucketName)
val result = s3Client.listObjectsV2(request)

result.getObjectSummaries.forEach { summary =>
  println(s" - ${summary.getKey}")
}

Specify a default service credential for a compute resource

You can optionally specify a default service credential for an all-purpose or jobs compute resource by setting an environment variable. If your code doesn't specify a credential, the SDK falls back to that default service credential. Users still require the ACCESS privilege on that service credential to connect to the external cloud service. Databricks does not recommend this approach, because naming the service credential explicitly keeps your code more portable.

note

Serverless compute and SQL warehouses don’t support environment variables, and therefore they don’t support default service credentials.

  1. Open the edit page for the cluster.

    See Manage compute.

  2. Click Advanced at the bottom of the page and go to the Spark tab.

  3. Add the following entry in Environment variables, replacing <your-service-credential>:

    DATABRICKS_DEFAULT_SERVICE_CREDENTIAL_NAME=<your-service-credential>

The following code samples do not specify a service credential. Instead, they use the service credential specified in the DATABRICKS_DEFAULT_SERVICE_CREDENTIAL_NAME environment variable:

Python
import boto3
sm = boto3.client('secretsmanager', region_name='your-aws-region')

Compare this to the earlier example, Python example: Configure a boto3 session to use a specific service credential, which names the credential explicitly:

Python
boto3_session = boto3.Session(
    botocore_session=dbutils.credentials.getServiceCredentialsProvider('your-service-credential'),
    region_name='your-aws-region'
)
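If your code relies on the default service credential, it can be useful to fail fast when the environment variable is missing. A minimal sketch (the helper name is hypothetical; the environment variable name is the one set in the cluster configuration above):

```python
import os

def default_service_credential() -> str:
    """Return the compute resource's default service credential name, or raise if unset."""
    name = os.environ.get("DATABRICKS_DEFAULT_SERVICE_CREDENTIAL_NAME")
    if not name:
        raise RuntimeError(
            "DATABRICKS_DEFAULT_SERVICE_CREDENTIAL_NAME is not set on this compute resource"
        )
    return name
```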