AWS Account

To set up Databricks, you must grant Databricks permission to access an AWS account in which it will create and manage compute and VPC resources. Databricks can use either a cross-account role or access keys. This topic describes how to configure Databricks to use either method. For both methods, you configure settings in both the AWS Console and the Databricks Account Console.

Important

  • Although both roles and access keys are supported, Databricks strongly recommends that you use a cross-account role to enable access to your AWS account.
  • You can change the AWS account. However, changing the AWS account causes cluster termination, VPC deletion, and the invalidation of any IAM roles you have set up.
  • Changes to the AWS account, or to the type or configuration of AWS permissions, can result in 2 to 10 minutes of downtime.

Use a cross-account role

This section describes how to configure access to an AWS account using a cross-account role.

Step 1: Configure Databricks to use a cross-account role

  1. As the Databricks account owner, log in to the Account Console.

  2. Click the AWS Account tab.

  3. Select the Deploy to AWS using Cross Account Role radio button.

    ../../_images/iam-role-empty.png
  4. If this is the first time you are configuring the account, in the AWS Region drop-down, select an AWS region.

  5. Copy the External ID.

Step 2: Create a cross-account role and an access policy

  1. In the AWS Console, go to the IAM service.

  2. Click the Roles tab in the sidebar.

  3. Click Create role.

    1. In Select type of trusted entity, click the Another AWS account tile.

      ../../_images/trusted-entity.png
    2. In the Account ID field, enter the Databricks account ID 414351767826.

    3. Select the Require external ID checkbox.

    4. In the External ID field, paste the Databricks External ID you copied in Step 1.

    5. Click the Next: Permissions button.

    6. Click the Next: Tags button.

    7. Click the Next: Review button.

    8. In the Role name field, enter a role name.

      ../../_images/role-name.png
    9. Click Create role. The list of roles displays.

  4. In the list of roles, click the role you created.

  5. Add an inline policy.

    1. On the Permissions tab, click Add inline policy.

      ../../_images/inline-policy.png
    2. In the policy editor, click the JSON tab.

      ../../_images/policy-editor.png
    3. Paste this access policy into the editor:

      {
        "Version": "2012-10-17",
        "Statement": [
          {
            "Sid": "Stmt1403287045000",
            "Effect": "Allow",
            "Action": [
                "ec2:AssociateDhcpOptions",
                "ec2:AssociateIamInstanceProfile",
                "ec2:AssociateRouteTable",
                "ec2:AttachInternetGateway",
                "ec2:AttachVolume",
                "ec2:AuthorizeSecurityGroupEgress",
                "ec2:AuthorizeSecurityGroupIngress",
                "ec2:CancelSpotInstanceRequests",
                "ec2:CreateDhcpOptions",
                "ec2:CreateInternetGateway",
                "ec2:CreateKeyPair",
                "ec2:CreatePlacementGroup",
                "ec2:CreateRoute",
                "ec2:CreateSecurityGroup",
                "ec2:CreateSubnet",
                "ec2:CreateTags",
                "ec2:CreateVolume",
                "ec2:CreateVpc",
                "ec2:CreateVpcPeeringConnection",
                "ec2:DeleteInternetGateway",
                "ec2:DeleteKeyPair",
                "ec2:DeletePlacementGroup",
                "ec2:DeleteRoute",
                "ec2:DeleteRouteTable",
                "ec2:DeleteSecurityGroup",
                "ec2:DeleteSubnet",
                "ec2:DeleteTags",
                "ec2:DeleteVolume",
                "ec2:DeleteVpc",
                "ec2:DescribeAvailabilityZones",
                "ec2:DescribeIamInstanceProfileAssociations",
                "ec2:DescribeInstanceStatus",
                "ec2:DescribeInstances",
                "ec2:DescribePlacementGroups",
                "ec2:DescribePrefixLists",
                "ec2:DescribeReservedInstancesOfferings",
                "ec2:DescribeRouteTables",
                "ec2:DescribeSecurityGroups",
                "ec2:DescribeSpotInstanceRequests",
                "ec2:DescribeSpotPriceHistory",
                "ec2:DescribeSubnets",
                "ec2:DescribeVolumes",
                "ec2:DescribeVpcs",
                "ec2:DetachInternetGateway",
                "ec2:DisassociateIamInstanceProfile",
                "ec2:ModifyVpcAttribute",
                "ec2:ReplaceIamInstanceProfileAssociation",
                "ec2:RequestSpotInstances",
                "ec2:RevokeSecurityGroupEgress",
                "ec2:RevokeSecurityGroupIngress",
                "ec2:RunInstances",
                "ec2:TerminateInstances"
            ],
            "Resource": [
              "*"
            ]
          },
          {
            "Effect": "Allow",
            "Action": [
              "iam:CreateServiceLinkedRole",
              "iam:PutRolePolicy"
            ],
            "Resource": "arn:aws:iam::*:role/aws-service-role/spot.amazonaws.com/AWSServiceRoleForEC2Spot",
            "Condition": {
              "StringLike": {
                "iam:AWSServiceName": "spot.amazonaws.com"
              }
            }
          }
        ]
      }
      
    4. Click Review policy.

    5. In the Name field, enter a policy name.

    6. Click Create policy.

  6. In the role summary, copy the Role ARN.

    ../../_images/role-arn.png
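
For readers who prefer to review the result as JSON, the trust relationship configured by the Another AWS account tile, the Databricks account ID, and the Require external ID checkbox can be sketched as follows. This is a minimal sketch, assuming the standard IAM trust policy shape for a third-party account with an external ID. The function name `build_trust_policy` and the sample external ID are illustrative and are not part of Databricks or AWS.

```python
import json

# Databricks account ID given in the steps above.
DATABRICKS_ACCOUNT_ID = "414351767826"


def build_trust_policy(external_id: str, account_id: str = DATABRICKS_ACCOUNT_ID) -> dict:
    """Return a trust policy that lets `account_id` assume the role,
    but only when it presents the given external ID."""
    if not external_id:
        raise ValueError("external_id must not be empty")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": f"arn:aws:iam::{account_id}:root"},
                "Action": "sts:AssumeRole",
                "Condition": {"StringEquals": {"sts:ExternalId": external_id}},
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder value (assumption): use the External ID copied in Step 1.
    print(json.dumps(build_trust_policy("example-external-id"), indent=2))
```

Comparing this output with the Trust relationships tab of the role you created is one way to confirm that the external ID was pasted correctly.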

Step 3: Configure the cross-account role in your Databricks account

  1. In the Databricks Account Console, return to the AWS Account tab. The process varies depending on whether you are configuring your authentication for the first time or changing the authentication method.

  • First-time configuration

    1. In the Role ARN field, paste the Role ARN you copied in Step 2.

      ../../_images/role-arn-first.png
    2. Click Next Step.

  • Changing the authentication method

    1. In the Role ARN field, paste the Role ARN you copied in Step 2.

      ../../_images/apply-changes.png
    2. Click Apply Changes.

      Warning

      If you provide an AWS account ID other than the one originally used to set up the account, a warning displays about the effect of that change, including cluster termination, VPC deletion, and the invalidation of any IAM roles you have set up. To proceed, click Change AWS Account.
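
Because the Role ARN embeds the AWS account ID, it is possible to check before pasting whether a new ARN would trigger the account-change warning. The following is a rough sketch, assuming the usual IAM role ARN layout (`arn:<partition>:iam::<12-digit account ID>:role/<optional path and name>`). The helper names and the sample ARNs are hypothetical.

```python
import re
from typing import Optional

# IAM role ARN: partition, 12-digit account ID, then an optional path and the role name.
ROLE_ARN_RE = re.compile(
    r"arn:(?P<partition>aws|aws-cn|aws-us-gov):iam::"
    r"(?P<account_id>[0-9]{12}):role/(?P<name>[\w+=,.@/-]+)"
)


def account_id_from_role_arn(role_arn: str) -> Optional[str]:
    """Return the account ID embedded in a role ARN, or None if it is not a role ARN."""
    match = ROLE_ARN_RE.fullmatch(role_arn.strip())
    return match.group("account_id") if match else None


def changes_aws_account(role_arn: str, current_account_id: str) -> bool:
    """Return True if the role lives in a different AWS account than the current one."""
    account_id = account_id_from_role_arn(role_arn)
    if account_id is None:
        raise ValueError(f"not a valid IAM role ARN: {role_arn!r}")
    return account_id != current_account_id


if __name__ == "__main__":
    # Sample values only (assumptions); substitute your own ARN and account ID.
    arn = "arn:aws:iam::123456789012:role/databricks-cross-account"
    print(changes_aws_account(arn, "123456789012"))
```

A `True` result suggests that applying the change would terminate clusters and delete the VPC, as described in the warning.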

Use access keys

This section describes how to configure access to an AWS account using access keys.

Step 1: Create an access policy and a user with access keys

  1. In the AWS Console, go to the IAM service.

  2. Click the Policies tab in the sidebar.

  3. Click Create policy.

    1. In the policy editor, click the JSON tab.

      ../../_images/policy-editor.png
    2. Paste this access policy into the editor:

      {
        "Version": "2012-10-17",
        "Statement": [
          {
            "Sid": "Stmt1403287045000",
            "Effect": "Allow",
            "Action": [
                "ec2:AssociateDhcpOptions",
                "ec2:AssociateIamInstanceProfile",
                "ec2:AssociateRouteTable",
                "ec2:AttachInternetGateway",
                "ec2:AttachVolume",
                "ec2:AuthorizeSecurityGroupEgress",
                "ec2:AuthorizeSecurityGroupIngress",
                "ec2:CancelSpotInstanceRequests",
                "ec2:CreateDhcpOptions",
                "ec2:CreateInternetGateway",
                "ec2:CreateKeyPair",
                "ec2:CreatePlacementGroup",
                "ec2:CreateRoute",
                "ec2:CreateSecurityGroup",
                "ec2:CreateSubnet",
                "ec2:CreateTags",
                "ec2:CreateVolume",
                "ec2:CreateVpc",
                "ec2:CreateVpcPeeringConnection",
                "ec2:DeleteInternetGateway",
                "ec2:DeleteKeyPair",
                "ec2:DeletePlacementGroup",
                "ec2:DeleteRoute",
                "ec2:DeleteRouteTable",
                "ec2:DeleteSecurityGroup",
                "ec2:DeleteSubnet",
                "ec2:DeleteTags",
                "ec2:DeleteVolume",
                "ec2:DeleteVpc",
                "ec2:DescribeAvailabilityZones",
                "ec2:DescribeIamInstanceProfileAssociations",
                "ec2:DescribeInstanceStatus",
                "ec2:DescribeInstances",
                "ec2:DescribePlacementGroups",
                "ec2:DescribePrefixLists",
                "ec2:DescribeReservedInstancesOfferings",
                "ec2:DescribeRouteTables",
                "ec2:DescribeSecurityGroups",
                "ec2:DescribeSpotInstanceRequests",
                "ec2:DescribeSpotPriceHistory",
                "ec2:DescribeSubnets",
                "ec2:DescribeVolumes",
                "ec2:DescribeVpcs",
                "ec2:DetachInternetGateway",
                "ec2:DisassociateIamInstanceProfile",
                "ec2:ModifyVpcAttribute",
                "ec2:ReplaceIamInstanceProfileAssociation",
                "ec2:RequestSpotInstances",
                "ec2:RevokeSecurityGroupEgress",
                "ec2:RevokeSecurityGroupIngress",
                "ec2:RunInstances",
                "ec2:TerminateInstances"
            ],
            "Resource": [
              "*"
            ]
          },
          {
            "Effect": "Allow",
            "Action": [
              "iam:CreateServiceLinkedRole",
              "iam:PutRolePolicy"
            ],
            "Resource": "arn:aws:iam::*:role/aws-service-role/spot.amazonaws.com/AWSServiceRoleForEC2Spot",
            "Condition": {
              "StringLike": {
                "iam:AWSServiceName": "spot.amazonaws.com"
              }
            }
          }
        ]
      }
      
    3. Click Review policy.

    4. In the Name field, enter a policy name.

    5. Click Create policy.

  4. Click the Users tab in the sidebar.

  5. Click Add User.

    1. Enter a user name.

    2. For Access type, select Programmatic access.

      ../../_images/user-access.png
    3. Click Next Permissions.

    4. Select Attach existing policies directly.

      ../../_images/set-user-permissions.png
    5. In the Policy type filter, select Customer managed.

      ../../_images/customer-managed-policy.png
    6. Select the checkbox next to the policy you created.

    7. Click Next Review.

    8. Click Create user.

  6. Click Download .csv to download a CSV file containing the access key ID and secret access key that you need for the next step.

    ../../_images/download-csv.png
  7. Click Close.
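
If you would rather not copy the keys out of the CSV file by hand, the file can be read with a short script. The following is a minimal sketch, assuming the downloaded file has a header row with columns named `Access key ID` and `Secret access key`. Check the header of your own file, because the exact column set may differ. The function name and the sample values are illustrative.

```python
import csv
import io
from typing import Dict, TextIO

# Column names assumed to be present in the downloaded CSV header.
ACCESS_KEY_COLUMN = "Access key ID"
SECRET_KEY_COLUMN = "Secret access key"


def read_access_keys(csv_file: TextIO) -> Dict[str, str]:
    """Read the first data row and return the access key ID and secret access key."""
    reader = csv.DictReader(csv_file)
    row = next(reader, None)
    if row is None:
        raise ValueError("the CSV file has no data row")
    missing = [c for c in (ACCESS_KEY_COLUMN, SECRET_KEY_COLUMN) if not row.get(c)]
    if missing:
        raise ValueError(f"missing or empty column(s): {', '.join(missing)}")
    return {
        "access_key_id": row[ACCESS_KEY_COLUMN].strip(),
        "secret_access_key": row[SECRET_KEY_COLUMN].strip(),
    }


if __name__ == "__main__":
    # Made-up sample content; a real file would be opened with
    # open(path, newline="", encoding="utf-8-sig") to tolerate a byte-order mark.
    sample = io.StringIO(
        "User name,Access key ID,Secret access key\r\n"
        "databricks-user,AKIAEXAMPLEKEYID,exampleSecretValue\r\n"
    )
    keys = read_access_keys(sample)
    # Print only the key ID; avoid echoing the secret to the terminal.
    print(keys["access_key_id"])
```

The secret access key is shown only at creation time, so store the CSV file securely and avoid logging its contents.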

Step 2: Configure access keys in your Databricks account

  1. As the Databricks account owner, log in to the Account Console.
  2. Click the AWS Account tab.
  3. Select the Deploy to AWS using Access Key radio button. The process varies depending on whether you are configuring your authentication for the first time or changing the authentication method.

  • First-time configuration

    1. In the AWS Region drop-down, select an AWS region.

    2. In the AWS Account ID field, enter your AWS account ID. See AWS Account Identifiers for information on how to find your account ID.

    3. Enter your access key ID and secret access key from the CSV file you downloaded in Step 1 into their respective fields.

      ../../_images/aws-key-first.png
    4. Click Next Step.

  • Changing the authentication method

    1. Click the Edit AWS Settings button.

    2. Optionally enter a new AWS account ID.

    3. Enter your access key ID and secret access key from the CSV file you downloaded in Step 1 into their respective fields.

      ../../_images/aws-key.png
    4. Click Apply Changes.

      Warning

      If you provide an AWS account ID other than the one originally used to set up the account, a warning displays about the effect of that change, including cluster termination, VPC deletion, and the invalidation of any IAM roles you have set up. To proceed, click Change AWS Account.
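
AWS account IDs are 12-digit numbers, and the AWS Console often displays them with hyphens. A small check such as the one below can catch a mistyped ID before you enter it in the AWS Account ID field. This is only a sketch. The function name is hypothetical, and whether the Account Console itself accepts hyphens is not stated here, so the sketch removes them.

```python
import re


def normalize_account_id(raw: str) -> str:
    """Strip spaces and hyphens, then require exactly 12 digits."""
    digits = re.sub(r"[\s-]", "", raw)
    if not re.fullmatch(r"[0-9]{12}", digits):
        raise ValueError(f"expected a 12-digit AWS account ID, got {raw!r}")
    return digits


if __name__ == "__main__":
    # Sample value only (assumption); use your own account ID.
    print(normalize_account_id("1234-5678-9012"))
```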

Troubleshooting

  • Check that AWS account billing information is complete. If it is not, Databricks may not be able to validate credentials.
  • To confirm that there are no issues with your AWS account, verify that you can launch an EC2 instance.
  • It can take a minute or two for IAM to propagate the permissions set on the IAM role or user. If validation fails, retry the configuration in the Databricks account.
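
The retry advice in the last item can be expressed as a simple loop. The sketch below assumes you have some callable, here named `validate`, that returns `True` once the credentials are accepted. That callable is hypothetical, because this topic describes validation only through the Account Console. The attempt count and delay are illustrative defaults.

```python
import time
from typing import Callable


def retry_validation(
    validate: Callable[[], bool],
    attempts: int = 5,
    delay_seconds: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call `validate` until it succeeds or `attempts` is exhausted.

    Waits `delay_seconds` between attempts to give IAM time to propagate.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        if validate():
            return True
        if attempt < attempts:
            # Do not sleep after the final failed attempt.
            sleep(delay_seconds)
    return False
```

With the defaults, the loop waits about two minutes in total between attempts, which is in line with the propagation time mentioned above.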

Next step

The first time you set up your account, you must configure your AWS storage. See AWS Storage.