Secure Access to Kinesis Across Accounts Using IAM Roles with an AssumeRole Policy

In AWS you can set up cross-account access so that compute resources in one account can access AWS services in another account. One way to grant access, described in Secure Access to S3 Buckets Using IAM Roles, is to grant an account direct access to services in another account. Another way is to allow an account to assume a role in another account.

Consider AWS Account A with account number 111122223333 and AWS Account B with account number 444455556666. Account A is used when signing up with Databricks, so EC2 services are managed by this account. Account B is where Kinesis is running.

This topic describes how to configure Account A to use the AWS AssumeRole action to access Kinesis in Account B by assuming a role in Account B. To enable this access, you perform configuration in Account A and Account B, in the Databricks Admin Console, when you configure a Databricks cluster, and when you run a notebook that accesses Kinesis.

Requirements

AWS administrator access to IAM roles and policies in the AWS account of the Databricks deployment and the AWS account of the Kinesis service.

Step 1: Set up cross-account role in Kinesis account

  1. In your Kinesis AWS Account, go to the IAM service and click the Roles tab.

  2. Click Create role. In the Select type of trusted entity panel, click Another AWS Account. Paste in the Account ID for your Databricks AWS account, 111122223333. Optionally, you can specify an External ID.

  3. Click Next: Permissions and give this role permission to access Kinesis. You can provide your own JSON or use the AmazonKinesisFullAccess policy.

  4. Click Next: Review and give the role a name, for example KinesisCrossAccountRole.

  5. Click Create role. The list of roles displays.

  6. In the Roles list, click KinesisCrossAccountRole and verify that the trust relationship contains a JSON policy like the following:

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Principal": {
            "AWS": [
              "arn:aws:iam::111122223333:root"
            ],
            "Service": "ec2.amazonaws.com"
          },
          "Action": "sts:AssumeRole"
        }
      ]
    }
    
  7. Copy the role ARN, for example: arn:aws:iam::444455556666:role/KinesisCrossAccountRole.
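
If you prefer to provide your own JSON for the trust relationship, the policy from Step 1 can be generated rather than typed by hand. The following is a minimal sketch, not part of the original procedure: the helper name build_trust_policy is hypothetical, and the sketch lists only the account principal (it omits the ec2.amazonaws.com service principal shown above). When an External ID is supplied, it adds the standard sts:ExternalId condition.

```python
import json


def build_trust_policy(trusted_account_id, external_id=None):
    """Return a trust policy that lets `trusted_account_id` assume the role.

    Hypothetical helper for illustration; adjust the principal to your setup.
    """
    statement = {
        "Effect": "Allow",
        "Principal": {"AWS": [f"arn:aws:iam::{trusted_account_id}:root"]},
        "Action": "sts:AssumeRole",
    }
    if external_id is not None:
        # Callers must then present this External ID when assuming the role.
        statement["Condition"] = {
            "StringEquals": {"sts:ExternalId": external_id}
        }
    return {"Version": "2012-10-17", "Statement": [statement]}


# Account A (the Databricks deployment account) is the trusted account.
trust_policy = build_trust_policy("111122223333")
print(json.dumps(trust_policy, indent=2))
```

Passing a second argument, for example build_trust_policy("111122223333", "myExternalCode"), yields the same policy with the External ID condition attached.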

Step 2: Set up Assume Role in Databricks deployment account

  1. In your Databricks deployment AWS account, go to the IAM service and click the Roles tab.

  2. Click Create role. In the Select type of trusted entity panel, click AWS service and click the EC2 service.

  3. Click Next: Permissions.

  4. Click Next: Review and give the role a name, for example DatabricksToKinesisAssumeRole.

  5. Click Create role. The list of roles displays.

  6. In the Roles list, click DatabricksToKinesisAssumeRole.

  7. In the Permissions tab, click Inline policy.

  8. Click the JSON tab.

  9. Copy this policy and paste in the role ARN of your KinesisCrossAccountRole from Step 1 in the Resource field:

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Sid": "Stmt1487884001000",
          "Effect": "Allow",
          "Action": [
            "sts:AssumeRole"
          ],
          "Resource": [
            "arn:aws:iam::444455556666:role/KinesisCrossAccountRole"
          ]
        }
      ]
    }
    
  10. Click Review policy.

  11. In the Name field, type a policy name, for example DatabricksToKinesisAssumeRole.

  12. Click Create policy.

  13. Select DatabricksToKinesisAssumeRole. Save the instance profile ARN for use in Step 3, and save the role ARN for use in the next step.

  14. Update the policy for the ManagerRole in this account to add the iam:PassRole action. The Resource for iam:PassRole should be the role ARN of the DatabricksToKinesisAssumeRole that you created in this step. After saving, you should have a policy like this:

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Sid": "Stmt1403287045000",
          "Effect": "Allow",
          "Action": [
            "ec2:AssociateDhcpOptions",
            "ec2:AssociateRouteTable",
            "ec2:AttachInternetGateway",
            "ec2:AttachVolume"
          ],
          "Resource": [
            "*"
          ]
        },
        {
          "Effect": "Allow",
          "Action": "iam:PassRole",
          "Resource": [
            "arn:aws:iam::111122223333:role/DatabricksToKinesisAssumeRole"
          ]
        }
        }
      ]
    }
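
The two statements added in Step 2 point at different accounts: sts:AssumeRole targets the role in the Kinesis account, while iam:PassRole targets the role in the Databricks deployment account. The sketch below builds both statements and checks that pairing. The helper names allow_statement and account_id_of are hypothetical and exist only for this illustration.

```python
import json

# ARNs from this topic's example accounts.
KINESIS_ROLE_ARN = "arn:aws:iam::444455556666:role/KinesisCrossAccountRole"
DATABRICKS_ROLE_ARN = "arn:aws:iam::111122223333:role/DatabricksToKinesisAssumeRole"


def allow_statement(action, resource_arn):
    """Build a single Allow statement for one action on one resource."""
    return {"Effect": "Allow", "Action": [action], "Resource": [resource_arn]}


def account_id_of(arn):
    """Extract the account ID, the fifth colon-separated field of an ARN."""
    return arn.split(":")[4]


# Inline policy for DatabricksToKinesisAssumeRole (Account A -> role in Account B).
assume_role_policy = {
    "Version": "2012-10-17",
    "Statement": [allow_statement("sts:AssumeRole", KINESIS_ROLE_ARN)],
}

# Statement to append to the ManagerRole policy (role in Account A).
pass_role_statement = allow_statement("iam:PassRole", DATABRICKS_ROLE_ARN)

# The assumed role and the passed role should live in different accounts.
if account_id_of(KINESIS_ROLE_ARN) == account_id_of(DATABRICKS_ROLE_ARN):
    raise ValueError("expected the two roles to be in different accounts")

print(json.dumps(assume_role_policy, indent=2))
print(json.dumps(pass_role_statement, indent=2))
```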
    

Step 3: Add the role DatabricksToKinesisAssumeRole to Databricks

  1. In the Databricks Admin Console, go to the IAM roles tab.
  2. Add the IAM role DatabricksToKinesisAssumeRole using the instance profile ARN you saved in Step 2, arn:aws:iam::111122223333:instance-profile/DatabricksToKinesisAssumeRole.
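
Step 3 expects the instance profile ARN, not the role ARN, and the two differ only in the resource type segment. As a rough sanity check before pasting a value into the Admin Console, you could inspect that segment. This is a sketch under the assumption that the value is a standard IAM ARN; arn_resource_type is a hypothetical helper name.

```python
def arn_resource_type(arn):
    """Return the IAM resource type in an ARN, e.g. 'role' or 'instance-profile'."""
    # IAM ARNs have the form arn:partition:iam::account-id:resource-type/name
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[2] != "iam":
        raise ValueError(f"not an IAM ARN: {arn!r}")
    return parts[5].split("/", 1)[0]


instance_profile_arn = (
    "arn:aws:iam::111122223333:instance-profile/DatabricksToKinesisAssumeRole"
)
role_arn = "arn:aws:iam::111122223333:role/DatabricksToKinesisAssumeRole"

# Only the first value is appropriate for Step 3.
print(arn_resource_type(instance_profile_arn))
print(arn_resource_type(role_arn))
```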

Step 4: Create a cluster with IAM role

  1. In the Databricks UI of the same workspace, create a cluster.
  2. On the Instances tab, select the IAM role you added in Step 3, DatabricksToKinesisAssumeRole.
  3. Start the cluster.

Step 5: Validate connection to Kinesis

  1. Create a notebook and attach it to the cluster you created in Step 4.

  2. Spark’s readStream needs the roleArn option to specify the role to assume (KinesisCrossAccountRole) using the IAM role attached to your cluster (DatabricksToKinesisAssumeRole). Paste the following code to test the connection to your Kinesis stream:

    kinesis = spark.readStream \
      .format("kinesis") \
      .option("streamName", "testStream") \
      .option("stsEndpoint", "sts.us-east-1.amazonaws.com") \
      .option("region", "us-east-1") \
      .option("roleArn", "arn:aws:iam::444455556666:role/KinesisCrossAccountRole") \
      .option("initialPosition", "earliest") \
      .load()
    
    display(kinesis)
    

    The code sample uses Python and assumes the Kinesis stream’s name is testStream in AWS, that it resides in the us-east-1 region, and that the STS endpoint is sts.us-east-1.amazonaws.com. If you created KinesisCrossAccountRole with an External ID, add the following option:

    .option("roleExternalId", "myExternalCode")
    
  3. Check for a valid result like the following:

    [Image: iam-kinesis.png]
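
The reader options from Step 5, including the optional External ID, can also be collected in a dictionary and passed to the reader in one call. The following sketch is one way to do that; kinesis_reader_options is a hypothetical helper, and deriving the STS endpoint as sts.<region>.amazonaws.com is an assumption that matches the us-east-1 example above but may not hold in every AWS partition.

```python
def kinesis_reader_options(stream_name, region, role_arn, external_id=None):
    """Collect the readStream options used in Step 5 into a dict."""
    options = {
        "streamName": stream_name,
        "region": region,
        # Assumption: regional STS endpoint of the form used in this topic.
        "stsEndpoint": f"sts.{region}.amazonaws.com",
        "roleArn": role_arn,
        "initialPosition": "earliest",
    }
    if external_id is not None:
        # Only needed if KinesisCrossAccountRole was created with an External ID.
        options["roleExternalId"] = external_id
    return options


options = kinesis_reader_options(
    "testStream",
    "us-east-1",
    "arn:aws:iam::444455556666:role/KinesisCrossAccountRole",
)
print(options)

# In a Databricks notebook, where `spark` and `display` are predefined:
# kinesis = spark.readStream.format("kinesis").options(**options).load()
# display(kinesis)
```

The commented lines show how the dictionary would be used in a notebook; they are left commented so the sketch also runs outside Databricks.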