Cross-account IAM role gives Service: Amazon S3; Status Code: 403; Error Code: AccessDenied



Hi, I am trying to move a file across accounts, from a bucket in account A to a bucket in account B, and I get the following error:

An error occurred while calling o88.parquet. dt/output1/parquet/_temporary/0/: PUT 0-byte object on dt/output1/parquet/_temporary/0/: com.amazonaws.services.s3.model.AmazonS3Exception: Access Denied (Service: Amazon S3; Status Code: 403; Error Code: AccessDenied; Request ID: F99P5W0C8Q28BJ4R; S3 Extended Request ID: VpFGWR9JR7r2yae9v8ezB7HAgJu0uuwn4v3mBAG8CaaJ2q0+sOVFGdxsZ1GzMXhAifSCtdxJ0OM=; Proxy: null), S3 Extended Request ID: VpFGWR9JR7r2yae9v8ezB7HAgJu0uuwn4v3mBAG8CaaJ2q0+sOVFGdxsZ1GzMXhAifSCtdxJ0OM=:AccessDenied

I have the following setup on my end.

Account A has the role cross-accountA-sample-role with the following policy:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": [
        "s3:ListAllMyBuckets"
      ],
      "Effect": "Allow",
      "Resource": [
        "arn:aws:s3:::*"
      ]
    },
    {
      "Action": [
        "s3:ListBucket",
        "s3:GetBucketLocation"
      ],
      "Effect": "Allow",
      "Resource": "arn:aws:s3:::my-bucket"
    },
    {
      "Effect": "Allow",
      "Action": [
        "s3:Get*",
        "s3:Put*",
        "s3:List*"
      ],
      "Resource": "arn:aws:s3:::my-bucket/*"
    }
  ]
}

Account A role trust relationship:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "s3.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    },
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::{accountBId}:role/{accountBrole}"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}

Account B cross-account role:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "VisualEditor0",
      "Effect": "Allow",
      "Action": "sts:AssumeRole",
      "Resource": "arn:aws:iam::{accountAId}:role/{accountArole}"
    }
  ]
}
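
Cross-account role assumption needs two matching documents: a trust policy on the role in account A that names the role in account B as a principal, and a permission policy on the role in account B that allows `sts:AssumeRole` on the role in account A. The sketch below only builds those two documents as Python dictionaries so the ARNs can be checked against each other. The function names, account IDs, and role names are hypothetical placeholders, and it leaves out the `s3.amazonaws.com` service principal from the trust relationship above.

```python
import json


def role_arn(account_id, role_name):
    """Build an IAM role ARN from an account ID and a role name."""
    return f"arn:aws:iam::{account_id}:role/{role_name}"


def trust_policy(trusted_account_id, trusted_role_name):
    """Trust policy for the role in account A: who may assume it."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": role_arn(trusted_account_id, trusted_role_name)},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def assume_role_policy(target_account_id, target_role_name):
    """Permission policy for the role in account B: which role it may assume."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "sts:AssumeRole",
                "Resource": role_arn(target_account_id, target_role_name),
            }
        ],
    }


# Hypothetical account IDs and role names, for illustration only.
ACCOUNT_A, ROLE_A = "111111111111", "cross-accountA-sample-role"
ACCOUNT_B, ROLE_B = "222222222222", "glue-job-role"

print(json.dumps(trust_policy(ACCOUNT_B, ROLE_B), indent=2))
print(json.dumps(assume_role_policy(ACCOUNT_A, ROLE_A), indent=2))
```

Generating both documents from the same inputs makes it harder for the two ARNs to drift apart, which is an easy mistake when the policies are edited by hand in two different accounts.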

Edit: account B role

Attached policies:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:*",
        "s3-object-lambda:*"
      ],
      "Resource": "*"
    }
  ]
}

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "glue:*",
        "s3:GetBucketLocation",
        "s3:ListBucket",
        "s3:ListAllMyBuckets",
        "s3:GetBucketAcl",
        "ec2:DescribeVpcEndpoints",
        "ec2:DescribeRouteTables",
        "ec2:CreateNetworkInterface",
        "ec2:DeleteNetworkInterface",
        "ec2:DescribeNetworkInterfaces",
        "ec2:DescribeSecurityGroups",
        "ec2:DescribeSubnets",
        "ec2:DescribeVpcAttribute",
        "iam:ListRolePolicies",
        "iam:GetRole",
        "iam:GetRolePolicy",
        "cloudwatch:PutMetricData"
      ],
      "Resource": [
        "*"
      ]
    },
    {
      "Effect": "Allow",
      "Action": [
        "s3:CreateBucket"
      ],
      "Resource": [
        "arn:aws:s3:::aws-glue-*"
      ]
    },
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:DeleteObject"
      ],
      "Resource": [
        "arn:aws:s3:::aws-glue-*/*",
        "arn:aws:s3:::*/*aws-glue-*/*"
      ]
    },
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject"
      ],
      "Resource": [
        "arn:aws:s3:::crawler-public*",
        "arn:aws:s3:::aws-glue-*"
      ]
    },
    {
      "Effect": "Allow",
      "Action": [
        "logs:CreateLogGroup",
        "logs:CreateLogStream",
        "logs:PutLogEvents"
      ],
      "Resource": [
        "arn:aws:logs:*:*:/aws-glue/*"
      ]
    },
    {
      "Effect": "Allow",
      "Action": [
        "ec2:CreateTags",
        "ec2:DeleteTags"
      ],
      "Condition": {
        "ForAllValues:StringEquals": {
          "aws:TagKeys": [
            "aws-glue-service-resource"
          ]
        }
      },
      "Resource": [
        "arn:aws:ec2:*:*:network-interface/*",
        "arn:aws:ec2:*:*:security-group/*",
        "arn:aws:ec2:*:*:instance/*"
      ]
    }
  ]
}

This grants far more access than is needed, but I am not concerned about that at this point.

Here is what I found. It may not be an ideal solution, but it worked for me.

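In my opinion this is a gotcha that AWS does not explain well, or if it does, I am not aware of where. What worked for me was creating a policy that allows assuming the role from account A; you can see that inline policy below. My understanding had been that once the role attached to my Glue job was allowed to assume the role from account A, nothing more was needed: some internal API call would access the bucket on my behalf.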

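What I actually needed to do in addition was call STS AssumeRole from my own code. That call returns temporary credentials, and with those credentials I had to update the underlying Hadoop configuration. Note: as of now, if the Glue job role (in account B) does not have permission to call sts:AssumeRole, the Glue job will fail. Thanks to this article on the STS AssumeRole API, I was able to get cross-account S3 access working. I hope this saves someone some time.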

import sys
from awsglue.transforms import *
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext
from awsglue.context import GlueContext
from awsglue.job import Job
import boto3

# Explicitly assume the role in account A; the Glue job role (account B)
# must be allowed to call sts:AssumeRole on it.
sts_connection = boto3.client('sts')
response = sts_connection.assume_role(
    RoleArn='arn:aws:iam::account_id_here:role/my_role_assumed_from_accountA',
    RoleSessionName='GlueTenantASession',
    DurationSeconds=3600,
)
credentials = response['Credentials']

args = getResolvedOptions(sys.argv, ["JOB_NAME"])
sc = SparkContext()

# Pass the temporary credentials to the s3a connector via the Hadoop configuration.
hadoop_conf = sc._jsc.hadoopConfiguration()
hadoop_conf.set('fs.s3a.aws.credentials.provider', 'org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider')
hadoop_conf.set('fs.s3a.access.key', credentials['AccessKeyId'])
hadoop_conf.set('fs.s3a.secret.key', credentials['SecretAccessKey'])
hadoop_conf.set('fs.s3a.session.token', credentials['SessionToken'])

glueContext = GlueContext(sc)
spark = glueContext.spark_session

job = Job(glueContext)
job.init(args["JOB_NAME"], args)

# Script generated for node S3 bucket
# Note the s3a:// scheme, so the read goes through the connector configured above.
data_frame = glueContext.create_dynamic_frame.from_options(
    format_options={"multiline": False},
    connection_type="s3",
    format="parquet",
    connection_options={"paths": ["s3a://path_to_bucket_in_other_account"]},
    transformation_ctx="S3bucket_node1",
)
data_frame.show()
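
The credential wiring in the script can be pulled into small helpers, which also makes it easy to check how long the temporary credentials have left before starting a long read or write. This is a minimal sketch rather than part of the original job: the helper names are hypothetical, and it assumes the `Credentials` dictionary has the shape returned by boto3's `assume_role` (with `Expiration` as a timezone-aware `datetime`).

```python
from datetime import datetime, timezone

S3A_TEMPORARY_PROVIDER = "org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider"


def s3a_settings(credentials):
    """Map an STS 'Credentials' dict to the s3a Hadoop properties set in the script."""
    return {
        "fs.s3a.aws.credentials.provider": S3A_TEMPORARY_PROVIDER,
        "fs.s3a.access.key": credentials["AccessKeyId"],
        "fs.s3a.secret.key": credentials["SecretAccessKey"],
        "fs.s3a.session.token": credentials["SessionToken"],
    }


def seconds_until_expiry(credentials, now=None):
    """Seconds left before the temporary credentials expire."""
    now = now or datetime.now(timezone.utc)
    return (credentials["Expiration"] - now).total_seconds()


def apply_s3a_settings(hadoop_conf, settings):
    """Apply the settings to any object with a set(key, value) method."""
    for key, value in settings.items():
        hadoop_conf.set(key, value)
```

In the Glue script this would be used as `apply_s3a_settings(sc._jsc.hadoopConfiguration(), s3a_settings(credentials))`. Since the script requests `DurationSeconds=3600`, a job that runs longer than an hour would need to assume the role again and reapply the settings.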

My inline policy for assuming the role from account A:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "sts:AssumeRole",
      "Resource": "arn:aws:iam::account_id:role/my_role_assumed_from_accountA"
    }
  ]
}
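
Because the job fails when the Glue job role lacks `sts:AssumeRole`, it can help to sanity-check a policy document before deploying it. The sketch below is a deliberately simplified matcher, not IAM's real evaluation logic: it only looks at `Allow` statements with `Action` and `Resource` wildcards, and ignores explicit denies, conditions, `NotAction`, permission boundaries, and service control policies. The function names and the account ID are hypothetical.

```python
from fnmatch import fnmatchcase


def _as_list(value):
    """IAM allows a single string or a list; normalise to a list."""
    return value if isinstance(value, list) else [value]


def statement_allows(statement, action, resource):
    """True if an Allow statement matches the action and resource by wildcard."""
    if statement.get("Effect") != "Allow":
        return False
    # Action names are matched case-insensitively, resource ARNs case-sensitively.
    action_ok = any(
        fnmatchcase(action.lower(), pattern.lower())
        for pattern in _as_list(statement.get("Action", []))
    )
    resource_ok = any(
        fnmatchcase(resource, pattern)
        for pattern in _as_list(statement.get("Resource", []))
    )
    return action_ok and resource_ok


def policy_allows(policy, action, resource):
    """True if any statement in the policy allows the action on the resource."""
    return any(
        statement_allows(statement, action, resource)
        for statement in _as_list(policy["Statement"])
    )


# Hypothetical account ID; the role name is the one used in the policy above.
ROLE_A_ARN = "arn:aws:iam::111111111111:role/my_role_assumed_from_accountA"
inline_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": "sts:AssumeRole", "Resource": ROLE_A_ARN}
    ],
}

print(policy_allows(inline_policy, "sts:AssumeRole", ROLE_A_ARN))  # True
print(policy_allows(inline_policy, "s3:PutObject", "arn:aws:s3:::my-bucket/key"))  # False
```

For an authoritative answer, the IAM policy simulator or an actual `assume_role` call from the job role is the better check; this only catches obvious mismatches such as a wrong account ID or role name in the `Resource` ARN.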
