What policy is needed when using hadoop commands with S3

Question

I want to copy data from an on-premises cluster to S3.

I tried the following command to do this.

hadoop fs -Dfs.s3a.access.key=******* -Dfs.s3a.secret.key=******* -cp -f hdfs://on-pre/cluster/mydata/dt=20200601/ s3a://some-bucket/somewhere/

When I run this command, I get the following error (all paths are fake):

cp: s3a://some-bucket/somewhere/dt=20200601/000000_0.gz: getFileStatus on s3a://some-bucket/somewhere/dt=20200601/000000_0.gz: com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: Amazon S3; Status Code: 403; Error Code: 403 Forbidden; Request ID: xxxxxxxxxxxxx), S3 Extended Request ID: xxxxxxxxxxxxxxxxxxxxxxx

I have set the following S3 bucket policy.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::9999999999:user/john"
            },
            "Action": "s3:*",
            "Resource": "arn:aws:s3:::some-bucket/somewhere/*"
        },
        {
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::9999999999:user/john"
            },
            "Action": "s3:List*",
            "Resource": "arn:aws:s3:::some-bucket",
            "Condition": {
                "StringLike": {
                    "s3:prefix": [
                        "somewhere/*"
                    ]
                }
            }
        }
    ]
}

What S3 policy should I set in order to use hadoop fs -cp?

Tags: hadoop, amazon-s3, amazon-iam

Answer


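The subdirectory itself also needs ListBucket: the s3:prefix condition has to match "somewhere" as well as "somewhere/*".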
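To see why the extra "somewhere" entry matters, here is a minimal Python sketch that approximates the StringLike wildcard match with fnmatch. The helper name prefix_allowed and the sample prefixes are made up for illustration; the thread does not show which list prefixes the S3A connector actually sends, and fnmatch is only a stand-in for the IAM matcher (it also treats [ ] specially, which IAM does not).

```python
from fnmatch import fnmatchcase


def prefix_allowed(prefix, patterns):
    """Return True if the requested list prefix matches any pattern.

    Approximates an IAM StringLike condition, where * matches any run of
    characters (including none) and ? matches a single character.
    """
    return any(fnmatchcase(prefix, pattern) for pattern in patterns)


# s3:prefix values from the question's policy and from the policy that worked.
original_patterns = ["somewhere/*"]
fixed_patterns = ["somewhere", "somewhere/*"]

# Hypothetical list prefixes, chosen only to illustrate the matching.
for prefix in ["somewhere", "somewhere/", "somewhere/dt=20200601/"]:
    print(prefix,
          prefix_allowed(prefix, original_patterns),
          prefix_allowed(prefix, fixed_patterns))
# somewhere False True
# somewhere/ True True
# somewhere/dt=20200601/ True True
```

Under this approximation, a list request for the bare prefix "somewhere" is the only case the original condition rejects, which is consistent with the fix of adding "somewhere" to the s3:prefix list.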

As noted in the comments, the required permissions may change over time.

That said, here is the policy that solved it, for your reference:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::9999999999:user/john"
            },
            "Action": [
                "s3:PutObject",
                "s3:GetObject",
                "s3:DeleteObject",
                "s3:AbortMultipartUpload"
            ],
            "Resource": [
                "arn:aws:s3:::some-bucket/somewhere/*"
            ]
        },
        {
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::9999999999:user/john"
            },
            "Action": [
                "s3:ListBucket"
            ],
            "Resource": [
                "arn:aws:s3:::some-bucket"
            ],
            "Condition": {
                "StringLike": {
                    "s3:prefix": [
                        "somewhere",
                        "somewhere/*"
                    ]
                }
            }
        },
        {
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::9999999999:user/john"
            },
            "Action": [
                "s3:GetBucketLocation"
            ],
            "Resource": [
                "arn:aws:s3:::some-bucket"
            ]
        }
    ]
}

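Note: this IAM statement worked as of June 5, 2020 with Hadoop 3.2.1 or earlier. Future Hadoop releases may change the rules, as Hadoop or AWS changes the features of the connector or of S3, respectively.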
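If the same policy is needed for several buckets or prefixes, it can be generated rather than edited by hand. The following is a minimal sketch, not an official tool: the function name build_bucket_policy and its parameters are hypothetical, it only builds the JSON document and never calls AWS, and the set of actions is simply copied from the policy above.

```python
import json


def build_bucket_policy(bucket, prefix, principal_arn):
    """Build a bucket policy with the same shape as the one in the answer.

    bucket, prefix and principal_arn are illustrative inputs; prefix is
    expected without a trailing slash (for example "somewhere").
    """
    principal = {"AWS": principal_arn}
    bucket_arn = f"arn:aws:s3:::{bucket}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Object-level access for everything under the prefix.
                "Effect": "Allow",
                "Principal": principal,
                "Action": [
                    "s3:PutObject",
                    "s3:GetObject",
                    "s3:DeleteObject",
                    "s3:AbortMultipartUpload",
                ],
                "Resource": [f"{bucket_arn}/{prefix}/*"],
            },
            {
                # Listing, limited to the prefix itself and anything below it.
                "Effect": "Allow",
                "Principal": principal,
                "Action": ["s3:ListBucket"],
                "Resource": [bucket_arn],
                "Condition": {
                    "StringLike": {"s3:prefix": [prefix, f"{prefix}/*"]}
                },
            },
            {
                # Bucket location lookup, as in the policy above.
                "Effect": "Allow",
                "Principal": principal,
                "Action": ["s3:GetBucketLocation"],
                "Resource": [bucket_arn],
            },
        ],
    }


if __name__ == "__main__":
    policy = build_bucket_policy(
        "some-bucket", "somewhere", "arn:aws:iam::9999999999:user/john"
    )
    print(json.dumps(policy, indent=4))
```

Run as a script, this prints a policy equivalent to the one above for the fake bucket and user from the question; the output would still need to be reviewed before being attached to a real bucket.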

