amazon-web-services - ARN role authorization error when running an AWS Glue Studio ETL script
Problem description
py4j.protocol.Py4JJavaError: An error occurred while calling o85.getDynamicFrame.
: java.sql.SQLException: Exception thrown in awaitResult:
at com.databricks.spark.redshift.JDBCWrapper.com$databricks$spark$redshift$JDBCWrapper$$executeInterruptibly(RedshiftJDBCWrapper.scala:133)
at com.databricks.spark.redshift.JDBCWrapper.executeInterruptibly(RedshiftJDBCWrapper.scala:109)
at com.databricks.spark.redshift.RedshiftRelation.buildScan(RedshiftRelation.scala:138)
at org.apache.spark.sql.execution.datasources.DataSourceStrategy$$anonfun$10.apply(DataSourceStrategy.scala:293)
at org.apache.spark.sql.execution.datasources.DataSourceStrategy$$anonfun$10.apply(DataSourceStrategy.scala:293)
at org.apache.spark.sql.execution.datasources.DataSourceStrategy$$anonfun$pruneFilterProject$1.apply(DataSourceStrategy.scala:326)
at org.apache.spark.sql.execution.datasources.DataSourceStrategy$$anonfun$pruneFilterProject$1.apply(DataSourceStrategy.scala:325)
at org.apache.spark.sql.execution.datasources.DataSourceStrategy.pruneFilterProjectRaw(DataSourceStrategy.scala:381)
at org.apache.spark.sql.execution.datasources.DataSourceStrategy.pruneFilterProject(DataSourceStrategy.scala:321)
at org.apache.spark.sql.execution.datasources.DataSourceStrategy.apply(DataSourceStrategy.scala:289)
at org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$1.apply(QueryPlanner.scala:63)
at org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$1.apply(QueryPlanner.scala:63)
at scala.collection.Iterator$$anon$12.nextCur(Iterator.scala:435)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:441)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:440)
at org.apache.spark.sql.catalyst.planning.QueryPlanner.plan(QueryPlanner.scala:93)
at org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$2$$anonfun$apply$2.apply(QueryPlanner.scala:78)
at org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$2$$anonfun$apply$2.apply(QueryPlanner.scala:75)
at scala.collection.TraversableOnce$$anonfun$foldLeft$1.apply(TraversableOnce.scala:157)
at scala.collection.TraversableOnce$$anonfun$foldLeft$1.apply(TraversableOnce.scala:157)
at scala.collection.Iterator$class.foreach(Iterator.scala:891)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1334)
at scala.collection.TraversableOnce$class.foldLeft(TraversableOnce.scala:157)
at scala.collection.AbstractIterator.foldLeft(Iterator.scala:1334)
at org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$2.apply(QueryPlanner.scala:75)
at org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$2.apply(QueryPlanner.scala:67)
at scala.collection.Iterator$$anon$12.nextCur(Iterator.scala:435)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:441)
at org.apache.spark.sql.catalyst.planning.QueryPlanner.plan(QueryPlanner.scala:93)
at org.apache.spark.sql.execution.QueryExecution.sparkPlan$lzycompute(QueryExecution.scala:72)
at org.apache.spark.sql.execution.QueryExecution.sparkPlan(QueryExecution.scala:68)
at org.apache.spark.sql.execution.QueryExecution.executedPlan$lzycompute(QueryExecution.scala:77)
at org.apache.spark.sql.execution.QueryExecution.executedPlan(QueryExecution.scala:77)
at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3359)
at org.apache.spark.sql.Dataset.head(Dataset.scala:2544)
at org.apache.spark.sql.Dataset.take(Dataset.scala:2758)
at com.amazonaws.services.glue.JDBCDataSource.getLastRow(DataSource.scala:944)
at com.amazonaws.services.glue.JDBCDataSource.getJdbcJobBookmark(DataSource.scala:805)
at com.amazonaws.services.glue.JDBCDataSource.getDynamicFrame(DataSource.scala:829)
at com.amazonaws.services.glue.DataSource$class.getDynamicFrame(DataSource.scala:94)
at com.amazonaws.services.glue.SparkSQLDataSource.getDynamicFrame(DataSource.scala:658)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
at py4j.Gateway.invoke(Gateway.java:282)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.GatewayConnection.run(GatewayConnection.java:238)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.sql.SQLException: [Amazon](500310) Invalid operation: Not authorized to get credentials of role arn:aws:iam::**********:role/glue_etl_role
Details:
-----------------------------------------------
error: Not authorized to get credentials of role arn:aws:iam::*********:role/glue_etl_role
code: 30000
context:
query: 0
location: xen_aws_credentials_mgr.cpp:391
process: padbmaster
We are running an AWS Glue Studio script that performs some join and rename operations. Both the source connection and the target are Redshift tables registered in the AWS Glue Data Catalog.
The initial error was that no IAM role had been attached to the Redshift cluster we created. After attaching an IAM role, we got this new error saying "Not authorized to get credentials".
Solution
An AWS Glue job needs an IAM role with permission to access its data stores. Make sure this role can access your Amazon S3 sources and targets, the temporary directory, the script location, and any libraries the job uses.
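As a rough illustration, the Glue job role could carry an inline policy like the one below for the temporary directory and script bucket (the bucket name is a placeholder; in practice this is usually combined with the AWSGlueServiceRole managed policy):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:PutObject", "s3:ListBucket"],
      "Resource": [
        "arn:aws:s3:::my-glue-temp-bucket",
        "arn:aws:s3:::my-glue-temp-bucket/*"
      ]
    }
  ]
}
```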
A second, separate IAM role is associated with the Redshift cluster itself, so the cluster can access other AWS services on your behalf (for example, if your table is backed by an S3 bucket, you must grant that role AmazonS3ReadOnlyAccess).
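The cluster-associated role also needs a trust policy that lets Redshift assume it. A minimal sketch:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Service": "redshift.amazonaws.com" },
      "Action": "sts:AssumeRole"
    }
  ]
}
```

The role can then be attached to the cluster from the console, or with `aws redshift modify-cluster-iam-roles --cluster-identifier <cluster> --add-iam-roles <role-arn>`.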
Make sure you do not mix up these two roles. The error "Not authorized to get credentials of role arn:aws:iam::...:role/glue_etl_role" means Redshift is being asked to assume a role that is not associated with the cluster — here, the Glue job role was passed where the cluster role was expected.
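To make the distinction concrete, here is a minimal sketch of how the cluster-associated role (not the Glue job role) would be referenced from the Glue script's Redshift connection options. All names, ARNs, and endpoints below are placeholders, not values from the original question:

```python
# Hypothetical connection options for a Glue Redshift read.
# "aws_iam_role" must be the ARN of the role ASSOCIATED WITH THE CLUSTER;
# passing the Glue job role here triggers "Not authorized to get credentials".
account_id = "123456789012"  # placeholder account ID
cluster_role_arn = f"arn:aws:iam::{account_id}:role/redshift_cluster_role"

connection_options = {
    "url": "jdbc:redshift://example-cluster.abc123.us-east-1.redshift.amazonaws.com:5439/dev",
    "dbtable": "public.my_table",
    # The Glue JOB role (a different role) must be able to read/write here:
    "redshiftTmpDir": "s3://my-glue-temp-bucket/temp/",
    # The CLUSTER role, used by Redshift for UNLOAD/COPY to the temp dir:
    "aws_iam_role": cluster_role_arn,
}

# In the Glue script this would then be used roughly as:
# frame = glueContext.create_dynamic_frame.from_options(
#     connection_type="redshift", connection_options=connection_options)
```

The design point is that two principals act here: the Glue job assumes the job role to drive the pipeline, while Redshift itself assumes the cluster role to stage data in S3.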