DataflowRunner pipeline error - Unable to rename

Problem description

My Dataflow job reads a CSV file from a GCS bucket, queries another service for extra data, writes the result to a new CSV file, and stores it back in the bucket. However, it seems to fail before it even reads the input CSV file at the start.

This is the error I get: DataflowRuntimeException - Dataflow pipeline failed. State: FAILED, Error: Unable to rename "gs://../../job.1582402027.233469/dax-tmp-2020-02-22_12_07_49-5033316469851820576-S04-0-1719661b275ca435/tmp-1719661b275ca2ea-shard--try-273280d77b2c5b79-endshard.avro" to "gs://../../temp/job.1582402027.233469/tmp-1719661b275ca2ea-00000-of-00001.avro".

Any idea what causes this error?

Tags: google-cloud-dataflow

Solution


This error usually occurs because the service account used by your Dataflow job does not have the required GCS (Google Cloud Storage) permissions.

You should grant the service account a role such as "roles/storage.objectAdmin" so that it can interact with GCS.
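
To confirm whether permissions are really the problem, a rough check along the following lines may help. It is only a sketch under a few assumptions: that the google-cloud-storage Python package is installed, that the script runs with the same service account credentials the Dataflow workers use, and that the permissions listed are the relevant ones (GCS has no native rename, so a rename is a copy followed by a delete). The names REQUIRED_PERMISSIONS, missing_permissions and check_bucket are made up for this example.

```python
# Permissions assumed to be needed for a GCS "rename" (copy, then delete).
# This list is an assumption for illustration, not taken from the error message.
REQUIRED_PERMISSIONS = (
    "storage.objects.get",
    "storage.objects.create",
    "storage.objects.delete",
    "storage.objects.list",
)


def missing_permissions(granted):
    """Return the required permissions that are absent from `granted`."""
    granted_set = set(granted)
    return [p for p in REQUIRED_PERMISSIONS if p not in granted_set]


def check_bucket(bucket_name):
    """Ask GCS which required permissions the current credentials hold.

    Needs the google-cloud-storage package and network access. The result
    reflects whichever credentials the client picks up, so run it as the
    Dataflow service account to get a meaningful answer.
    """
    # Imported lazily so missing_permissions() works without the package.
    from google.cloud import storage

    bucket = storage.Client().bucket(bucket_name)
    granted = bucket.test_iam_permissions(list(REQUIRED_PERMISSIONS))
    return missing_permissions(granted)
```

Calling check_bucket with the name of the bucket that holds the temp location should return an empty list once a role like "roles/storage.objectAdmin" has been granted. Any permission names it does return are the ones the service account still lacks.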

