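python-3.x - Dataflow job that launches a BigQuery job fails intermittently with error "errors": [ { "message": "Already Exists: Job
Problem description
I have a Google Cloud Dataflow job (using the Apache Beam Python SDK) scheduled to run every 6 minutes. Internally it reads from a BigQuery table, applies some transformations, and writes to another BigQuery table. The job has started failing intermittently (roughly 4 runs out of 10) with the following error trace.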
2021-02-17 14:51:18.146 IST Error message from worker: Traceback (most recent call last):
File "/usr/local/lib/python3.8/site-packages/dataflow_worker/batchworker.py", line 649, in do_work
work_executor.execute()
File "/usr/local/lib/python3.8/site-packages/dataflow_worker/executor.py", line 225, in execute
self.response = self._perform_source_split_considering_api_limits(
File "/usr/local/lib/python3.8/site-packages/dataflow_worker/executor.py", line 233, in _perform_source_split_considering_api_limits
split_response = self._perform_source_split(source_operation_split_task,
File "/usr/local/lib/python3.8/site-packages/dataflow_worker/executor.py", line 271, in _perform_source_split
for split in source.split(desired_bundle_size):
File "/usr/local/lib/python3.8/site-packages/apache_beam/io/gcp/bigquery.py", line 807, in split
self.table_reference = self._execute_query(bq)
File "/usr/local/lib/python3.8/site-packages/apache_beam/options/value_provider.py", line 135, in _f
return fnc(self, *args, **kwargs)
File "/usr/local/lib/python3.8/site-packages/apache_beam/io/gcp/bigquery.py", line 851, in _execute_query
job = bq._start_query_job(
File "/usr/local/lib/python3.8/site-packages/apache_beam/utils/retry.py", line 236, in wrapper
return fun(*args, **kwargs)
File "/usr/local/lib/python3.8/site-packages/apache_beam/io/gcp/bigquery_tools.py", line 459, in _start_query_job
response = self.client.jobs.Insert(request)
File "/usr/local/lib/python3.8/site-packages/apache_beam/io/gcp/internal/clients/bigquery/bigquery_v2_client.py", line 344, in Insert
return self._RunMethod(
File "/usr/local/lib/python3.8/site-packages/apitools/base/py/base_api.py", line 731, in _RunMethod
return self.ProcessHttpResponse(method_config, http_response, request)
File "/usr/local/lib/python3.8/site-packages/apitools/base/py/base_api.py", line 737, in ProcessHttpResponse
self.__ProcessHttpResponse(method_config, http_response, request))
File "/usr/local/lib/python3.8/site-packages/apitools/base/py/base_api.py", line 603, in __ProcessHttpResponse
raise exceptions.HttpError.FromResponse(
apitools.base.py.exceptions.HttpConflictError: HttpError accessing <https://bigquery.googleapis.com/bigquery/v2/projects/bbb-erizo/jobs?alt=json>:
response: <{
'vary': 'Origin, X-Origin, Referer',
'content-type': 'application/json; charset=UTF-8',
'date': 'Wed, 17 Feb 2021 09:21:17 GMT',
'server': 'ESF',
'cache-control': 'private',
'x-xss-protection': '0',
'x-frame-options': 'SAMEORIGIN',
'x-content-type-options': 'nosniff',
'transfer-encoding': 'chunked',
'status': '409',
'content-length': '402',
'-content-encoding': 'gzip'
}>,
content <{
"error": {
"code": 409,
"message": "Already Exists: Job bbb-erizo:asia-northeast1.beam_bq_job_QUERY_AUTOMATIC_JOB_NAME_a2207822-8_754",
"errors": [ {
"message": "Already Exists: Job bbb-erizo:asia-northeast1.beam_bq_job_QUERY_AUTOMATIC_JOB_NAME_a2207822-8_754",
"domain": "global", "reason": "duplicate"
} ],
"status": "ALREADY_EXISTS"
}
} >
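From the error trace, it looks like the job ID that Dataflow generates for the BigQuery job is somehow being duplicated. Since I do not assign the BigQuery job ID explicitly (Dataflow does that itself), I have no control over that part.
Please advise!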
Solution
This is a bug. It should be fixed by https://github.com/apache/beam/pull/13749, which will be part of Beam 2.28.0.
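To illustrate why a 409 "duplicate" response can appear, here is a minimal sketch of the general failure mode. It is not Beam's actual code and does not claim to reproduce what the linked pull request changes. The names deterministic_job_id, unique_job_id and FakeJobService are hypothetical and exist only for this example. The assumption is that a job ID built purely from stable inputs gets submitted twice (for example when the same step is retried), and that adding a random suffix avoids the collision.

```python
import uuid


def deterministic_job_id(pipeline_job_name: str, step_id: str) -> str:
    """Hypothetical naming scheme: the same inputs always yield the same ID."""
    return f"beam_bq_job_QUERY_{pipeline_job_name}_{step_id}"


def unique_job_id(pipeline_job_name: str, step_id: str) -> str:
    """Same scheme plus a random suffix, so repeated submissions differ."""
    base = deterministic_job_id(pipeline_job_name, step_id)
    return f"{base}_{uuid.uuid4().hex}"


class FakeJobService:
    """In-memory stand-in for a job-insert API that rejects duplicate IDs."""

    def __init__(self) -> None:
        self._job_ids = set()

    def insert(self, job_id: str) -> None:
        # Mimics the "Already Exists" conflict for a reused job ID.
        if job_id in self._job_ids:
            raise ValueError(f"Already Exists: Job {job_id}")
        self._job_ids.add(job_id)


if __name__ == "__main__":
    service = FakeJobService()

    # Submitting the same deterministic ID twice collides.
    service.insert(deterministic_job_id("my_pipeline", "step_8"))
    try:
        service.insert(deterministic_job_id("my_pipeline", "step_8"))
    except ValueError as err:
        print("collision:", err)

    # With a random suffix, both submissions are accepted.
    service.insert(unique_job_id("my_pipeline", "step_8"))
    service.insert(unique_job_id("my_pipeline", "step_8"))
    print("unique IDs accepted")
```

In practice the job ID is generated inside the SDK, so the remedy on the user side is to upgrade to a Beam release that contains the fix rather than to patch the naming yourself.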