首页 > 解决方案 > BigQuery - BATCH 模式查询随机失败,超出资源

问题描述

我们应用程序的一部分提交了 100 个 BQ 查询以构建聚合仪表板视图。由于查询量很大,我们以 BATCH 优先级提交所有查询。

在过去的 24 小时内,我们有很多查询(超过 300 个)由于资源超出而失败。这很奇怪,因为一些失败的查询专门将结果保存到输出表中以避免此类问题。我们抽查了其他几个,并且能够在 BQ Web UI 中毫无问题地运行它们。

示例作业 ID(无目标表):job_GPghMyEBvlYFBGBHI5Szv0HMIpMY

询问:

SELECT order_sub.value as value, count(*) as count
FROM
(
  SELECT order_id, drop_ship as value
  FROM (TABLE_QUERY(d34f20170905T212849Zc4abca, 'table_id IN ("order_detail_2018")'))
  WHERE order_date BETWEEN TIMESTAMP("2018-01-01") AND TIMESTAMP("2018-08-31")
) order_sub
JOIN EACH
(
  SELECT customer_id, order_id, sequence
  FROM temp_30_days.query_q6109c34f20180918T151206Ze52003_buyers
) cust_sub
ON order_sub.order_id = cust_sub.order_id
GROUP BY value
ORDER BY value

错误信息:

"status": {
  "state": "DONE",
  "errorResult": {
    "reason": "resourcesExceeded",
    "message": "Resources exceeded during query execution."
  },
  "errors": [
    {
      "reason": "resourcesExceeded",
      "message": "Resources exceeded during query execution."
    }
  ]
}

示例作业 ID(带有 dest 表):job_2YsCt-cxql5TP0jklb1aRENzINdZ

询问:

SELECT
  customer_id, sequence
FROM
(
  SELECT
    customer_id,
    sequence,
    random,
    MAX(random) OVER (PARTITION BY customer_id) AS max_value
  FROM
  (
    SELECT customer_id, sequence, RAND() as random
    FROM (TABLE_QUERY(d34f20170905T212849Zc4abca, 'table_id IN ("cust_char_2015", "cust_char_2016", "cust_char_2017")'))
    WHERE end_date >= TIMESTAMP("2016-12-31")
    AND effective_date <= TIMESTAMP ("2017-08-31")
    AND (frequency_end = "98")
  )
)
WHERE max_value = random
GROUP EACH BY
  customer_id, sequence

错误信息:

"status": {
  "errorResult": {
    "message": "Resources exceeded during query execution.", 
    "reason": "resourcesExceeded"
  },
  "errors": [
    {
      "message": "Resources exceeded during query execution.",
      "reason": "resourcesExceeded"
    }
  ],
  "state": "DONE"
}

FWIW status.cloud.google.com为 BigQuery 显示绿灯,还有其他人遇到类似问题吗?

标签: google-cloud-platformgoogle-bigquery

解决方案


推荐阅读