Firestore batch insert works with 375 documents per commit, but not with 500. Why?

Problem description

I'm trying to insert more than 1,400 objects into my Firestore database from a Cloud Function (540-second timeout) using the following code:


...

const response = await fetch(url)
if (response.ok) {
    const json = await response.json()
    if (json.hasOwnProperty('data')) {
        const teams = json[`data`]
        
        var players = teams.flatMap((team) => {
            return team.squad.data
        })
        
        var playersBatch = []
        while (players.length > 0) {
            const playerBatch = players.splice(0, 375)
            playersBatch.push(playerBatch)
        }

        for (playerBatch of playersBatch) {
            const batch = database.batch()

            for (player of playerBatch) {
                const reference = database
                    .collection(`players`)
                    .doc(`${player.player_id}`)

                batch.set(reference, player, { merge: true })
            }

            await batch.commit()
        }
    } else {
        ...
    }
} else {
    ...
}

...

The code above works for me when inserting 375 documents per batch, but when I try to insert 500 documents per batch, the batch commit in the first loop iteration never completes and I get a timeout:

Function execution took 540005 ms, finished with status: 'timeout'

Can a batch commit cause a timeout? Is there any limit on batches when inserting a large number of documents? Why can I insert 375 per batch but not 500?

Tags: javascript, firebase, google-cloud-platform, google-cloud-firestore, google-cloud-functions

Solution


If I've understood you correctly, these are the steps you're trying to perform:

  1. Fetch a URL
  2. For each team in the response, extract the list of all its players
  3. For each player, update their data in the database

Reshuffling your code a bit and moving await batch.commit() out of the for loop (inside the loop it makes your code wait for each batch to finish before moving on to the next one) gives:

const response = await fetch(url)

if (!response.ok || response.status === 204) {
  // A 204 code will break response.json() with a parsing error
  // You might want to check for the 429 status here
  throw new Error(`Unexpected status code ${response.status}!`)
}

const json = await response.json() // note: empty bodies will throw a parsing error

if (!json.hasOwnProperty("data")) {
  throw new Error(`Unexpected response body!`, json)
}

const teams = json["data"]

const players = teams.flatMap((team) => {
    return team.squad.data // the array of players in this team
})

const playersInBatches = [];
while (players.length > 0) {
    const thisPlayerBatch = players.splice(0, 500)
    playersInBatches.push(thisPlayerBatch)
}

const batches = playersInBatches.map((playersInThisBatch) => {
    const dbBatch = database.batch()

    for (let player of playersInThisBatch) {
        const reference = database
            .collection("players")
            .doc(`${player.player_id}`)

        dbBatch.set(reference, player, { merge: true })
    }

    return dbBatch
})

// commit all batches in parallel and wait for them to finish
await Promise.all(batches.map((b) => b.commit())) 

console.log("Synced successfully!")

Notes:

  • As an extension of what you're doing, you might want to store the response's caching headers, such as ETag or Last-Modified. That lets you ask the third-party server whether its data has changed at all before downloading it again (see the sketch after these notes).
  • I've replaced your if (condition) { /* do lots of work */ } else { /* do a small amount of work to handle the error */ } with if (!condition) { /* do a small amount of work to handle the error */ return; } /* do lots of work */. This is known as "failing fast"; it avoids large and/or deeply nested if-else trees and keeps the error handling right next to the check that triggers it.
  • If any one of the batches fails, the others may still fail to write to the database, but that flaw is also present in your original code. You can change the last few lines to the following so that a single failed batch doesn't take down the others:
// commit all batches in parallel and wait for them to finish
const results = await Promise.all(batches.map(
    (b) => b.commit().then(
        () => ({success: true}),
        (error) => ({success: false, error})
    )
))

let succeeded = 0, failed = 0

results.forEach(result => result.success ? succeeded++ : failed++)

if (failed > 0) {
    console.log(`Synced ${succeeded}/${results.length} batches of players successfully!`)
    return
}

console.log("Synced all players successfully!")

You're also not the first to run into trouble writing data to the database in batches. It's common to use a utility class such as MultiBatch that handles the batching for you.
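A minimal sketch of what such a MultiBatch class might look like (this particular implementation is only an illustration, built from the set() and commit() calls used below and from Firestore's limit of 500 writes per committed batch):

class MultiBatch {
    constructor(database, maxOperationsPerBatch = 500) {
        this.database = database
        this.maxOperationsPerBatch = maxOperationsPerBatch
        this.batches = [database.batch()]
        this.operationsInCurrentBatch = 0
    }

    // Returns a batch that still has room for at least one more write,
    // starting a fresh one when the current batch hits the write limit.
    _currentBatch() {
        if (this.operationsInCurrentBatch >= this.maxOperationsPerBatch) {
            this.batches.push(this.database.batch())
            this.operationsInCurrentBatch = 0
        }
        this.operationsInCurrentBatch++
        return this.batches[this.batches.length - 1]
    }

    set(reference, data, options) {
        this._currentBatch().set(reference, data, options)
        return this
    }

    // Commits all underlying batches in parallel. With suppressErrors = true,
    // a failed batch resolves to its Error instead of rejecting the whole call.
    commit(suppressErrors = false) {
        return Promise.all(this.batches.map((batch) =>
            suppressErrors
                ? batch.commit().catch((error) => error)
                : batch.commit()
        ))
    }
}

With a helper like that in place, the sync code becomes: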

const response = await fetch(url)

if (!response.ok || response.status === 204) {
  // A 204 code will break response.json() with a parsing error
  // You might want to check for the 429 status here
  throw new Error(`Unexpected status code ${response.status}!`)
}

const json = await response.json() // note: empty bodies will throw a parsing error

if (!json.hasOwnProperty("data")) {
  throw new Error(`Unexpected response body!`, json)
}

const teams = json["data"]
const multiBatch = new MultiBatch(database)
const playersColRef = database.collection("players")

teams.forEach((team) => {
     team.squad.data // the array of players in this team
         .forEach(player => {
             const reference = playersColRef.doc(`${player.player_id}`)
             multiBatch.set(reference, player, { merge: true })
         })
})

await multiBatch.commit(/* pass true here to suppress errors */)

console.log("Synced successfully!")
