multithreading - 为什么我在并行调用 Apps Script API 时出现超时并记录丢失的错误?
问题描述
我正在编写一个谷歌表格插件,它将一些数据从一个电子表格复制到另一个电子表格,然后重新格式化。所涉及的数据集通常很大(约 100k 行),因此为了避免达到 6 分钟的超时限制,我将数据分成块,然后使用 google.script.run 调用在每个块上并行运行数据复制功能客户端。
在我的约 100k 行的示例数据集中,成功复制了前几个要完成的块,其余的则抛出错误“访问 ID 为 [电子表格 ID] 的文档时服务电子表格超时”。
这是 Apps 脚本仪表板中的外观:
我对超时错误感到困惑,因为:
- 我已在包含 5000 行的数据集上成功运行脚本
- Apps 脚本仪表板显示在 6 分钟之前失败的执行(更像是 4-5 分钟)
- Apps 脚本仪表板日志记录显示成功记录的失败(超时)执行。记录发生在 setValues() 操作之后(见下面的代码);记录后唯一的事情是返回,所以我不明白它如何成功记录然后超时(我认为Apps Script是同步的......但也许我错了?)
我也不确定那些“未捕获”错误,但它们似乎显示在仪表板上,因为“缺少文档 [电子表格 id](可能它已被删除,或者您没有读取权限?)”
这是我要复制到的文档,我已确认它仍然存在于我的云端硬盘中,我可以打开它并查看已成功复制的数据。如果太多的脚本实例试图同时访问它,文档会“丢失”吗?
我尝试了较小的块大小(1000 和 2000 行)并得到相同类型的错误。
这是我的客户端 Javascript 的样子:
// This function is the success handler that runs after another function (which grabs the total # of rows
// from the sheet to be copied, and then creates the new spreadsheet to be copied into) completes
function dataParamsSuccess(dataParameters) {
// dataParameters = [busHrs, outputSsUrl, lastRow, maxRows, maxColumns]
var busHrs = dataParameters[0];
var outputSsUrl = dataParameters[1];
var lastRow = dataParameters[2];
var maxRows = dataParameters[3];
var maxColumns = dataParameters[4];
console.log(maxRows);
console.log(maxColumns);
// Set chunk size
var chunkSize = 5000; // number of rows in chunk
// Determine number of chunks
var numChunks = Math.ceil(lastRow / chunkSize);
var lastChunkSize = lastRow % chunkSize;
if ((numChunks-1) * chunkSize + lastChunkSize == lastRow) {
console.log("Math checks out");
} else {
console.log("oops, check your math");
}
// Generate status message
var statusHtml = numChunks + " chunks to be copied";
for (i=0; i<numChunks; i++) {
var chunkNum = i+1;
var chunkNumStr = chunkNum.toString();
statusHtml += "<div id=\"chunk" + chunkNumStr + "Status\"></div>";
}
document.getElementById("statusMsg").innerHTML = statusHtml;
var startRow = 1;
// Call copyData once for each chunk
for (i=0; i<numChunks; i++) {
var chunkNum = i+1;
var chunkNumStr = chunkNum.toString();
var chunkDivId = "chunk" + chunkNumStr + "Status";
if (chunkNum==numChunks) { // if this is the last chunk, chunk size is smaller
chunkSize = lastChunkSize;
}
var copyParams = [chunkNum, chunkSize, startRow, outputSsUrl];
google.script.run
.withSuccessHandler(copyChunkSuccess)
.copyData(copyParams);
document.getElementById(chunkDivId).innerHTML = "Chunk " + chunkNumStr + " copying in progress";
startRow += chunkSize;
console.log("startRow: " + startRow.toString());
}
// Haven't gotten to the part where I figure out what to do after all chunks are complete yet
}
这是被调用的服务器端 Apps Script 函数:
function copyData(copyParams) {
try {
// copyParams = [chunkNum, chunkSize, startRow, outputSsUrl]
var chunkNum = copyParams[0];
var chunkSize = copyParams[1];
var startRow = copyParams[2];
var outputSsUrl = copyParams[3];
var lastRow = startRow + chunkSize;
// Get input and output sheets
var dataSheet = SpreadsheetApp.getActiveSheet();
var outputSpreadsheet = SpreadsheetApp.openByUrl(outputSsUrl);
var outputSheet = outputSpreadsheet.getActiveSheet();
// Copy values
var values = dataSheet.getRange(startRow, 1, chunkSize, 22).getValues();
outputSheet.getRange(startRow, 1, chunkSize, 22).setValues(values);
// Logging
var dataSpreadsheetId = dataSheet.getParent().getId();
var outputSpreadsheetId = outputSpreadsheet.getId();
console.log("Chunk " + chunkNum.toString() + " (rows " + startRow.toString() + " through " + lastRow.toString() + ") copied successfully");
return [chunkNum, startRow, lastRow, "success"];
} catch(e) {
return [chunkNum, startRow, lastRow, e.message]; // Return error to client-side; server-side logging is taking too long
}
}
解决方案
这个答案怎么样?
根据我的经验,即使使用电子表格服务,当异步过程发生连续访问时,我也遇到过这样的问题。当时,我使用了锁服务和setTimeout
. 但我不确定这种方法是否可以解决您的问题。所以请测试以下修改。在这里,我想建议在 Google Apps 脚本端和setTimeout
Javascript 端使用锁定服务。当你的脚本被修改后,它变成如下。
此解决方法的流程如下。
流动:
- 10 个工作人员被发送到 Google Apps 脚本端。
- 发送 10 个 worker 后,等待 5 秒。
- 在 Google Apps 脚本方面,收到了 10 个工作人员。而这些都是在锁服务下处理的。
- 5 秒后,在 Javascript 端,下一个 10 个工人被发送。
通过这个循环,脚本运行。
Google Apps 脚本方面:
请copyData
进行如下修改。
function copyData(copyParams) {
var lock = LockService.getDocumentLock();
if (lock.tryLock(10000)) {
try {
// copyParams = [chunkNum, chunkSize, startRow, outputSsUrl]
var chunkNum = copyParams[0];
var chunkSize = copyParams[1];
var startRow = copyParams[2];
var outputSsUrl = copyParams[3];
var lastRow = startRow + chunkSize;
// Get input and output sheets
var dataSheet = SpreadsheetApp.getActiveSheet();
var outputSpreadsheet = SpreadsheetApp.openByUrl(outputSsUrl);
var outputSheet = outputSpreadsheet.getActiveSheet();
// Copy values
var values = dataSheet.getRange(startRow, 1, chunkSize, 22).getValues();
outputSheet.getRange(startRow, 1, chunkSize, 22).setValues(values);
// Logging
var dataSpreadsheetId = dataSheet.getParent().getId();
var outputSpreadsheetId = outputSpreadsheet.getId();
console.log("Chunk " + chunkNum.toString() + " (rows " + startRow.toString() + " through " + lastRow.toString() + ") copied successfully");
return [chunkNum, startRow, lastRow, "success"];
} catch(e) {
return [chunkNum, startRow, lastRow, e.message]; // Return error to client-side; server-side logging is taking too long
} finally {
lock.releaseLock();
}
}
}
HTML & Javascript 方面:
请dataParamsSuccess
进行如下修改。
// This function is the success handler that runs after another function (which grabs the total # of rows
// from the sheet to be copied, and then creates the new spreadsheet to be copied into) completes
async function dataParamsSuccess(dataParameters) { // <--- Modified
const wait = (s) => new Promise(r => setTimeout(r, s)); // <--- Added
// dataParameters = [busHrs, outputSsUrl, lastRow, maxRows, maxColumns]
var busHrs = dataParameters[0];
var outputSsUrl = dataParameters[1];
var lastRow = dataParameters[2];
var maxRows = dataParameters[3];
var maxColumns = dataParameters[4];
console.log(maxRows);
console.log(maxColumns);
// Set chunk size
var chunkSize = 5000; // number of rows in chunk
// Determine number of chunks
var numChunks = Math.ceil(lastRow / chunkSize);
var lastChunkSize = lastRow % chunkSize;
if ((numChunks - 1) * chunkSize + lastChunkSize == lastRow) {
console.log("Math checks out");
} else {
console.log("oops, check your math");
}
// Generate status message
var statusHtml = numChunks + " chunks to be copied";
for (i = 0; i < numChunks; i++) {
var chunkNum = i + 1;
var chunkNumStr = chunkNum.toString();
statusHtml += "<div id=\"chunk" + chunkNumStr + "Status\"></div>";
}
document.getElementById("statusMsg").innerHTML = statusHtml;
var count = 0; // <--- Added
var startRow = 1;
// Call copyData once for each chunk
for (i = 0; i < numChunks; i++) {
count++; // <--- Added
var chunkNum = i + 1;
var chunkNumStr = chunkNum.toString();
var chunkDivId = "chunk" + chunkNumStr + "Status";
if (chunkNum == numChunks) { // if this is the last chunk, chunk size is smaller
chunkSize = lastChunkSize;
}
var copyParams = [chunkNum, chunkSize, startRow, outputSsUrl];
google.script.run
.withSuccessHandler(copyChunkSuccess)
.copyData(copyParams);
if (count == 10) { // <--- Added
console.log("wait");
await wait(5000);
count = 0;
}
document.getElementById(chunkDivId).innerHTML = "Chunk " + chunkNumStr + " copying in progress";
startRow += chunkSize;
console.log("startRow: " + startRow.toString());
}
// Haven't gotten to the part where I figure out what to do after all chunks are complete yet
}
笔记:
- 我不确定是否适合您的情况
5000
。await wait(5000)
因此,请根据您的情况进行测试来修改此值。在当前值中,5000
是 5 秒。
参考:
推荐阅读
- makefile - 即使依赖项已事先显式运行,也使依赖项执行多次
- android-studio - Android Studio “无法解析符号 R”,但项目编译并运行
- c++ - C++中向量和矩阵的乘法
- python - 如何测试创建 Celery 任务的代码?
- java - 将容器连接到mysql数据库并访问它
- angular - show navigation only in inner pages and not on login/registration pages
- flutter - 如何实现 Flutter 功能
- gmail-api - 请Gmail API html帖子示例
- java - 识别来自 git 存储库的参数更改
- android - Multiple checkbox state