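javascript - Cloud Function triggered by Cloud Storage always times out with "deadline exceeded"
Question
I am using a Google Cloud Storage event to trigger a Cloud Function that writes an uploaded CSV file to Cloud Datastore. The problem is that the CSV file has thousands of rows (8,000), and the function's maximum timeout of 9 minutes is not enough.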
Error: 4 DEADLINE_EXCEEDED: Deadline exceeded
I have also tried batching the writes, but I still hit the same timeout. Is there an alternative solution that does not require much re-architecting?
const {Datastore} = require('@google-cloud/datastore');
const {Storage} = require('@google-cloud/storage');
const db = new Datastore();
const storage = new Storage();
const path = require('path');
const os = require('os');
const fs = require('fs');
const csv = require('csv-parser');
exports.updateMasterlist = async (object, context, callback) => {
  const fileBucket = object.bucket;
  const filePath = object.name;
  const bucket = storage.bucket(fileBucket);
  const fileName = path.basename(filePath);
  const tempFilePath = path.join(os.tmpdir(), fileName);
  await bucket.file(filePath).download({destination: tempFilePath});
  var total = 0;
  var max = 0;
  var employees = [];
  var batch = 1;
  const kind = 'masterlist';
  fs.createReadStream(tempFilePath)
    .pipe(csv())
    .on('data', (record) => {
      let key = record['EmployeeID'];
      var empKey = db.key([kind, key]);
      const employee = {
        emp_id: record['EmployeeID'],
        full_name: `${record['Firstname']} ${record['MiddleName']} ${record['Lastname']}`,
        group: record['GroupName'],
        division: record['Division'],
        department: record['Department'],
        is_id: record['SupervisorID'],
        email_address: record['Email']
      };
      const emp_entity = {
        key: empKey,
        data: employee,
      };
      employees.push(emp_entity);
      total++; max++;
      if (max >= 499) {
        try {
          db.upsert(employees);
          console.log(`Uploading batch ${batch}`);
          batch++;
        }
        catch (e) {
          console.error(e);
          process.exit(1);
        }
        employees.length = 0;
        max = 0;
      }
    })
    .on('end', async () => {
      try {
        await db.upsert(employees);
        console.log(`Uploading batch ${batch}`);
      }
      catch (e) {
        console.error(e);
        process.exit(1);
      }
      console.log("End of CSV file read!");
      console.log(`BATCH INFORMATION: `);
      console.log(`number of employees: ${total}`);
    });
  callback();
};
Answer
Here is a solution using Python 3.7:
from google.cloud import datastore
import csv
import time
start_time = time.time()
db = datastore.Client()
employees = []
count=1
with open('masterlist.csv', 'rt') as f:
    reader = csv.DictReader(f)
    for record in reader:
        # Datastore Entities
        key = record['EmployeeID']
        empKey = db.key('masterlist', key)
        employee = datastore.Entity(key=empKey)
        employee.update({
            'emp_id': record['EmployeeID'],
            'full_name': str(record['Firstname']+' '+record['MiddleName']+' '+record['Lastname']),
            'group': record['GroupName'],
            'division': record['Division'],
            'department': record['Department'],
            'is_id': record['SupervisorID'],
            'email_address': record['Email']
        })
        employees.append(employee)
        if(len(employees) >= 499):
            db.put_multi(employees)
            employees.clear()
            print('Batch: '+str(count)+' added')
            count += 1
db.put_multi(employees)
print('Batch: '+str(count)+' added')
print("--- %s seconds ---" % (time.time() - start_time))
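For anyone who wants to stay in Node.js: the function in the question calls callback() right after the stream is set up and does not await db.upsert() inside the 'data' handler, so batches may still be in flight (and the shared employees array may be emptied) when the function is considered finished. That is possibly part of the problem. Below is a minimal sketch of sequential batching, not a tested drop-in fix. The names upsertInBatches and upsertBatch are hypothetical; upsertBatch stands in for something like (entities) => db.upsert(entities), and records can be any iterable or async iterable, such as an object-mode stream.

```javascript
// Sketch: read records from any iterable or async iterable and write them
// in sequential batches, awaiting each write before reading further.
// `upsertBatch` is a hypothetical injected function, e.g.
// (entities) => db.upsert(entities).
async function upsertInBatches(records, upsertBatch, batchSize = 500) {
  let batch = [];
  let total = 0;
  let batches = 0;

  // `for await` pulls the next record only after the loop body finishes,
  // so a slow write pauses reading instead of piling up pending requests.
  for await (const record of records) {
    batch.push(record);
    total++;
    if (batch.length >= batchSize) {
      await upsertBatch(batch);
      batches++;
      batch = []; // start a fresh array; the one just written is not mutated
    }
  }

  // Flush whatever is left after the last full batch.
  if (batch.length > 0) {
    await upsertBatch(batch);
    batches++;
  }

  return { total, batches };
}
```

Under these assumptions, the handler would map each CSV row to a {key, data} entity first and then await the call, so the async function only resolves once every batch has been written, with no callback() needed. The default of 500 matches Datastore's limit of 500 entities per commit.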