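c# - How to empty a CosmosDB collection and refill it while keeping its metrics/logs
Problem description
We have a web service feed that runs daily and saves all of its documents to a CosmosDB collection. Because I don't need to keep the old documents once a new feed comes in, I also delete and recreate the collection every day. This has some drawbacks:
- The collection's statistics are reset, so Application Insights and logging become useless
- Because all logs etc. are reset as well, troubleshooting is almost impossible
How can I empty a CosmosDB collection before adding new documents to it, so that all metrics etc. are preserved?
This is what I am currently doing: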
log.LogInformation("XXX--> Deleting Collection");
await docClient.DeleteDocumentCollectionAsync(collectionLink);
log.LogInformation("XXX--> Creating Collection");
defaultCollection = await docClient.CreateDocumentCollectionIfNotExistsAsync(databaseLink, defaultCollection, new RequestOptions { OfferThroughput = 1000 });
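I want the same result, but with all statistics etc. preserved.
Solution
Instead of deleting the collection, you can create a bulk-delete stored procedure that removes all documents from the collection.
A working implementation of such a stored procedure can be found here: https://github.com/CosmosDB/labs/blob/3f49d8af44468ff7640cd3e382d13ba4c0299249/solutions/05-authoring_stored_procedures/bulk_delete.js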
/**
* A DocumentDB stored procedure that bulk deletes documents for a given query.<br/>
* Note: You may need to execute this stored procedure multiple times (depending whether the stored procedure is able to delete every document within the execution timeout limit).
*
* @function
* @param {string} query - A query that provides the documents to be deleted (e.g. "SELECT c._self FROM c WHERE c.founded_year = 2008"). Note: For best performance, reduce the # of properties returned per document in the query to only what's required (e.g. prefer SELECT c._self over SELECT * )
* @returns {Object.<number, boolean>} Returns an object with the two properties:<br/>
* deleted - contains a count of documents deleted<br/>
* continuation - a boolean whether you should execute the stored procedure again (true if there are more documents to delete; false otherwise).
*/
function bulkDeleteProcedure(query) {
    var collection = getContext().getCollection();
    var collectionLink = collection.getSelfLink();
    var response = getContext().getResponse();
    var responseBody = {
        deleted: 0,
        continuation: true
    };

    // Validate input.
    if (!query) throw new Error("The query is undefined or null.");

    tryQueryAndDelete();

    // Recursively runs the query w/ support for continuation tokens.
    // Calls tryDelete(documents) as soon as the query returns documents.
    function tryQueryAndDelete(continuation) {
        var requestOptions = {continuation: continuation};

        var isAccepted = collection.queryDocuments(collectionLink, query, requestOptions, function (err, retrievedDocs, responseOptions) {
            if (err) throw err;

            if (retrievedDocs.length > 0) {
                // Begin deleting documents as soon as documents are returned from the query results.
                // tryDelete() resumes querying after deleting; no need to page through continuation tokens.
                // - this is to prioritize writes over reads given timeout constraints.
                tryDelete(retrievedDocs);
            } else if (responseOptions.continuation) {
                // Else if the query came back empty, but with a continuation token; repeat the query w/ the token.
                tryQueryAndDelete(responseOptions.continuation);
            } else {
                // Else if there are no more documents and no continuation token - we are finished deleting documents.
                responseBody.continuation = false;
                response.setBody(responseBody);
            }
        });

        // If we hit execution bounds - return continuation: true.
        if (!isAccepted) {
            response.setBody(responseBody);
        }
    }

    // Recursively deletes documents passed in as an array argument.
    // Attempts to query for more on empty array.
    function tryDelete(documents) {
        if (documents.length > 0) {
            // Delete the first document in the array.
            var isAccepted = collection.deleteDocument(documents[0]._self, {}, function (err, responseOptions) {
                if (err) throw err;

                responseBody.deleted++;
                documents.shift();
                // Delete the next document in the array.
                tryDelete(documents);
            });

            // If we hit execution bounds - return continuation: true.
            if (!isAccepted) {
                response.setBody(responseBody);
            }
        } else {
            // If the document array is empty, query for more documents.
            tryQueryAndDelete();
        }
    }
}
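As the doc comment notes, a single execution may stop before every document is deleted, so the caller has to re-run the procedure until it reports `continuation: false`. The following is a minimal sketch of that calling loop, not part of the linked sample. The names `deleteAllDocuments`, `executeSproc`, `maxRounds` and `makeFakeExecutor` are hypothetical: `executeSproc` stands for whatever SDK call actually runs the procedure once and returns its response body (with the .NET client from the question, that would presumably be `docClient.ExecuteStoredProcedureAsync`), and the fake executor only simulates a collection so the loop can be tried without a database.

```javascript
// Hypothetical driver (not an SDK API): `executeSproc` is an injected async
// function that runs bulkDeleteProcedure once with the given query and
// resolves to its response body: { deleted: <number>, continuation: <boolean> }.
async function deleteAllDocuments(executeSproc, query, maxRounds = 1000) {
    let totalDeleted = 0;

    for (let round = 0; round < maxRounds; round++) {
        const body = await executeSproc(query);
        totalDeleted += body.deleted;

        // continuation === false means the procedure found nothing left to delete.
        if (!body.continuation) {
            return totalDeleted;
        }
    }

    // Safety net (an assumption of this sketch) so a misbehaving executor
    // cannot keep the loop running forever.
    throw new Error("Documents still remain after " + maxRounds + " executions.");
}

// Stand-in executor for trying the loop locally: simulates a collection of
// `total` documents where each execution deletes at most `perRun` of them.
function makeFakeExecutor(total, perRun) {
    let remaining = total;
    return async function (query) {
        const deleted = Math.min(perRun, remaining);
        remaining -= deleted;
        return { deleted: deleted, continuation: remaining > 0 };
    };
}
```

Passing `"SELECT c._self FROM c"` as the query selects every document, which matches the goal of emptying the collection. If the collection is partitioned, keep in mind that a stored procedure execution is scoped to a single partition key, so the loop would likely have to be repeated for each partition key value.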