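python - Performance of Firestore database writes?
Problem description
Operating system: macOS Catalina v10.15.1
Python version: Python 3.7.1
I use Firestore as the database for a personal project, through the Python SDK. I am currently trying to optimize my backend, and I noticed that writes to Firestore are slow. Take the following sample code: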
import firebase_admin
from firebase_admin import credentials
from firebase_admin import firestore
import time
cred = credentials.Certificate("./path/to/adminsdk.json")
firebase_admin.initialize_app(cred)
db = firestore.client()
test_data = {f"test_field_{i}":f"test_value_{i}" for i in range(20)}
now = time.time()
db.collection(u'latency_test_collection').document(u'latency_test_document').set(test_data)
print(f"Total time: {time.time()-now}")
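The code above takes more than 300 milliseconds to run, which seems slow, especially since I have several writes that are much larger than this example. I checked my internet connection, and performance hovers around this value regardless of the connection. Is this performance expected for Firestore writes, or is there a way to optimize my code for this?
Solution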
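A single timed call cannot distinguish a one-time cost from steady-state latency. The sketch below times the same operation several times and reports the first call separately from the median of the rest. The helper names `time_calls` and `summarize` are hypothetical, and the idea that the first call may carry extra setup cost is an assumption to be checked against the measurements, not a confirmed explanation for the 300 ms.

```python
import statistics
import time


def time_calls(operation, repeats=5):
    """Call `operation` `repeats` times and return each duration in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        operation()
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Report the first call separately from the median of the later calls.

    Assumption: the first call may include one-time setup cost, so it is
    kept apart. With a single sample, that sample is used for both values.
    """
    rest = durations[1:] or durations
    return {"first": durations[0], "median_rest": statistics.median(rest)}
```

With the `db` and `test_data` objects from the snippet above, this could be called as `summarize(time_calls(lambda: db.collection(u'latency_test_collection').document(u'latency_test_document').set(test_data)))`.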
As @Nebulastic said, batched writes are much more efficient than writing or deleting documents one at a time. I just ran a test from my laptop in Europe against a Firestore instance located in us-west2 (Los Angeles). Here are the actual results for one-by-one deletions and batch deletions.
$ python firestore_test.py
Creating 10 documents
Wrote 10 documents in 1.80 seconds.
Deleting documents one by one
Deleted 10 documents in 7.97 seconds.
###
Creating 10 documents
Wrote 10 documents in 0.92 seconds.
Deleting documents in batch
Deleted 10 documents in 1.71 seconds.
###
Creating 2000 documents
Wrote 2000 documents in 6.27 seconds.
Deleting documents in batch
Deleted 2000 documents in 9.80 seconds.
Here's the test code:
from time import time
from uuid import uuid4

from google.cloud import firestore

DB = firestore.Client()


def generate_user_data(entries=10):
    print('Creating {} documents'.format(entries))
    now = time()
    batch = DB.batch()
    for counter in range(entries):
        # Each transaction or batch of writes can write to a maximum of 500 documents.
        # https://cloud.google.com/firestore/quotas#writes_and_transactions
        if counter % 500 == 0 and counter > 0:
            batch.commit()
            batch = DB.batch()
        user_id = str(uuid4())
        data = {
            "some_data": str(uuid4()),
            "expires_at": int(now)
        }
        user_ref = DB.collection(u'users').document(user_id)
        batch.set(user_ref, data)
    # Commit whatever is left in the last (possibly partial) batch.
    batch.commit()
    print('Wrote {} documents in {:.2f} seconds.'.format(entries, time() - now))


def delete_one_by_one():
    print('Deleting documents one by one')
    now = time()
    docs = DB.collection(u'users').where(u'expires_at', u'<=', int(now)).stream()
    counter = 0
    for doc in docs:
        # One network round trip per document.
        doc.reference.delete()
        counter = counter + 1
    print('Deleted {} documents in {:.2f} seconds.'.format(counter, time() - now))


def delete_in_batch():
    print('Deleting documents in batch')
    now = time()
    docs = DB.collection(u'users').where(u'expires_at', u'<=', int(now)).stream()
    batch = DB.batch()
    counter = 0
    for doc in docs:
        counter = counter + 1
        if counter % 500 == 0:
            # Commit before the batch reaches the 500-write limit,
            # then start a fresh batch as generate_user_data() does.
            batch.commit()
            batch = DB.batch()
        batch.delete(doc.reference)
    batch.commit()
    print('Deleted {} documents in {:.2f} seconds.'.format(counter, time() - now))


generate_user_data(10)
delete_one_by_one()
print('###')
generate_user_data(10)
delete_in_batch()
print('###')
generate_user_data(2000)
delete_in_batch()
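The commit-every-500 bookkeeping in the test code can be factored into a small reusable helper. The following is a minimal sketch, not part of the Firestore SDK: `chunked` and `delete_in_batches` are hypothetical names, the 500-write limit is taken from the comment in the test code, and `client` is assumed to expose `batch()` the way `firestore.Client` does in that code.

```python
from itertools import islice

# Per-batch write limit quoted in the test code above (assumed still current).
MAX_BATCH_SIZE = 500


def chunked(items, size):
    """Yield successive lists of at most `size` items from any iterable."""
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def delete_in_batches(client, doc_refs, batch_size=MAX_BATCH_SIZE):
    """Delete document references using one fresh batch per chunk.

    `client` is assumed to provide `batch()`, and each batch is assumed to
    provide `delete(ref)` and `commit()`. Returns the number of references
    submitted for deletion.
    """
    deleted = 0
    for chunk in chunked(doc_refs, batch_size):
        batch = client.batch()
        for ref in chunk:
            batch.delete(ref)
        batch.commit()
        deleted += len(chunk)
    return deleted
```

With the names from the test code, a call might look like `delete_in_batches(DB, (doc.reference for doc in docs))`. Because the helper only relies on `batch()`, `delete()` and `commit()`, it can be exercised with a stub client before pointing it at a real database.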