Azure table backup: retrieve more than 1000 rows

Problem description

I hope someone can help me debug this issue.

I have the following script:


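from azure.cosmosdb.table.tableservice import TableService,ListGenerator
from azure.storage.blob import BlobServiceClient
from datetime import date

# source (storageA) and target (storageB) accounts; credentials left blank
table_service_out = TableService(account_name='', account_key='')
table_service_in = TableService(account_name='', account_key='')
query_size = 100
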
def queryAndSaveAllDataBySize(tb_name,resp_data:ListGenerator ,table_out:TableService,table_in:TableService,query_size:int):
    for item in resp_data:
        #remove etag and Timestamp appended by table service
        del item.etag
        del item.Timestamp
        print("insert data:" + str(item) + " into table:"+ tb_name)
        table_in.insert_or_replace_entity(tb_name,item)
    if resp_data.next_marker:
        data = table_out.query_entities(table_name=tb_name,num_results=query_size,marker=resp_data.next_marker)
        queryAndSaveAllDataBySize(tb_name,data,table_out,table_in,query_size)


tbs_out = table_service_out.list_tables()

for tb in tbs_out:
    #create table with same name in storage2
    table_service_in.create_table(table_name=tb.name, fail_on_exist=False)
    #first query
    data = table_service_out.query_entities(tb.name,num_results=query_size)
    queryAndSaveAllDataBySize(tb.name,data,table_service_out,table_service_in,query_size)

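This code goes through the tables in storageA, copies them, and creates the same tables in storageB. When a table holds more rows than a single request returns (the service returns at most 1000 entities per request), the marker gives me the x_ms_continuation token, which I use to fetch the next page.

Needless to say, this works just fine.

But yesterday I tried to make a change to the code, as follows: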

If storageA has a table named TEST, I want to create a table named TEST20210930 in storageB: basically the table name from storageA plus today's date.
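The naming rule described above can be sketched as a small helper. This is only an illustration: the function name `dated_table_name` and its `on` parameter are hypothetical and not part of the original script. The YYYYMMDD format is taken from the TEST20210930 example; it also keeps the name purely alphanumeric, which matters because Azure table names may contain only letters and digits (a stamp such as 2021-09-30 would be rejected).

```python
from datetime import date
from typing import Optional


def dated_table_name(source_name: str, on: Optional[date] = None) -> str:
    """Return source_name followed by a YYYYMMDD date stamp.

    `on` defaults to today's date; passing it explicitly makes the
    result reproducible (hypothetical helper, for illustration only).
    """
    day = on if on is not None else date.today()
    return source_name + day.strftime("%Y%m%d")


print(dated_table_name("TEST", date(2021, 9, 30)))  # TEST20210930
```

In the script, this value plays the role of `tb.name + today`, with `today` being the formatted date string.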

This is where the code starts to break.


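table_service_out = TableService(account_name='', account_key='')
table_service_in = TableService(account_name='', account_key='')
query_size = 100
# date suffix appended to the target table names, e.g. 20210930
# (assumed definition; it is not shown in the original snippet)
today = date.today().strftime("%Y%m%d")

#save data to storage2 and check whether the current table has more data left; if so, recurse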
def queryAndSaveAllDataBySize(tb_name,resp_data:ListGenerator ,table_out:TableService,table_in:TableService,query_size:int):
    for item in resp_data:
        #remove etag and Timestamp appended by table service
        del item.etag
        del item.Timestamp
        print("insert data:" + str(item) + " into table:"+ tb_name)
        table_in.insert_or_replace_entity(tb_name,item)
    if resp_data.next_marker:
        data = table_out.query_entities(table_name=tb_name,num_results=query_size,marker=resp_data.next_marker)
        queryAndSaveAllDataBySize(tb_name,data,table_out,table_in,query_size)


tbs_out = table_service_out.list_tables()
print(tbs_out)

for tb in tbs_out:
    table = tb.name + today
    #create the date-suffixed table in storage2
    table_service_in.create_table(table_name=table, fail_on_exist=False)

    #first query
    data = table_service_out.query_entities(tb.name,num_results=query_size)
    queryAndSaveAllDataBySize(table,data,table_service_out,table_service_in,query_size)

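What happens here is that the code runs until it reaches the query_size limit and then fails, saying the table cannot be found.

I am a bit confused here; maybe someone can help spot my mistake.

If you need more information, please ask.

Thank you very much.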

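How to reproduce: create 2 storage accounts in the Azure portal, storage A and storage B.

Create a table in storage A and fill it with data, more than 100 rows (based on query_size). Configure the endpoints: table_service_out = storageA and table_service_in = storageB.

Tags: azure, azure-python-sdk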

Solution


I believe the problem is with the following line of code:

data = table_out.query_entities(table_name=tb_name,num_results=query_size,marker=resp_data.next_marker)

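Notice that tb_name here is the name of the table in your target account, which does not exist in your source account. The first page is fetched in the for loop with tb.name, so it copies correctly; the follow-up query inside the function uses tb_name, which now holds the date-suffixed target name. Because you are querying a table that does not exist in the source account, you get this error.

To fix this, you should also pass the name of the source table to queryAndSaveAllDataBySize and use that name when querying entities inside the function.

Update

Please see the code below: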

from datetime import date

from azure.cosmosdb.table.tableservice import TableService, ListGenerator

table_service_out = TableService(account_name='', account_key='')  # source account (storageA)
table_service_in = TableService(account_name='', account_key='')   # target account (storageB)
query_size = 100
# date suffix for the target table names, e.g. 20210930 (assumed definition)
today = date.today().strftime("%Y%m%d")

# save one page of data to the target table; if a continuation marker is present, fetch the next page and recurse
def queryAndSaveAllDataBySize(source_table_name, target_table_name, resp_data: ListGenerator, table_out: TableService, table_in: TableService, query_size: int):
    for item in resp_data:
        # remove etag and Timestamp appended by the table service
        del item.etag
        del item.Timestamp
        print("insert data:" + str(item) + " into table:" + target_table_name)
        table_in.insert_or_replace_entity(target_table_name, item)
    if resp_data.next_marker:
        # always query the source account with the source table name
        data = table_out.query_entities(table_name=source_table_name, num_results=query_size, marker=resp_data.next_marker)
        queryAndSaveAllDataBySize(source_table_name, target_table_name, data, table_out, table_in, query_size)


tbs_out = table_service_out.list_tables()
print(tbs_out)

for tb in tbs_out:
    table = tb.name + today
    # create the date-suffixed table in the target account
    table_service_in.create_table(table_name=table, fail_on_exist=False)

    # first query, against the source table
    data = table_service_out.query_entities(tb.name, num_results=query_size)
    queryAndSaveAllDataBySize(tb.name, table, data, table_service_out, table_service_in, query_size)
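
A side note on the recursive helper: every page adds one stack frame, so with query_size = 100 a very large table (roughly 100,000 rows at Python's default recursion limit of 1000) could end in a RecursionError. Below is a minimal sketch of the same copy written as a loop, with the source and target table names kept separate. The names `copy_table`, `FakePage` and `FakeTableService` are hypothetical: the two Fake classes are in-memory stand-ins that only mimic the `query_entities`, `insert_or_replace_entity` and `next_marker` behaviour used above, so the example runs without a storage account. It also assumes entities behave like dicts, which is why it uses `pop` instead of `del`.

```python
class FakePage(list):
    """A list of entities plus the next_marker attribute the loop relies on."""

    def __init__(self, items, next_marker):
        super().__init__(items)
        self.next_marker = next_marker


class FakeTableService:
    """Hypothetical in-memory stand-in for TableService (illustration only)."""

    def __init__(self, tables=None):
        self.tables = tables if tables is not None else {}

    def query_entities(self, table_name, num_results=None, marker=None):
        if table_name not in self.tables:
            raise KeyError("table not found: " + table_name)
        rows = self.tables[table_name]
        start = marker or 0
        end = len(rows) if num_results is None else start + num_results
        # here the marker is simply the index of the next unread row
        next_marker = end if end < len(rows) else None
        return FakePage([dict(r) for r in rows[start:end]], next_marker)

    def insert_or_replace_entity(self, table_name, entity):
        # simplified: appends without checking for an existing key
        self.tables.setdefault(table_name, []).append(entity)


def copy_table(source_table_name, target_table_name, table_out, table_in, query_size):
    """Copy all entities page by page, without recursion; return the count."""
    marker = None
    copied = 0
    while True:
        # always query the source account with the SOURCE table name
        page = table_out.query_entities(
            source_table_name, num_results=query_size, marker=marker
        )
        for item in page:
            # drop the properties appended by the table service
            item.pop("etag", None)
            item.pop("Timestamp", None)
            table_in.insert_or_replace_entity(target_table_name, item)
            copied += 1
        marker = page.next_marker
        if not marker:
            return copied


rows = [
    {"PartitionKey": "p", "RowKey": str(i), "etag": "x", "Timestamp": "t"}
    for i in range(250)
]
source = FakeTableService({"TEST": rows})
target = FakeTableService()
copied = copy_table("TEST", "TEST20210930", source, target, query_size=100)
print(copied, len(target.tables["TEST20210930"]))  # 250 250
```

With the real services, the call would look like copy_table(tb.name, table, table_service_out, table_service_in, query_size) inside the for loop; this has not been verified against a live account here.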
