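python - Python: Building complex nested lists in a dictionary
Problem description
I'm trying to build a list of lists inside a dictionary from an Excel spreadsheet.
My spreadsheet looks like this: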
source_item_id | target_item_id | find_string | replace_string |
---|---|---|---|
source_id1 | target_id1 | abcd1 | efgh1 |
source_id1 | target_id1 | ijkl1 | mnop1 |
source_id1 | target_id2 | abcd2 | efgh2 |
source_id1 | target_id2 | ijkl2 | mnop2 |
source_id2 | target_id3 | qrst | uvwx |
source_id2 | target_id3 | yzab | cdef |
source_id2 | target_id4 | ghij | klmn |
source_id2 | target_id4 | opqr | stuv |
My output dictionary should look like this:
{
    "source_id1": [
        {"target_id1": [
            {"find_string": "abcd1", "replace_string": "efgh1"},
            {"find_string": "ijkl1", "replace_string": "mnop1"}
        ]},
        {"target_id2": [
            {"find_string": "abcd2", "replace_string": "efgh2"},
            {"find_string": "ijkl2", "replace_string": "mnop2"}
        ]}
    ],
    "source_id2": [
        {"target_id3": [
            {"find_string": "qrst", "replace_string": "uvwx"},
            {"find_string": "yzab", "replace_string": "cdef"}
        ]},
        {"target_id4": [
            {"find_string": "ghij", "replace_string": "klmn"},
            {"find_string": "opqr", "replace_string": "stuv"}
        ]}
    ]
}
With the following code, I only get the last value in each list:
import xlrd

xls_path = r"C:\data\ItemContent.xlsx"
book = xlrd.open_workbook(xls_path)
sheet_find_replace = book.sheet_by_index(1)
find_replace_dict = dict()
for line in range(1, sheet_find_replace.nrows):
    source_item_id = sheet_find_replace.cell(line, 0).value
    target_item_id = sheet_find_replace.cell(line, 1).value
    find_string = sheet_find_replace.cell(line, 2).value
    replace_sting = sheet_find_replace.cell(line, 3).value
    find_replace_list = [{"find_string": find_string, "replace_sting": replace_sting}]
    find_replace_dict[source_item_id] = [target_item_id]
    find_replace_dict[source_item_id].append(find_replace_list)
print(find_replace_dict)
--> Result
{
    "source_id1": ["target_id2", [{"find_string": "ijkl2", "replace_sting": "mnop2"}]],
    "source_id2": ["target_id4", [{"find_string": "opqr", "replace_sting": "stuv"}]]
}
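Solution
Your problem is fairly complicated because the value for each source ID is a list of single-key dictionaries. You can still follow a simple pattern: parse each row into the relevant items, then use them to locate the target entry and append to it, or create a new list if no entry exists yet: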
from typing import List, Tuple

import xlrd  # Note: xlrd 2.0+ reads only .xls; .xlsx files need xlrd < 2.0 or another reader

xls_path = r"C:\data\ItemContent.xlsx"
book = xlrd.open_workbook(xls_path)
sheet_find_replace = book.sheet_by_index(1)

def process_line(line: int) -> Tuple[str, str, dict]:
    # Read one row and return (source id, target id, find/replace pair)
    source_item_id = sheet_find_replace.cell(line, 0).value
    target_item_id = sheet_find_replace.cell(line, 1).value
    find_string = sheet_find_replace.cell(line, 2).value
    replace_string = sheet_find_replace.cell(line, 3).value
    return source_item_id, target_item_id, {
        "find_string": find_string,
        "replace_string": replace_string
    }

def find_target(target: str, ls: List[dict]) -> int:
    # Find the index of the dict keyed by the target id in the list
    for i in range(len(ls)):
        if target in ls[i]:
            return i
    return -1  # Or some other marker

result_dict = dict()
for line in range(1, sheet_find_replace.nrows):  # Start at 1 to skip the header row
    source, target, replacer = process_line(line)
    # You can check here that the above three are correct
    source_list = result_dict.get(source, [])  # Leverage the default value of the get function
    target_idx = find_target(target, source_list)
    target_dict = source_list[target_idx] if target_idx >= 0 else {}
    replace_list = target_dict.get(target, [])
    replace_list.append(replacer)
    target_dict[target] = replace_list
    if target_idx >= 0:
        source_list[target_idx] = target_dict
    else:
        source_list.append(target_dict)
    result_dict[source] = source_list
print(result_dict)
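I'll note that if source_id pointed to a dictionary instead of a list, this could be simplified radically, since we wouldn't need to search the list for an item that may already exist and then awkwardly replace or append as needed. If you can change this constraint (remember, you can always convert the dictionary to a list downstream), I would consider doing so.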
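To illustrate that last point, here is a minimal sketch of the dictionary-based variant. It assumes the rows have already been read into plain tuples, so the `rows` list below is a hypothetical stand-in for the spreadsheet, and the helper names `build_nested` and `to_list_form` are made up for this example. It groups by source and target with `collections.defaultdict`, then converts to the list-of-single-key-dicts shape only at the end:

```python
from collections import defaultdict

# Hypothetical in-memory rows standing in for the spreadsheet:
# (source_item_id, target_item_id, find_string, replace_string)
rows = [
    ("source_id1", "target_id1", "abcd1", "efgh1"),
    ("source_id1", "target_id1", "ijkl1", "mnop1"),
    ("source_id1", "target_id2", "abcd2", "efgh2"),
    ("source_id2", "target_id3", "qrst", "uvwx"),
]


def build_nested(rows):
    """Group rows as {source: {target: [replacer, ...]}}."""
    nested = defaultdict(lambda: defaultdict(list))
    for source, target, find_string, replace_string in rows:
        # Missing keys are created on first access, so no search is needed
        nested[source][target].append(
            {"find_string": find_string, "replace_string": replace_string}
        )
    return nested


def to_list_form(nested):
    """Convert to the shape from the question: {source: [{target: [...]}, ...]}."""
    return {
        source: [{target: replacers} for target, replacers in targets.items()]
        for source, targets in nested.items()
    }


nested = build_nested(rows)
result = to_list_form(nested)
print(result)
```

Because dictionaries preserve insertion order in Python 3.7+, the targets come out in the order they first appear in the rows. If you only ever look targets up by ID, you may not need the `to_list_form` step at all.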