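python-3.x - Use a specific line of a text file as the first item of a dictionary's value list, and append the following lines as a single string as the second item of the same list
Problem description
I have a text file that I want to import into a dictionary, but I am having trouble getting the program to recognise the correct lines as item 1 and item 2 of each list in the dictionary.
The text file is formatted like this (there are no blank lines between the lines of a record; only the end of each record is marked by a blank line):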
ProductA
2020-08-03 16:26:21
This painting was done by XNB.
The artist seeks to portray the tragedies caused by event XYZ.
The painting weighs 2kg.
####blank line#####
ProductB
2020-08-03 16:26:21
This painting is done by ONN.
It was stolen during world war 2.
Decades later, it was discovered in the black market of country XYZ.
It was bought for 2 million dollars by ABC.
###blank line###
Expected result in the dictionary:
{ 'ProductA' : ['2020-08-03 16:26:21', 'This painting was done by XNB.The artist seeks to portray the tragedies caused by event XYZ. The painting weighs 2kg.'], 'ProductB':['2020-08-03 16:26:21','This painting is done by ONN.It was stolen during world war 2.Decades later, it was discovered in the black market of country XYZ.It was bought for 2 million dollars by ABC.']}
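Here item_2 is a single string that combines everything from the third line of a record up to the blank line that ends it.
Problem: I don't know how to code the logic so that the program assigns each line to the place I want.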
header = ""
header = True
for line in records:
    data = line.splitlines()
    if line != '\n':  # check for line break which indicate new record
        if Header:  #
            # code which will assign 1st line of each record as key to dictionary
        else:
            # This is where I need help.
            # Code which will assign 2nd line as item_1 and then assign 3rd lines onwards till the end of record as item_2 in a single string.
            # items_2 may have different number of lines being combined into 1 string for each record.
            # I try to form a rough idea how the logic might be in code below but I feel that something is missing and I got a bit confused.
            for line in list:  # result in TypeError, 'type' object is not iterable.
                dict[line[1]] = dict[header].append(line[1].strip("\n"))
                # Since the outer if has already done its job of identifying 1st line of record. The line of code seeks to assign the next line (line 2 in text file) which I think would be interpreted by the program as line[1] to item 2.
                dict[line[2:]] = dict[header].append(line[2:].strip("\n"))
                # Assign 3rd line of text file onwards as a single string which is item_2 in the list of value for dictionary.
    else:
        # code which reset boolean for header
Solution
Try this:
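with open('data.txt') as fp:
    # strip() drops a trailing blank line so the last chunk is not empty
    data = fp.read().strip().split('\n\n')

res = {}
for x in data:
    # the first line is the key; the rest of the record is the value
    k, v = x.strip().split('\n', 1)
    v = v.split('\n')
    # item 1 is the timestamp, item 2 joins the remaining lines into one string
    res[k] = [v[0], ' '.join(v[1:])]

print(res)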
Output:
{'ProductA': ['2020-08-03 16:26:21', 'This painting was done by XNB. The artist seeks to portray the tragedies caused by event XYZ. The painting weighs 2kg.'], 'ProductB': ['2020-08-03 16:26:21', 'This painting is done by ONN. It was stolen during world war 2. Decades later, it was discovered in the black market of country XYZ. It was bought for 2 million dollars by ABC.']}
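If you would rather keep the line-by-line structure from the question, the header flag can be replaced by a variable that remembers the current key. The following is only a rough sketch under a few assumptions: the function name parse_records and the shortened sample text are made up for illustration, and the description lines are joined with a single space, as in the output above.

```python
def parse_records(lines):
    """Build {name: [timestamp, description]} from an iterable of lines."""
    result = {}
    key = None  # name of the record being read; None means "expect a header next"
    for raw in lines:
        line = raw.strip()
        if not line:
            # Blank line: the current record is finished.
            key = None
        elif key is None:
            # First line of a record becomes the dictionary key.
            key = line
            result[key] = []
        elif not result[key]:
            # Second line becomes item 1; item 2 starts as an empty string.
            result[key] = [line, ""]
        else:
            # Every later line is appended to item 2, separated by a space.
            result[key][1] = (result[key][1] + " " + line).strip()
    return result


# Shortened, hypothetical sample in the same layout as the file in the question.
sample = """ProductA
2020-08-03 16:26:21
This painting was done by XNB.
The painting weighs 2kg.

ProductB
2020-08-03 16:26:21
This painting is done by ONN.
"""

records = parse_records(sample.splitlines())
print(records)
```

Because a file object yields its lines one at a time, the same function should also work as parse_records(fp) inside the with open('data.txt') as fp: block, and it tolerates several blank lines in a row between records.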