python - How to convert text with a main text and multiple sub texts into a dictionary

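Question

I have a text file made up of lines. After a few lines there is an empty line, which marks the end of a section. The first empty line marks the end of the main text and the start of the sub texts. When another empty line is encountered, the sub text section is finished and a new main text section begins.

I have written some code to solve this in Python. The main text is used as a key in a Python dictionary, and the sub texts are used as the value for that key. Multiple sub texts are stored as a list.

The variables in the code are as follows: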

 word  : Empty dictionary
 value : List containing the sub headings
 key   : Contains the current main heading
 i     : Set to 1 at the start to get the first line. When an
         empty line is detected, it changes to -1. When another
         empty line is detected, it changes to 1 again.

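Here 1 means the line contains main text, and -1 means it contains sub text.

If i is 1, the main text is stored as the key. If it is -1, the sub text is appended to the value list.

When we detect another empty line, we check whether i is -1; if so, we update the dictionary with {key: value}.

Then we flip the sign of i again.

My problem is that the program seems to be stuck in an infinite loop.

Thank you for reading my question. Any help would be appreciated.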

import json

class test1:

    word = {}
    value = []
    i = 1
    key = ''
    filepath = 'we.txt'
    with open(filepath) as fp:
            lines = fp.readlines()
            for j in range(0, len(lines)):
                    currentline = lines[j]
                    if i == 1:
                            key = currentline

                    if currentline in ['\n', '\r\n']:
                            if i == -1:
                                    word.update({key: value})

                    i = i * -1

                    if i == -1:
                            value.append(currentline)
            print(word)

The output should be

mainText11: ['subtext1', 'subtext2'] maintext2: ['subtext1', 'subtext2', 'subtext3']

we.txt contains the following:

                  main heading1

                  sub heading1
                  sub heading2

                  main heading2

Update: I made some changes to the code, but the problem persists.

Tags: python, python-3.x, file

Answer

To iterate over the lines of a file, this is what I would do:

with open(filepath) as fp:
    lines = fp.readlines() # read all the lines from the file
    for line in lines: # loop over the list containing all lines
        # same as in your while loop

In your code, line is never changed inside the while loop, which is why the loop never ends: you never read past the first line of the file.
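If you do want to keep a while loop, the loop variable has to be refreshed on every pass. The following is only a minimal sketch: the function name read_all_lines and the use of readline() are my assumptions about what such a loop could look like, not a copy of your code, and io.StringIO merely stands in for the opened file.

```python
import io


def read_all_lines(fp):
    """Collect every line by calling readline() inside the loop."""
    collected = []
    line = fp.readline()
    while line:  # readline() returns '' once the end of the file is reached
        collected.append(line)
        line = fp.readline()  # advance; without this the condition never changes
    return collected


# io.StringIO stands in for an opened text file here
sample = io.StringIO("main heading1\n\nsub heading1\n")
print(read_all_lines(sample))
```

Note that an empty line inside the file is read as '\n', which is still truthy, so the loop only stops at the real end of the file.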

Edit:

Here is your code (I tried to change it as little as possible):

word = {}
value = []
i = 1
key = ''
filepath = 'we.txt'
with open(filepath) as fp:
        lines = fp.readlines()
        for j in range(0, len(lines)):
                currentline = lines[j]

                if currentline in ['\n', '\r\n']:
                        if i == -1:
                                word.update({key: value})
                                value = [] # start with empty value for the next key
                        i = i * -1 # switch only if we read a newline
                        continue # skip to next line (the newline shouldn't be stored)

                # store values only after we know, it's not an empty line
                if i == 1:
                        key = currentline
                if i == -1:
                        value.append(currentline)

        word.update({key: value}) # update also with the last values
        print(word)

The values will end with a newline character. To get rid of those, I would probably change the first lines of the loop to:

                currentline = lines[j].strip() # strip line, so it doesn't end with '\n'
                if not currentline: # if currentline is empty
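Putting the stripped lines together with the loop above, the whole thing could look roughly like the sketch below. The function name parse_sections is hypothetical, and the reset of key after each section is a small addition of mine, assumed here so that a blank line at the very end of the file does not overwrite the last entry with an empty list.

```python
def parse_sections(lines):
    """Map each main heading to the list of sub headings that follow it."""
    word = {}
    value = []
    key = ''
    i = 1  # 1: expecting a main heading, -1: collecting sub headings
    for raw in lines:
        currentline = raw.strip()  # drop the trailing newline and indentation
        if not currentline:  # an empty line ends the current part
            if i == -1:
                word[key] = value
                value = []
                key = ''  # assumption: forget the stored heading
            i = i * -1
            continue
        if i == 1:
            key = currentline
        else:
            value.append(currentline)
    if key:  # store a heading that was not followed by a closing empty line
        word[key] = value
    return word


sample = [
    "main heading1\n", "\n",
    "sub heading1\n", "sub heading2\n", "\n",
    "main heading2\n",
]
print(parse_sections(sample))
```

With the sample from the question, main heading2 ends up with an empty list because no sub headings follow it.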

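Also, you can move the whole loop outside the with block, since lines already holds the file's contents once readlines() returns.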
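A minimal sketch of that layout, assuming a hypothetical helper load_lines and using a temporary file in place of we.txt so the example runs on its own:

```python
import os
import tempfile


def load_lines(filepath):
    """Read the whole file inside the with block, then let it close the file."""
    with open(filepath) as fp:
        lines = fp.readlines()
    return lines  # the file is already closed at this point


# A temporary file stands in for we.txt
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write("main heading1\n\nsub heading1\n")
    path = tmp.name

lines = load_lines(path)
os.remove(path)

for line in lines:  # the loop runs outside the with block
    print(repr(line))
```

The file is held open only while it is being read; everything after that works on a plain list.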

Hope this helps!
