Python: extract the first element from a list and clean up the string

Question

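I am parsing txt files (each more than 100 pages long) and want to extract the sentence in which the string "public offering price" first appears. I also want to remove the "&nbsp;" characters from that sentence.

I run the following code over a series of files (file_list):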

test1 = []  # create a new list to store my desired output
for eachfile in file_list:
    with open(eachfile, 'r') as f:
        for line in f:
            if "public offering price" in line:
                test1.append(line.replace('&nbsp;', '').split('.')[0])
print(test1)

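With the code above, I successfully remove the "&nbsp;" characters and split each element at the "." when one is present (which gets me closer to the output I want), but I get the following output: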

['public offering price will be between $and $per share', 'toadditional shares of our common stock at the initial public offering price', '(2)an initial public offering price of $per share']

The output above gives me every sentence that contains the string, but I only want to keep the first occurrence:

['public offering price will be between $and $per share']

Any idea how to get this output? Given the code I am already running, it should be easy to do, but I cannot figure out how.

Many thanks in advance.

Edit: the output obtained without replace or split('.')[0] is as follows:

['public offering price will be between $&nbsp;&nbsp;&nbsp;and $&nbsp;&nbsp;&nbsp;&nbsp;per share. We intend to apply to list the common stock on\n', 'to&nbsp;&nbsp;&nbsp;&nbsp;additional shares of our common stock at the initial public offering price.</FONT>\n', '(2)&nbsp;an initial public offering price of $&nbsp;&nbsp;&nbsp;&nbsp;per share, the midpoint of the initial public offering range indicated on the cover of this prospectus. </FONT> <FONT SIZE=2>\n']

Tags: python, string, list

Answer


Take the first element of the list:

first_elem = test1[0]
print(first_elem)

Edit: to get the first desired string from each file:


test2 = []  # one list of matches per file
for eachfile in file_list:
    test1 = []  # matches found in the current file
    with open(eachfile, 'r') as f:
        for line in f:
            if "public offering price" in line:
                test1.append(line.replace('&nbsp;', '').split('.')[0])
    test2.append(test1)

for test1 in test2:
    if test1:  # skip files with no match to avoid an IndexError
        print(test1[0])  # first element of each nested list
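Since only the first match per file is needed, an alternative is to stop reading each file as soon as a match is found, rather than collecting every match and indexing afterwards. The following is a minimal sketch of that idea, not part of the original answer; the function names first_matching_sentence and first_sentences are hypothetical, and the cleanup step (dropping "&nbsp;" and keeping the text before the first ".") simply mirrors the code above.

```python
def first_matching_sentence(path, phrase="public offering price"):
    """Return the cleaned text of the first line containing `phrase`, or None.

    Hypothetical helper: reading stops at the first matching line.
    """
    with open(path, "r") as f:
        for line in f:
            if phrase in line:
                # Same cleanup as in the question: remove "&nbsp;" and
                # keep only the text before the first period.
                return line.replace("&nbsp;", "").split(".")[0]
    return None  # no line in this file contains the phrase


def first_sentences(paths, phrase="public offering price"):
    """Collect the first match from each file, skipping files with no match."""
    found = (first_matching_sentence(p, phrase) for p in paths)
    return [s for s in found if s is not None]
```

Assuming file_list is the same list of paths as in the question, print(first_sentences(file_list)) would print one cleaned string per file that contains the phrase. For files of 100+ pages, returning early also avoids scanning the rest of the file after the first hit.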

