Extracting specific substrings in Python

Problem description

I have the following string in Python:

teststr = "First Line.............................234" \
          "1.1.0 (L1) TestLine.........................567" \
          "1.1.1 (L1) Second Line.............................587"\
          "Third Line.............................856" \
          "1.1.2 (L2) Fourth Line.............................775"\
          "1.2.7 (L1) Fifth Line.............................262" \
          "1.5.3 (L1) Sixth Line .............................346"\
          "Seventh Line..............................234"

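I only need to save the information that belongs to (L1) in a list.

I can't simply iterate over the lines and check whether each one contains L1 or something else, because sometimes the information belonging to an (L1) entry runs over more than one line (for example, Second Line and Third Line).

I have tried many ways of splitting and re-joining the string, but nothing has worked for me.

Does anyone know how I can do this?

Tags: python, string, list, split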

Solution

You can split the string on a regular expression and then iterate over the resulting pieces:

teststr = "First Line.............................234" \
          "1.1.0 (L1) TestLine.........................567" \
          "1.1.1 (L1) Second Line.............................587"\
          "Third Line.............................856" \
          "1.1.2 (L2) Fourth Line.............................775"\
          "1.2.7 (L1) Fifth Line.............................262" \
          "1.5.3 (L1) Sixth Line .............................346"\
          "Seventh Line..............................234"
import re
results = re.split(r'(\(L\d+\))',teststr)

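This splits the input on anything of the form (Ln), where n can be any number. Because the pattern is wrapped in a capturing group, the matched (Ln) markers are kept in the resulting list.

It gives a list with the following values: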
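To make the role of the capturing group concrete, here is a minimal sketch on a short, made-up string (the `sample` value is an assumption for illustration, not part of the question's data). It compares `re.split` with and without the outer parentheses:

```python
import re

# Hypothetical short input, used only to illustrate the behaviour.
sample = "a (L1) b (L2) c"

# Without a capturing group, the separators are dropped from the result.
without_group = re.split(r'\(L\d+\)', sample)

# With a capturing group, the matched separators are kept in the result.
with_group = re.split(r'(\(L\d+\))', sample)

print(without_group)  # ['a ', ' b ', ' c']
print(with_group)     # ['a ', '(L1)', ' b ', '(L2)', ' c']
```

Keeping the markers is what later lets us tell which piece follows an (L1) and which follows an (L2).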

['First Line.............................2341.1.0 ',
 '(L1)',
 ' TestLine.........................5671.1.1 ',
 '(L1)',
 ' Second Line.............................587Third Line.............................8561.1.2 ',
 '(L2)',
 ' Fourth Line.............................7751.2.7 ',
 '(L1)',
 ' Fifth Line.............................2621.5.3 ',
 '(L1)',
 ' Sixth Line .............................346Seventh Line..............................234']

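In this case we only want the values that come right after (L1), so we loop over the list pairwise (a sliding window of two items) and print a value only when the item before it is (L1):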

for x, y in zip(results, results[1:]):
  if x == '(L1)':
    print(y)
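The pairing done by `zip(results, results[1:])` may be easier to see on a small list. The following is only a sketch; the `parts` list is a hypothetical stand-in for `results`:

```python
# Hypothetical stand-in for the list produced by re.split.
parts = ['intro ', '(L1)', ' first ', '(L2)', ' second ']

# zip pairs every item with the item that follows it.
pairs = list(zip(parts, parts[1:]))
print(pairs)
# [('intro ', '(L1)'), ('(L1)', ' first '), (' first ', '(L2)'), ('(L2)', ' second ')]

# Keep the second element of each pair whose first element is the marker.
selected = [nxt for cur, nxt in pairs if cur == '(L1)']
print(selected)  # [' first ']
```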

The complete code becomes:

teststr = "First Line.............................234" \
          "1.1.0 (L1) TestLine.........................567" \
          "1.1.1 (L1) Second Line.............................587"\
          "Third Line.............................856" \
          "1.1.2 (L2) Fourth Line.............................775"\
          "1.2.7 (L1) Fifth Line.............................262" \
          "1.5.3 (L1) Sixth Line .............................346"\
          "Seventh Line..............................234"
import re
results = re.split(r'(\(L\d+\))',teststr)

for x, y in zip(results, results[1:]):
  if x == '(L1)':
    print(y)

This gives:

 TestLine.........................5671.1.1 
 Second Line.............................587Third Line.............................8561.1.2 
 Fifth Line.............................2621.5.3 
 Sixth Line .............................346Seventh Line..............................234
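Since the question asks for the values in a list rather than printed, one possible variation is to collect them with a list comprehension. This is a sketch, not part of the original answer: the name `l1_values` and the call to `strip()` (to drop the surrounding spaces) are assumptions.

```python
import re

teststr = ("First Line.............................234"
           "1.1.0 (L1) TestLine.........................567"
           "1.1.1 (L1) Second Line.............................587"
           "Third Line.............................856"
           "1.1.2 (L2) Fourth Line.............................775"
           "1.2.7 (L1) Fifth Line.............................262"
           "1.5.3 (L1) Sixth Line .............................346"
           "Seventh Line..............................234")

results = re.split(r'(\(L\d+\))', teststr)

# Collect every piece that directly follows an '(L1)' marker.
# strip() is an assumption: it removes the leading/trailing spaces.
l1_values = [y.strip() for x, y in zip(results, results[1:]) if x == '(L1)']

for value in l1_values:
    print(value)
```

Note that, as in the output above, each value still ends with the section number of the following entry (for example `1.1.1`), because the input string has no separator between entries. Removing that would need an extra step that depends on how the real data is laid out.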
