Searching for the number after a keyword in a long string

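Problem description

I am searching a very long string for several keywords and want to extract the number listed after each of them.

My code below works if the keyword stands alone and is not joined to another word, as it is here (temperature/melting). The second problem is the ° symbol directly after the number.

I need a general solution, because different words can always be attached or linked to the keyword.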

import re

keywords = ["temperature", "humidity", "pressure"]
txt = "Lorem Impsum temperature/melting point: 370° C don't ...."
# it works if the string is as follows
#txt = "Lorem Impsum temperature /melting point: 370 ° C don't ...."

for l in range(len(keywords)):
    ResSearch = re.search(keywords[l], txt)
    if ResSearch != None:
        list_txt = txt.split()
        print(f"{ResSearch} \n{list_txt}")
        try:
            j = 0
            while True:
                    try:
                        j += 1
                        next_number = int(list_txt[list_txt.index(keywords[l]) + j])
                        print(f"Number after Keyword: {keywords[l]} is {next_number}")
                        break
                    except Exception as r:
                        print(r)

        except Exception as e:
            print(e)

Tags: python, regex

Solution

This is a classic use case for regular expressions:

import re

keywords = ["temperature", "humidity", "pressure"]
txt = "Lorem Impsum temperature/melting point: 370° C don't ...."

for keyword in keywords:
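    # Raw f-string keeps \d intact; re.escape protects keywords containing regex metacharacters.
    match = re.search(rf"{re.escape(keyword)}.*?(\d+)", txt)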
    if match:
        print(f"Number after Keyword: {keyword} is {int(match.group(1))}")

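.*? lazily matches whatever follows the keyword (here "/melting point: "), consuming as few characters as possible, and \d+ matches one or more digits (0 to 9). Because the pattern does not rely on whitespace, it does not matter that "temperature" is joined to "/melting" or that "°" comes directly after the number.

This prints: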

Number after Keyword: temperature is 370
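
If the values can be decimals or negative, or if the results should be collected rather than printed, the same idea can be wrapped in a small helper. The following is only a sketch: the function name numbers_after_keywords, the NUMBER pattern, the extended sample string, and the choice to match case-insensitively and across line breaks are assumptions for illustration, not part of the original answer.

```python
import re

# Assumed number format: optional sign, digits, optional decimal part (e.g. "370", "-12.5").
NUMBER = r"[-+]?\d+(?:\.\d+)?"

def numbers_after_keywords(text, keywords):
    """Return {keyword: first number found after it} for the keywords present in text."""
    found = {}
    for keyword in keywords:
        # re.escape guards against keywords that contain regex metacharacters.
        pattern = re.escape(keyword) + r".*?(" + NUMBER + r")"
        # DOTALL lets .*? cross line breaks; IGNORECASE also matches "Temperature".
        match = re.search(pattern, text, flags=re.IGNORECASE | re.DOTALL)
        if match:
            found[keyword] = float(match.group(1))
    return found

# Hypothetical sample text, extended from the question's string.
txt = "Lorem Impsum temperature/melting point: 370° C, humidity approx. 45.5 % ..."
print(numbers_after_keywords(txt, ["temperature", "humidity", "pressure"]))
```

Note one limitation of this approach: if a keyword appears without a number of its own, .*? keeps scanning and will pick up the next number in the text, even one that belongs to a later keyword. Limiting the gap, for example with .{0,40}? instead of .*?, may help, but the right bound depends on the data.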
