Grouping strings in a list using regular expressions

Problem description

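I'm trying to use regular expressions to group items by similarity, so that many items collapse into a few patterns instead of being listed one by one. It doesn't work as expected and the output is wrong. My current output and the expected output are shown below.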

Small example: 'k1', 'k2', 'k3', 'k4' -> 'k(1|2|3|4)'

Actual code:

import re

loc_list = [
    'phone100-500-cas-ras9-f51-s10-k2',
    'phone100-500-cas-ras9-f52-s10-k2',
    'phone100-500-cas-ras9-f50-s10-k2',
    'phone100-500-cas-ras9-f50-s9-k3',
    'phone100-500-cas-ras9-f50-s9-k1',
    'Telephone100-500-cas-ras9-f50-s9-k2']

split_loc_list = [phone.split("-") for phone in loc_list]

locs = {}

for loc in split_loc_list:
    locs.setdefault("-".join(loc[0:4]), {}).\
                        setdefault("f", set()).add(loc[4].strip("f"))
    locs.setdefault("-".join(loc[0:4]), {}).\
                        setdefault("s", set()).add(loc[5].strip("s"))
    locs.setdefault("-".join(loc[0:4]), {}).\
                        setdefault("k", set()).add(loc[6].strip("k"))
prove = []
for loc, vals in locs.items():
    f_vals_sorted = sorted(list(map(int, vals["f"])))
    f_vals_joined = "|".join(map(str, f_vals_sorted))
    if "|" in f_vals_joined:
        f_vals_joined = f"({f_vals_joined})"
    s_vals_sorted = sorted(list(map(int, vals["s"])))
    s_vals_joined = "|".join(map(str, s_vals_sorted))
    if "|" in s_vals_joined:
        s_vals_joined = f"({s_vals_joined})"
    k_vals_sorted = sorted(list(map(int, vals["k"])))
    k_vals_joined = "|".join(map(str, k_vals_sorted))
    if "|" in k_vals_joined:
        k_vals_joined = f"({k_vals_joined})"
    prove.append(f"{loc}-f{f_vals_joined}-s{s_vals_joined}-k{k_vals_joined}")
print("|".join(prove))

Current (incorrect) output:

phone100-500-cas-ras9-f(50|51|52)-s(9|10)-k(1|2|3)|Telephone100-500-cas-ras9-f50-s9-k2
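One way to see why this output is wrong: the code collects the f, s and k values independently for each prefix, so the resulting pattern also matches combinations that never appear in `loc_list`. A quick check, where `not_in_list` is a string chosen here purely for illustration:

```python
import re

# The pattern printed by the code above.
current_output = (
    "phone100-500-cas-ras9-f(50|51|52)-s(9|10)-k(1|2|3)"
    "|Telephone100-500-cas-ras9-f50-s9-k2"
)

# Illustrative string: f51 only ever occurs with s10 and k2 in loc_list,
# so this exact combination is not one of the original items.
not_in_list = "phone100-500-cas-ras9-f51-s9-k1"

# The pattern still accepts it, which is the over-matching problem.
print(bool(re.fullmatch(current_output, not_in_list)))  # True
```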

Expected output:

Telephone100-500-cas-ras9-f50-s9-k2|phone100-500-cas-ras9-f50-s9-k(1|3)|phone100-500-cas-ras9-f(50|51|52)-s10-k2

Tags: python, regex, python-3.x, grouping

Solution
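The page records no answer, so what follows is only a sketch of one possible approach, not an accepted solution. It assumes that two entries may be merged only when they differ in a single field (f, s or k) and agree on everything else. Under that assumption, each entry is stored as a prefix plus three sets of values, and the fields are merged one at a time: first k, then s, then f. The helper names `merge_field`, `fmt` and `group_locations` are made up for this example, and the merge order and the final sort key were chosen to reproduce the expected output above; other orders can give different, equally valid groupings.

```python
from collections import defaultdict


def merge_field(rows, idx):
    """Merge rows that are identical everywhere except at position idx."""
    grouped = defaultdict(set)
    for row in rows:
        key = row[:idx] + row[idx + 1:]  # everything except the merged field
        grouped[key] |= row[idx]
    # Put the merged value set back at position idx.
    return [key[:idx] + (frozenset(vals),) + key[idx:]
            for key, vals in grouped.items()]


def fmt(letter, values):
    """Render one field, e.g. 'k2' or 'k(1|3)'."""
    ordered = sorted(values)
    body = "|".join(map(str, ordered))
    return f"{letter}({body})" if len(ordered) > 1 else f"{letter}{body}"


def group_locations(loc_list):
    rows = []
    for loc in loc_list:
        # Assumes every entry ends in '-f<int>-s<int>-k<int>'.
        *head, f, s, k = loc.split("-")
        rows.append(("-".join(head),
                     frozenset({int(f[1:])}),
                     frozenset({int(s[1:])}),
                     frozenset({int(k[1:])})))
    # Merge k first, then s, then f (positions 3, 2, 1 in each row).
    for idx in (3, 2, 1):
        rows = merge_field(rows, idx)
    rows.sort(key=lambda r: (r[0], sorted(r[1]), sorted(r[2]), sorted(r[3])))
    return "|".join(f"{p}-{fmt('f', f)}-{fmt('s', s)}-{fmt('k', k)}"
                    for p, f, s, k in rows)


loc_list = [
    'phone100-500-cas-ras9-f51-s10-k2',
    'phone100-500-cas-ras9-f52-s10-k2',
    'phone100-500-cas-ras9-f50-s10-k2',
    'phone100-500-cas-ras9-f50-s9-k3',
    'phone100-500-cas-ras9-f50-s9-k1',
    'Telephone100-500-cas-ras9-f50-s9-k2']

print(group_locations(loc_list))
```

Because rows are merged only when all their other fields are equal, each resulting pattern matches exactly the original strings it was built from and nothing more. The result is not guaranteed to be the smallest possible set of patterns.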

