Remove substrings from a list of strings, preserving order - Python

Problem description

I have a list of lists of strings.

words = [['gamma_ray_bursts','merger','death','throes','magnetic_flares','neutrino_antineutrino','objections','bursts','double_neutron_star','parker_instability','positrons'],
 ['dot','gravitational_lensing','splittings','limits','amplifications','time_delays','extracting_information','fix','distant_quasars'],
 ['recoil','gamma_ray_bursts','neutron_stars','jennings','possible_origins','birthplaces','disjoint','arrival_directions'],
 ['sn_sn','type_ii_supernovae','distances','dilution','extinction','extragalactic_distance_scale','expanding_photosphere','distance','photospheres','supernovae_sn','span_wide_range'],
 ['photon_pair','high_energy','gamma_ray_burst','optical_depth','absorbing_medium','implications','problem','annihilation_radiation','emergent_spectrum','limit','radiation_transfer','collimation','regions']]
  1. I want to remove any element of a sublist if it is a substring of another element in the same sublist.
  2. I want to preserve the order.

I tried this loop:

for string_list in words:
    for item in string_list: 
        for item1 in string_list:
            if item in item1 and item!= item1:
                string_list.remove(item)

It seems to work for smaller lists of lists, but it raises an error when I increase the length of the list.

ValueError                                Traceback (most recent call last)
<ipython-input-91-7546f608171f> in <module>
      4         for item1 in string_list:
      5             if item in item1 and item!= item1:
----> 6                 string_list.remove(item)

ValueError: list.remove(x): x not in list
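
A minimal sketch of why this traceback appears: after `item` has been removed, the inner loop keeps running, so if `item` is a substring of a second element, `remove` is called again for a value that is no longer in the list. The shortened `sample` list (three strings taken from the fourth sublist above) and the wrapper function `remove_in_place` are illustrative names, not part of the original post.

```python
# Three strings taken from the fourth sublist in the question.
sample = ['distances', 'extragalactic_distance_scale', 'distance']

def remove_in_place(string_list):
    # Same logic as the loop in the question, wrapped in a function for illustration.
    for item in string_list:
        for item1 in string_list:
            if item in item1 and item != item1:
                # 'distance' matches 'distances' first and is removed here;
                # the inner loop then reaches 'extragalactic_distance_scale',
                # matches again, and the second remove() fails.
                string_list.remove(item)

try:
    remove_in_place(list(sample))  # work on a copy so sample stays intact
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```

Adding a `break` after the `remove` call would avoid the second call, but the outer loop would still be iterating over a list that is being modified and could skip elements, so this sketch only illustrates the cause and is not a fix.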

Expected output:

words = [['gamma_ray_bursts','merger','death','throes','magnetic_flares','neutrino_antineutrino','objections','double_neutron_star','parker_instability','positrons'], ['dot','gravitational_lensing','splittings','limits','amplifications','time_delays','extracting_information','fix','distant_quasars'],['recoil','gamma_ray_bursts','neutron_stars','jennings','possible_origins','birthplaces','disjoint','arrival_directions'], ['sn_sn','type_ii_supernovae','distances','dilution','extinction','extragalactic_distance_scale','expanding_photosphere','photospheres','supernovae_sn','span_wide_range'],['photon_pair','high_energy','gamma_ray_burst','optical_depth','absorbing_medium','implications','problem','annihilation_radiation','emergent_spectrum','limit','radiation_transfer','collimation','regions']]

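I searched the forum and found a very similar question. Its solution sometimes works and sometimes raises an error, and the position at which the error occurs is not consistent. The lists vary in length. The similar question: Python - Remove any element from a list of strings that is a substring of another element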

Tags: python-3.x

Solution


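Rather than removing elements from the list while iterating over it, why not build a new list that meets your requirements? That is safer.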

# method to filter out substrings
def substr_in_list(elem, lst):
  for s in lst:
    if elem != s and elem in s:
      return True
  return False

words = [[j for j in i if not substr_in_list(j, i)] for i in words]

Output:

[['gamma_ray_bursts', 'merger', 'death', 'throes', 'magnetic_flares', 'neutrino_antineutrino', 'objections', 'double_neutron_star', 'parker_instability', 'positrons'], ['dot', 'gravitational_lensing', 'splittings', 'limits', 'amplifications', 'time_delays', 'extracting_information', 'fix', 'distant_quasars'], ['recoil', 'gamma_ray_bursts', 'neutron_stars', 'jennings', 'possible_origins', 'birthplaces', 'disjoint', 'arrival_directions'], ['sn_sn', 'type_ii_supernovae', 'distances', 'dilution', 'extinction', 'extragalactic_distance_scale', 'expanding_photosphere', 'photospheres', 'supernovae_sn', 'span_wide_range'], ['photon_pair', 'high_energy', 'gamma_ray_burst', 'optical_depth', 'absorbing_medium', 'implications', 'problem', 'annihilation_radiation', 'emergent_spectrum', 'limit', 'radiation_transfer', 'collimation', 'regions']]
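
The same filter can also be written more compactly with the built-in `any()`. This is a sketch of an equivalent formulation, not a different algorithm; the function name `drop_substrings` and the shortened `sample` data are made up for illustration.

```python
def drop_substrings(strings):
    # Keep s unless it is a substring of some other, different element.
    # Building a new list leaves the input untouched and preserves order.
    return [
        s for s in strings
        if not any(s != other and s in other for other in strings)
    ]

# Shortened sample built from strings in the question.
sample = [
    ['gamma_ray_bursts', 'merger', 'bursts'],
    ['distances', 'extragalactic_distance_scale', 'distance'],
]

filtered = [drop_substrings(sub) for sub in sample]
print(filtered)
# [['gamma_ray_bursts', 'merger'], ['distances', 'extragalactic_distance_scale']]
```

As in the original helper, identical strings are not treated as substrings of each other, so exact duplicates are kept. Each element is compared against every other element of its sublist, so the cost grows quadratically with sublist length.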
