How to determine if two elements from a list appear consecutively in a string? Python

Problem description

I am trying to solve a problem that can be modelled most simply as follows.

I have a large collection of letter sequences. The letters come from two lists: (1) member list (2) non-member list. The sequences are of different compositions and lengths (e.g. AQFG, CCPFAKXZ, HBODCSL, etc.). My goal is to insert the number '1' into these sequences when any 'member' is followed by any two 'non-members':

Rule 1: Insert '1' after the first member letter that is followed
by 2 or more non-member letters.
Rule 2: Insert not more than one '1' per sequence.

The 'Members': A, B, C, D
'Non-members': E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U, V, W, X, Y, Z 

In other words, as soon as a member letter is followed by 2 non-member letters, insert a '1'. Only one '1' is inserted per sequence. Examples of what I am trying to achieve:

AQFG        --->   A1QFG
CCPFAKXZ    --->   CC1PFAKXZ
BDDCCA      --->   BDDCCA1
HBODCSL     --->   HBODC1SL
ABFCC       --->   ABFCC
ACKTBB      --->   AC1KTBB # there is no '1' to be inserted after BB 

I assume the code will be something like this:

members = ['A','B','C','D']
non_members = ['E','F','G','H','I','J','K','L','M','N',
'O','P','Q','R','S','T','U','V','W','X','Y','Z']
strings = ['AQFG', 'CCPFAKXZ', 'BDDCCA', 'HBODCSL', 'ABFCC']

for i in members:
    if i in strings:
        if member is followed by 2 non-members: # Struggling here
            i.insert(index_member, '1')            
        return i
return ''

EDIT

I have found that one solution could be to generate a list of all permutations of two 'non-member' items using itertools.permutations(non_members, 2), and then test for their presence in the string.

But is there a more elegant solution for this problem?

Tags: python-3.x, insert

Solution


Generating all permutations is going to explode the number of things you are checking. You need to change how you are iterating, to something like:

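# member / non-member letters as listed in the question
members = set('ABCD')
non_members = set('EFGHIJKLMNOPQRSTUVWXYZ')
s = 'AQFG'
out = ""
look = 2  # how many non-members must follow a member
for i in range(len(s) - look):
    out += s[i]
    if (s[i] in members
            and s[i+1] in non_members
            and s[i+2] in non_members):
        # insert '1' after the member, then copy the rest unchanged
        out += '1' + s[i+1:]
        break
else:
    # no match: copy the trailing characters the loop never reached
    out += s[max(len(s) - look, 0):]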

This way you only need to go through the target string once, and you don't need to generate permutations. The same approach can also be extended to look ahead more than two characters, which the permutation approach cannot do cheaply.
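
As a rough sketch of how that look-ahead might be generalised, the loop can be wrapped in a function that takes the look-ahead length as a parameter. The names insert_marker, insert_marker_regex, MEMBERS, NON_MEMBERS and marker are made up for illustration and are not from the question. The regex variant is a different technique from the loop above (a lookahead pattern) and assumes the sequences contain only the letters A-Z as split in the question.

```python
import re

# Assumption: letter sets taken from the question's member / non-member lists.
MEMBERS = set("ABCD")
NON_MEMBERS = set("EFGHIJKLMNOPQRSTUVWXYZ")


def insert_marker(s, look=2, marker="1"):
    """Insert `marker` after the first member followed by `look` non-members."""
    for i in range(len(s) - look):
        following = s[i + 1:i + 1 + look]
        if s[i] in MEMBERS and all(ch in NON_MEMBERS for ch in following):
            # Only the first match is marked, so return immediately.
            return s[:i + 1] + marker + s[i + 1:]
    # No member is followed by `look` non-members: leave the string as is.
    return s


def insert_marker_regex(s, look=2, marker="1"):
    """Same rule as a lookahead pattern (hypothetical alternative, A-Z only)."""
    pattern = r"([A-D])(?=[E-Z]{%d})" % look
    # count=1 limits the substitution to the first match.
    return re.sub(pattern, lambda m: m.group(1) + marker, s, count=1)


if __name__ == "__main__":
    for seq in ["AQFG", "CCPFAKXZ", "HBODCSL", "ABFCC", "ACKTBB"]:
        print(seq, "--->", insert_marker(seq))
```

Note that under the two rules as stated, BDDCCA contains no member followed by two non-members, so both functions return it unchanged, whereas the question's example list shows BDDCCA1. If a trailing '1' is wanted in that case, it would need an extra rule.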

