How do I split a string by a substring that has no whitespace, while preserving the original whitespace?

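Question

I am looking for a way to split a string that contains whitespace (spaces, \n, \t) by a target phrase from which the whitespace has been removed. It should be possible to split both before and after the target phrase. I also have to preserve the original string and its whitespace.

Because the target phrase may occur more than once, I only want to split at the first occurrence and take the characters before it, and split at the last occurrence and take the characters after it.

For example: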

str = 'This is a test string for my test string example only.'
target_phrase = 'teststring'

Expected output:

('This is a', 'test string for my test string example only.') # Split at the first occurrence of the target phrase, keeping the characters before it
('This is a test string for my test string', 'example only.') # Split at the last occurrence of the target phrase, keeping the characters after it

Any hints gratefully received.

Tags: python, regex

Solution


Would this be acceptable? (It does not bother to handle the case where the target phrase is not found.)

# Splits text at the first occurrence of targ, ignoring whitespace in text.
# targ itself must contain no whitespace. Returns a tuple of the two
# substrings produced by the split; raises ValueError if targ is not found.
def my_split(text, targ):
    idx = ''.join(ch for ch in text if not ch.isspace()).index(targ)

    # Next, in the original string that still has its whitespace,
    # we walk forward counting non-whitespace characters until
    # idx of them have been passed. At that point pos is the
    # split point in the original string.
    pos = 0
    non_space = 0
    while non_space < idx:
        if not text[pos].isspace():
            non_space += 1
        pos += 1
    # Whitespace separating the two halves is dropped from the second half.
    return (text[:pos], text[pos:].lstrip())
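
The loop in my_split is, in effect, translating an index in the whitespace-free string back to an index in the original. A minimal sketch that makes this mapping explicit is shown here. The name split_around and its return shape are illustrative assumptions, not part of the answer above. It treats every character for which isspace() is true as whitespace (so \n and \t are covered), returns None when the target is absent, and produces both requested splits in one call.

```python
def split_around(text, target):
    """Return ((before_first, from_first), (through_last, after_last)),
    or None if target is empty or does not occur.

    Hypothetical helper: target is assumed to contain no whitespace.
    """
    if not target:
        return None

    # Original-string index of every non-whitespace character.
    positions = [i for i, ch in enumerate(text) if not ch.isspace()]
    squeezed = ''.join(text[i] for i in positions)

    first = squeezed.find(target)
    if first == -1:
        return None
    last = squeezed.rfind(target)

    # Map indices in the squeezed string back to the original string.
    start = positions[first]                     # first char of first occurrence
    end = positions[last + len(target) - 1] + 1  # one past last char of last occurrence

    return ((text[:start].rstrip(), text[start:]),
            (text[:end], text[end:].lstrip()))
```

Because rfind locates the last occurrence directly, this variant needs no string reversal. The whitespace that separates the two halves of each split is dropped; everything else in the original string is left untouched.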

Usage:

print (my_split(str, target_phrase))
print (tuple(s[::-1] for s in my_split(str[::-1], target_phrase[::-1]))[::-1])

Output:

('This is a', 'test string for my test string example only.')
('This is a test string for my test string', 'example only.')
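
Since the question is also tagged regex, here is a hedged sketch of a regular-expression alternative. It builds a pattern that allows any run of whitespace between consecutive characters of the target. The function names are assumptions made for illustration, and the target is assumed to be non-empty and free of whitespace. A lookahead is used when searching for the last occurrence so that overlapping matches are not missed.

```python
import re


def _spaced(target):
    # Escape each character and allow optional whitespace between them.
    return r'\s*'.join(re.escape(ch) for ch in target)


def split_before_first(text, target):
    """Split at the first occurrence; return None if target is absent."""
    if not target:
        return None
    match = re.search(_spaced(target), text)
    if match is None:
        return None
    return (text[:match.start()].rstrip(), text[match.start():])


def split_after_last(text, target):
    """Split after the last occurrence; return None if target is absent."""
    if not target:
        return None
    # A zero-width lookahead lets finditer report overlapping occurrences.
    pattern = re.compile('(?=(' + _spaced(target) + '))')
    end = None
    for match in pattern.finditer(text):
        end = match.end(1)
    if end is None:
        return None
    return (text[:end], text[end:].lstrip())
```

Called with the example string and 'teststring', split_before_first and split_after_last return the same two tuples as the output above. Unlike my_split, both return None rather than raising when the target phrase is not found.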
