Strange .join() behavior

Problem description

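For some reason, the " ".join() in my code seems to add extra spaces where there shouldn't be any. Sorry if this is a very newbie question, but I can't figure it out, even though I can usually work out things like this.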

The problematic code (just a snippet, but the rest is not relevant):

def translate(stringinput):
    if all(c in string.printable for c in stringinput):
        output = ""
        sent_detector = nltk.data.load('tokenizers/punkt/english.pickle')
        sentences = sent_detector.tokenize(stringinput.strip())
        for sentence in sentences:
            sentence = shuffle(sentence)
            output = output + " " + sentence_translate(sentence)
        print(output.split())
        " ".join(output.split())
        return output.strip()
    else:
        print("Input does not entirely consist of ASCII Characters. Offending characters were:")
        print([c for c in stringinput if c not in string.printable])

stringinput = "Ulysses, Ulysses - Soaring through all the galaxies. In search of Earth, flying in to the night. Ulysses, Ulysses - Fighting evil and tyranny, with all his power, and with all of his might. Ulysses - no-one else can do the things you do. Ulysses - like a bolt of thunder from the blue. Ulysses - always fighting all the evil forces bringing peace and justice to all."
print(translate(stringinput))
writer(folder / "final.json", dict)

The problematic lines are:

print(output.split())
" ".join(output.split())
return output.strip()  # which is then printed out via print(translate(stringinput))

The printed output of these two is:

['kwmuo', 'kwmuo', 'jhhdd', 'zzazyayb', 'ptictte', 'igbo', 'tkaty', 'puiq.', 'xpaiuc', 'ftucqtze', 'ossjjh', 'ywwuh', 'rpauuqqz', 'fddu', 'pfhqys', 'igbo', 'kwmuo', 'qpousq,', 'zaapyuwq,', 'zqaoys,', 'histje', 'kwmuo', 'uzzaa', 'ptictte', 'eczt', 'rkmwy', 'uzzaa,', 'zaapyuwq,', 'ptictte,', 'xpaiuc,', 'eczt,', 'rssjj', 'kwmuo', 'hydymw', 'mfusq', 'gotsejz', 'igbo', 'mkpwhu', 'mkpwhu', 'os', 'gooss', 'teezc', 'kwmuo', 'dyyww', 'gtokb.', 'xpaiuc', 'cxxppu,', 'uqqzzan', 'igbo', 'gooss', 'kwmuo', 'hdyyyy', 'itfe.', 'uqqlos', 'ptictte', 'igbo', 'zqaoys', 'ywwhuyq', 'zaapyuwq', 'hdyyyy', 'osgjhhy', 'ptictte', 'rpauuqqz']

kwmuo kwmuo jhhdd zzazyayb ptictte igbo tkaty  puiq. xpaiuc ftucqtze ossjjh ywwuh rpauuqqz fddu pfhqys igbo  kwmuo qpousq, zaapyuwq, zqaoys, histje kwmuo uzzaa ptictte eczt rkmwy uzzaa, zaapyuwq, ptictte, xpaiuc, eczt, rssjj  kwmuo hydymw mfusq gotsejz igbo mkpwhu mkpwhu os gooss  teezc kwmuo dyyww gtokb. xpaiuc cxxppu, uqqzzan igbo gooss  kwmuo hdyyyy itfe. uqqlos ptictte igbo zqaoys ywwhuyq zaapyuwq hdyyyy osgjhhy ptictte rpauuqqz

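For example, look between tkaty and puiq.: neither of their list entries has any leading or trailing whitespace, so why does the joined version clearly have two spaces between them? This happens sporadically throughout the output, with no obvious pattern. It is reproducible; I have run the code several times with exactly the same result.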

Any ideas?

Tags: python, python-3.x, string

Solution


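You have to assign the result of the join method to something; it does not work in place. Python strings are immutable, so join returns a new string and leaves output unchanged, doubled spaces included: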

print(output.split())
" ".join(output.split())
return output.strip()

should be:

print(output.split())
output = " ".join(output.split())
return output.strip()
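
To see the difference in isolation, here is a minimal, self-contained sketch. The sample string is made up for illustration (it is not the asker's real data), and normalize_spaces is a hypothetical helper name. It suggests why the printed list looked clean while the returned string still had doubled spaces: split() discards the extra whitespace, but a join whose result is thrown away never touches output.

```python
# Illustrative string with a doubled space, standing in for the asker's `output`.
output = " tkaty  puiq. xpaiuc "

# split() with no arguments splits on any run of whitespace, so the
# resulting list never shows the doubled space.
words = output.split()
print(words)  # ['tkaty', 'puiq.', 'xpaiuc']

# Calling join without using the result changes nothing: strings are immutable.
" ".join(words)
print(repr(output.strip()))  # 'tkaty  puiq. xpaiuc' (still two spaces)


def normalize_spaces(text):
    """Collapse every run of whitespace into one space (hypothetical helper)."""
    return " ".join(text.split())


# Using the returned value gives the single-spaced string.
cleaned = normalize_spaces(output)
print(repr(cleaned))  # 'tkaty puiq. xpaiuc'
```

Because join(split()) already removes leading and trailing whitespace, the later strip() call becomes redundant once the result is assigned, though it does no harm.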
