python - Replacing [[Words]] with other [[Words]] from a reference file in Notepad++ using Javascript
Problem description
I have a translation file that looks like this:
Apple=Apfel
Apple pie=Apfelkuchen
Banana=Banane
Bananaisland=Bananen Insel
Cherry=Kirsche
Train=Zug
...500+ more lines like these
I also have a file whose text needs processing. Only certain parts of the text should be replaced, for example:
The [[Apple]] was next to the [[Banana]]. Meanwhile the [[Cherry]] was chilling by the [[Train]].
The [[Apple pie]] tastes great on the [[Bananaisland]].
The result needs to be:
The [[Apfel]] was next to the [[Banane]]. Meanwhile the [[Kirsche]] was chilling by the [[Zug]].
The [[Apfelkuchen]] tastes great on the [[Bananen Insel]].
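There are far too many occurrences to copy and paste by hand. What is a simple way to search for [[XXX]] and replace it using another file, as described above?
I spent many hours looking for help with this, to no avail. The closest I got is this script: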
import re
separators = "=", "\n"
def custom_split(sepr_list, str_to_split):
    # create regular expression dynamically
    regular_exp = '|'.join(map(re.escape, sepr_list))
    return re.split(regular_exp, str_to_split)
with open('D:/_working/paired-search-replace.txt') as f:
    for l in f:
        s = custom_split(separators, l)
        editor.replace(s[0], s[1])
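However, this replaces too much, or replaces inconsistently. For example, [[Apple]] is correctly replaced with [[Apfel]], but [[File:Apple.png]] is wrongly replaced with [[File:Apfel.png]], and [[Apple pie]] becomes [[Apfel pie]]. I spent hours tweaking the regular expression without success. Can anyone explain, in very simple terms, how I can fix this and achieve my goal?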
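Solution
This is a bit tricky because [ is a metacharacter in regular expressions, so the brackets have to be escaped to match them literally.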
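To illustrate the difference, here is a minimal sketch comparing a bare-word replacement with a pattern anchored on the escaped brackets. The `sample` string is made up for this illustration, using the link names from the question.

```python
import re

# Hypothetical sample containing the three cases from the question.
sample = "[[Apple]] [[File:Apple.png]] [[Apple pie]]"

# Bare-word replacement touches every occurrence of "Apple".
naive = sample.replace("Apple", "Apfel")

# Escaping the term and wrapping it in literal [[ and ]] matches only the exact link.
pattern = r"\[\[" + re.escape("Apple") + r"\]\]"
anchored = re.sub(pattern, "[[Apfel]]", sample)

print(naive)     # [[Apfel]] [[File:Apfel.png]] [[Apfel pie]]
print(anchored)  # [[Apfel]] [[File:Apple.png]] [[Apple pie]]
```

Because the closing ]] is part of the pattern, [[Apple pie]] is left alone until its own line in the translation file is processed.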
I'm sure there is a more efficient way to do this, but this works:
replaces="""Apple=Apfel
Apple pie=Apfelkuchen
Banana=Banane
Bananaisland=Bananen Insel
Cherry=Kirsche
Train=Zug"""
text = """
The [[Apple]] was next to the [[Banana]]. Meanwhile the [[Cherry]] was chilling by the [[Train]].
The [[Apple pie]] tastes great on the [[Bananaisland]].
"""
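if __name__ == '__main__':
    import re
    for replace in replaces.split('\n'):
        # split on the first '=' only, so a translation may itself contain '='
        english, german = replace.split('=', 1)
        # re.escape keeps any regex metacharacters in the term literal
        text = re.sub(rf'\[\[{re.escape(english)}\]\]', f'[[{german}]]', text)
    print(text)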
Output:
The [[Apfel]] was next to the [[Banane]]. Meanwhile the [[Kirsche]] was chilling by the [[Zug]].
The [[Apfelkuchen]] tastes great on the [[Bananen Insel]].
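One possible way to avoid running a separate substitution for each of the 500+ lines is a single pass that looks up every [[...]] link in a dictionary. The following is only a sketch of that idea, not the method used above; the names `load_pairs`, `translate_links` and `LINK` are hypothetical helpers introduced for this illustration.

```python
import re

def load_pairs(lines):
    """Build an {english: german} dict from 'key=value' lines."""
    pairs = {}
    for line in lines:
        line = line.rstrip("\n")
        if "=" not in line:
            continue  # skip blank or malformed lines
        english, german = line.split("=", 1)
        pairs[english] = german
    return pairs

# One pattern for any [[...]] link; the inner text is captured as group 1.
LINK = re.compile(r"\[\[([^\[\]]+)\]\]")

def translate_links(text, pairs):
    """Replace each [[term]] that has a translation; leave other links untouched."""
    def swap(match):
        term = match.group(1)
        return "[[" + pairs.get(term, term) + "]]"
    return LINK.sub(swap, text)

# Usage with a few lines in the same format as the translation file.
pairs = load_pairs(["Apple=Apfel\n", "Apple pie=Apfelkuchen\n", "Banana=Banane\n"])
result = translate_links("The [[Apple pie]] and [[File:Apple.png]] and [[Banana]].", pairs)
print(result)  # The [[Apfelkuchen]] and [[File:Apple.png]] and [[Banane]].
```

Passing a function as the replacement also means backslashes in a translation are inserted literally. To read the real file, `load_pairs(open(path, encoding="utf-8"))` should work, where the encoding is an assumption. Inside Notepad++'s PythonScript plugin the text would presumably be fetched and written back with `editor.getText()` and `editor.setText()`, which is worth checking against the plugin documentation.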