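python - How do I tell Googletrans to ignore certain parts?
Problem description
I want to use googletrans, a Python client for the Google Translate API. However, some of my strings contain variable names: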
User "%(first_name)s %(last_name)s (%(email)s)" has been deleted.
If I run it through googletrans like this:
from googletrans import Translator
translator = Translator()
translator.translate(u'User "%(first_name)s %(last_name)s (%(email)s)" has been assigned.', src='en', dest='fr').text
I get the following:
L'utilisateur "% (first_name) s% (last_name) s (% (email) s)" a été affecté.
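However, spaces have been introduced into the placeholders, giving "% (first_name) s% (last_name) s (% (email) s)", so they are no longer valid format specifiers. Is there a way around this? I have already tried: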
u'User "<span class="notranslate">%(first_name)s %(last_name)s (%(email)s)</span>" has been assigned.'
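Solution
Googletrans does not appear to alter tokens such as __1__. You can therefore replace %(first_name)s with __0__, %(last_name)s with __1__, and so on before translating, then restore the variables afterwards. Here is code that does this: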
from googletrans import Translator
import re
translator = Translator()
txtorig = u'User "%(first_name)s %(last_name)s (%(email)s)" has been assigned.'
# temporarily replace variables of format "%(example_name)s" with "__n__" to
# protect them during translate()
VAR, REPL = re.compile(r'%\(\w+\)s'), re.compile(r'__(\d+)__')
varlist = []
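def replace(matchobj):
    # remember the original variable and return its numbered marker
    varlist.append(matchobj.group())
    return "__%d__" % (len(varlist) - 1)
def restore(matchobj):
    # look up the original variable by the number inside the marker
    return varlist[int(matchobj.group(1))]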
txtorig = VAR.sub(replace, txtorig)
txttrans = translator.translate(txtorig, src='en', dest='fr').text
txttrans = REPL.sub(restore, txttrans)
print(txttrans)
The result is:
L'utilisateur "%(first_name)s %(last_name)s (%(email)s)" a été attribué.
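If you need this in more than one place, the same mask-and-restore idea can be wrapped in a function. The following is only a sketch under some assumptions: `translate_protected` and `fake_translate` are hypothetical names introduced here, the translator is passed in as a plain callable so the example runs without network access, and the marker pattern tolerates padding such as `__ 0 __` in case a translator inserts spaces (that case is a guess, not something observed above).

```python
import re

# Matches named printf-style placeholders such as %(first_name)s.
VAR = re.compile(r'%\(\w+\)s')
# Matches the temporary markers. Optional whitespace is tolerated in case
# the translator pads them (an assumption, not observed behaviour).
MARK = re.compile(r'__\s*(\d+)\s*__')


def translate_protected(text, translate_fn):
    """Mask placeholders, call translate_fn, then restore the placeholders."""
    saved = []

    def mask(match):
        # Store the placeholder and emit a numbered marker in its place.
        saved.append(match.group())
        return "__%d__" % (len(saved) - 1)

    def unmask(match):
        index = int(match.group(1))
        # Leave unknown markers untouched rather than raising IndexError.
        return saved[index] if index < len(saved) else match.group()

    masked = VAR.sub(mask, text)
    translated = translate_fn(masked)
    return MARK.sub(unmask, translated)


def fake_translate(s):
    # Hypothetical stand-in for a real translator so the sketch runs offline.
    return (s.replace("User", "L'utilisateur")
             .replace("has been assigned", "a été attribué"))


print(translate_protected(
    'User "%(first_name)s %(last_name)s (%(email)s)" has been assigned.',
    fake_translate))
```

With googletrans, the callable would presumably be something like `lambda s: translator.translate(s, src='en', dest='fr').text`, reusing the `translator` object from the answer above. Keeping the list of saved placeholders local to the function avoids the module-level `varlist`, so repeated calls do not interfere with each other.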