pytest can't handle Unicode in doctest in README under Python 2.7

Problem description

I have a README.rst file containing several doctests for my Python library. They all work, except for the last doctest, which prints Unicode output, encoded in UTF-8:

Here is a failing example::

    >>> print(u'\xE5\xE9\xEE\xF8\xFC')
    åéîøü

(The use of print rather than just a string is very important to my actual use-case, as the real string contains embedded newlines and I need to show off how things on different lines are aligned.)

Running pytest README.rst works successfully with Python 3.6.5 and pytest 3.6.1, but under Python 2.7.10, it fails with a very long traceback that ends with:

input = 'åéîøü
', errors = 'strict'

    def decode(input, errors='strict'):
>       return codecs.utf_8_decode(input, errors, True)
E       UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-4: ordinal not in range(128)

/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/encodings/utf_8.py:16: UnicodeEncodeError

Setting setenv = LC_ALL=en_US.UTF-8 in tox.ini and running under tox changes nothing; neither does adding doctest_encoding = utf-8 to the [pytest] section of tox.ini. I see no doctest options relevant to my plight. How do I get the test to run successfully under Python 2.7?

Update: The bug responsible for this problem has been fixed in pytest 3.6.2.

Tags: python, python-2.7, unicode, pytest, doctest

Solution


Yes, print is the culprit. The most interesting part of the exception is:

def getvalue(self):
    result = _SpoofOut.getvalue(self)
    if encoding:
        result = result.decode(encoding)

local/lib/python2.7/site-packages/_pytest/doctest.py:509:

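pytest tries to decode a value that is already unicode, so Python 2 first tries to encode it implicitly with the default ASCII codec, and that step fails. I think this is a bug in pytest: it should check whether result is bytes or unicode: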

    if encoding and isinstance(result, bytes):
        result = result.decode(encoding)

Please report this to the pytest issue tracker. You can test the fix and, if it works, send a pull request.
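
To see why a decode call ends in a UnicodeEncodeError, here is a minimal sketch that runs on both Python 2.7 and Python 3. It performs the implicit ASCII encode explicitly, then applies the same isinstance guard as the proposed fix. The helper name decode_if_bytes is hypothetical and is not part of pytest; it only mirrors the guarded check above.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function

# The already-decoded doctest output from the question, u'åéîøü'.
text = u'\xe5\xe9\xee\xf8\xfc'

# On Python 2, calling .decode() on a unicode object first encodes it with
# the default ASCII codec. Doing that step explicitly raises the same error
# on both Python 2 and Python 3.
try:
    text.encode('ascii')
except UnicodeEncodeError as exc:
    implicit_step_error = str(exc)
    print(implicit_step_error)


def decode_if_bytes(result, encoding):
    """Hypothetical helper: decode byte strings only, pass text through."""
    if encoding and isinstance(result, bytes):
        return result.decode(encoding)
    return result


# Byte strings are decoded; text that is already unicode is left untouched.
assert decode_if_bytes(text.encode('utf-8'), 'utf-8') == text
assert decode_if_bytes(text, 'utf-8') == text
```

On Python 2, bytes is an alias for str, so the isinstance check is False for unicode objects and the failing decode is skipped.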

