首页 > 解决方案 > 使用 Python 在 Visual Studio Code 中显示日语字符

问题描述

标签: pythonpython-3.xbeautifulsouputf-8

解决方案


It's called Percent-encoding:

Percent-encoding, also known as URL encoding, is a method to encode arbitrary data in a Uniform Resource Identifier (URI) using only the limited US-ASCII characters legal within a URI.

Apply the unquote method from urllib.parse module:

urllib.parse.unquote(string, encoding='utf-8', errors='replace')

Replace %xx escapes by their single-character equivalent. The optional encoding and errors parameters specify how to decode percent-encoded sequences into Unicode characters, as accepted by the bytes.decode() method.

string must be a str. Changed in version 3.9: string parameter supports bytes and str objects (previously only str).

encoding defaults to 'utf-8'. errors defaults to 'replace', meaning invalid sequences are replaced by a placeholder character.

Example:

from urllib.parse import unquote
encodedUrl = 'JapaneseChars%E3%81%82or%E3%81%B3'
decodedUrl = unquote( encodedUrl )
print( decodedUrl )
JapaneseCharsあorび

One can apply the unquote method to almost any string, even if already decoded:

print( unquote(decodedUrl) )
JapaneseCharsあorび

推荐阅读