Why does piping my command through | % {echo "$_"} make UTF-8 work?

Question

In Windows PowerShell, I ran chcp 65001 and selected a font that contains all the characters I want.

If I display a UTF-8 file with type file.u8, it works fine and I get the desired characters.

If I run myprogram.exe, there is no output after the first non-ASCII character (and if I run it before chcp 65001, it produces mojibake).

If I run myprogram.exe > test.u8 and then type test.u8, I get the desired output.

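So I reasoned (with my limited PowerShell knowledge!) that I could bypass the file with myprogram.exe | % {echo "$_"}, and it works. It therefore seems that the C++ runtime does something special when it talks directly to the console, and that this breaks UTF-8 output.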

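(If I use wide characters I can get the desired output, but I don't actually want UTF-16 output; I want UTF-8. I just want the convenience of printing debug information without extra character conversions.)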

Tags: windows, powershell, unicode, utf-8

Answer


In a comment exchange with @eryksun I realized I had overlooked an experiment: All of my attempts to use wide characters had been successful. So what if type and echo are actually capable of reading UTF-8 and outputting wide characters? So I redirected to a file:

myprogram.exe | % {echo "$_"} > test.txt

Inspecting that text file, Notepad++ detects it as "UCS-2 LE BOM". In fact, every case that worked (type, all redirection into files, etc.) produced wide (UTF-16) characters rather than UTF-8. Even type foo.u8 > foo.txt shows the expected increase in size.

So the real issue is not my program (which is successfully outputting UTF-8); it is that several things along the way are capable of silently transforming that output into something Windows likes.

