How does Perl's length() function count Unicode characters?

Question

Why does length() say this is 4 logical characters? (I would expect it to say 1.)

$ perl -lwe 'print length("")'
4

I guess something is wrong with my expectation. :-) What is it?

Tags: perl

Solution


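Unless you tell Perl that the script's source code is UTF-8, Perl assumes a single-byte encoding (ASCII). This means that by default the interpreter sees the 4 bytes of the UTF-8 encoded character as 4 separate characters. If you change the one-liner to perl -Mutf8 -lwe 'print length("")', you will see that length gives the output you expected.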

The utf8 pragma tells Perl that the source unit is UTF-8 rather than ASCII. See perldoc utf8 for more information.

