Carriage return character not being matched in Swift

Problem description

I'm trying to parse a file that (apparently) ends its lines with carriage returns, but they aren't being matched as such in Swift, despite having the same UTF-8 value. I can see possible fixes for the problem, but I'm curious as to what these characters actually are.

Here's some sample code, with the output below. (CR is set using Character("\r"), although I've tried it using "\r" as well.)

try f.forEach() { c in
    print(c, terminator:" ") // DBG
    if (c == "\r") {
        print("Carriage return found!")
    }
    print(String(c).utf8.first!, terminator:" ") // DBG
    print(String(describing:pstate)) // DBG
    ...
    case .field:
        switch c {
        case CR,LF :
            self.endline()
            pstate = .eol

When it reaches the end of line (which shows up as such in my text editors), I get this:

. 46 field
0 48 field

 13 field
I 73 field

It doesn't seem to be matching using == or in the switch statement. Is there another approach I should be using for this character?

(I'll note that the parsing works fine with files that terminate in newlines.)

Tags: swift, character-encoding

Solution


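I worked out what the issue was. By looking at c.unicodeScalars I found that the end-of-line character was actually "\r\n", not just "\r". As you can see in my code, I was only taking the first byte when printing it as UTF-8, which is why only 13 showed up. I don't know whether that comes from String.forEach or from the file itself.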
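For what it's worth, this behavior can be reproduced without the file. A Swift Character is an extended grapheme cluster, and a CR followed by an LF forms a single cluster, so the pair arrives as one Character that is not equal to "\r". The sketch below uses a made-up sample string (`text` is an assumption, not the original file's contents) and prints the scalars behind each character:

```swift
// Hypothetical sample input: one field followed by a CR LF line ending.
let text = "1.0\r\nI"

// Iterating a String yields Characters (grapheme clusters), not bytes.
let characters = Array(text)

for c in characters {
    // Collect the Unicode scalar values that make up this one Character.
    let scalars = c.unicodeScalars.map { $0.value }
    // Compare against a lone CR and against the CR LF pair.
    print(scalars, "== CR:", c == "\r", "== CRLF:", c == "\r\n")
}

// The first UTF-8 byte of the CR LF character is 13, which is why
// printing only utf8.first makes it look like a plain carriage return.
let firstByteOfLineEnd = String(characters[3]).utf8.first!
```

Here the line ending is the fourth element of `characters`; it carries two scalars (13 and 10) yet counts as a single Character.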

I know there are tests for determining whether something is a newline. Swift 5 has one built in (c.isNewline), and there is also the CharacterSet approach that Bill Nattaner pointed out.
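A minimal sketch of both checks follows. The helper name `isNewlineByCharacterSet` is made up for illustration, and this is only one way the CharacterSet approach could be written; it may differ from what was originally suggested. CharacterSet works on Unicode scalars, so the helper tests every scalar in the Character:

```swift
import Foundation

let cr: Character = "\r"
let lf: Character = "\n"
let crlf: Character = "\r\n"

// Swift 5: Character.isNewline covers CR, LF, and the CR LF pair.
let builtInResults = [cr.isNewline, lf.isNewline, crlf.isNewline]

// Hypothetical helper: treat a Character as a newline when every scalar
// in it belongs to Foundation's CharacterSet.newlines.
func isNewlineByCharacterSet(_ c: Character) -> Bool {
    return c.unicodeScalars.allSatisfy { CharacterSet.newlines.contains($0) }
}

print(builtInResults)
print(isNewlineByCharacterSet(crlf), isNewlineByCharacterSet("a"))
```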

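I'm happier with something I can use in my switch statement (so I'll define each line ending explicitly), but that may change if I want to handle a wider variety of files.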
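As a rough sketch of the explicit approach, the switch can list a third constant for the combined line ending. `CRLF`, `ParseState`, and `classify` are names assumed here for illustration; the real parser's state enum and endline() handling are not shown in the question:

```swift
// Explicit line-ending constants; CRLF is one Character made of two scalars.
let CR: Character = "\r"
let LF: Character = "\n"
let CRLF: Character = "\r\n"

// Hypothetical stand-in for the parser's state type.
enum ParseState {
    case field
    case eol
}

// Decide the next state for a single character read while inside a field.
func classify(_ c: Character) -> ParseState {
    switch c {
    case CR, LF, CRLF:
        // Any of the three line endings terminates the current line.
        return .eol
    default:
        return .field
    }
}

// Example: walk a made-up two-line input and record the state per character.
let states = "a\r\nb\n".map { classify($0) }
```

Listing the constants keeps the set of accepted line endings visible at the match site, at the cost of missing rarer separators that c.isNewline would catch.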

