IO.pipe(Encoding::BINARY, Encoding::BINARY) fails with UndefinedConversionError, but only under Rails

Question

I have some code that converts http.rb's block-based response body stream into a plain IO.

def stream_response_body(body)
  IO.pipe(Encoding::BINARY, Encoding::BINARY) do |rd, wr|
    t = copying_thread(body, wr)
    yield rd
  ensure
    t.join if t
  end
end

def copying_thread(body, dst)
  Thread.new do
    body.each { |chunk| dst.write(chunk) }
  rescue StandardError => e
    UCBLIT::TIND.logger.error(e)
  ensure
    dst.close
    Thread.exit
  end
end

When I call it from a command-line script it works fine, but when I call it from a Rails controller it blows up at dst.write(chunk):

  Encoding::UndefinedConversionError ("\xE5" from ASCII-8BIT to UTF-8):
    /Users/david/.rvm/gems/ruby-2.7.2/bundler/gems/ucblit-tind-de599ab253cc/lib/ucblit/tind/api/api.rb:106:in `write'
    /Users/david/.rvm/gems/ruby-2.7.2/bundler/gems/ucblit-tind-de599ab253cc/lib/ucblit/tind/api/api.rb:106:in `block (2 levels) in copying_thread'

(Both the script and the Rails application are running under Ruby 2.7.2 on macOS Catalina.)

I've stripped the reading code down to byte-by-byte reads, to make sure the problem isn't caused by some downstream library:

response = HTTP.get(url, encoding: Encoding::BINARY)
status = response.status
raise(HTTP::ResponseError, status.to_s) unless status.success?

xml_str_io = StringIO.new
xml_str_io.set_encoding(Encoding::BINARY)

stream_response_body(response.body) do |body|
  while (b = body.read(1))
    xml_str_io.putc(b)
  end
end

Why (and where!) is the ASCII-8BIT to UTF-8 conversion happening? And why only when called from Rails?


Update:

I tried the following modifications, neither of which worked:

  1. Packing a byte array instead of writing the raw string

    body.each do |chunk|
      byteStr = chunk.bytes.pack('C*')
      dst.write(byteStr)
    end
    
  2. Using putc instead of write

       body.each do |chunk|
         chunk.bytes.each do |b|
           dst.putc(b)
         end
       end
    

Interestingly, in the second case I still see write in the backtrace:

  Encoding::UndefinedConversionError ("\xE5" from ASCII-8BIT to UTF-8):
    /Users/david/.rvm/gems/ruby-2.7.2/bundler/gems/ucblit-tind-de599ab253cc/lib/ucblit/tind/api/api.rb:108:in `write'
    /Users/david/.rvm/gems/ruby-2.7.2/bundler/gems/ucblit-tind-de599ab253cc/lib/ucblit/tind/api/api.rb:108:in `putc'
    /Users/david/.rvm/gems/ruby-2.7.2/bundler/gems/ucblit-tind-de599ab253cc/lib/ucblit/tind/api/api.rb:108:in `block (3 levels) in copying_thread'

I assume the failing write (and possibly whatever else is failing) is somewhere in the C code of IO?

Tags: ruby-on-rails, ruby, encoding, io

Answer


Rails sets the default encodings to UTF-8:

Encoding.default_external = Encoding::UTF_8
Encoding.default_internal = Encoding::UTF_8

https://github.com/rails/rails/blob/291a3d2ef29a3842d1156ada7526f4ee60dd2b59/railties/lib/rails.rb#L22-L23
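
To check how these defaults affect a pipe, a small sketch like the following prints the external encoding of each end, before and after calling set_encoding on the write end. The helper name pipe_encodings is hypothetical, and the two assignments at the top are an assumption standing in for what Rails does at boot. The "before" values may differ between Ruby versions, so no output is shown.

```ruby
# Assumption: these two assignments stand in for the Rails boot code linked above.
Encoding.default_external = Encoding::UTF_8
Encoding.default_internal = Encoding::UTF_8

# Hypothetical helper: report the external encoding of each end of a pipe.
def pipe_encodings(rd, wr)
  { read: rd.external_encoding, write: wr.external_encoding }
end

rd, wr = IO.pipe(Encoding::BINARY, Encoding::BINARY)

# Encodings as IO.pipe set them up.
before = pipe_encodings(rd, wr)

# Explicitly mark the write end as binary, then look again.
wr.set_encoding(Encoding::BINARY)
after = pipe_encodings(rd, wr)

puts "before: #{before.inspect}"
puts "after:  #{after.inspect}"

rd.close
wr.close
```

If the write end reports anything other than ASCII-8BIT in the "before" line, binary chunks written to it are candidates for transcoding, which would be consistent with the error in the question.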

I believe you need to set the encoding on the write end of the pipe explicitly; otherwise it will use the default encoding.

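read_io, write_io = IO.pipe(Encoding::BINARY, Encoding::BINARY, binmode: true)
# The encoding arguments above are documented as tagging strings read from the pipe;
# set the write end explicitly.
write_io.set_encoding(Encoding::BINARY)

# serialized_object stands for any binary String; the length prefix is an assumed intent.
# IO#write takes no encoding option, so none is passed.
write_io.write([serialized_object.bytesize, serialized_object].pack('NA*'))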
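
Applied to the code in the question, the change might look like the sketch below. It is a minimal illustration and has not been run against the asker's application: the default-encoding assignments emulate Rails, fake_body is a hypothetical stand-in for response.body (anything that yields binary chunks from each), and the logger call from the question is left out so that errors surface through join.

```ruby
# Assumption: emulate the defaults Rails sets at boot.
Encoding.default_external = Encoding::UTF_8
Encoding.default_internal = Encoding::UTF_8

# Copy every chunk of body into dst on a background thread, closing dst when done.
def copying_thread(body, dst)
  Thread.new do
    body.each { |chunk| dst.write(chunk) }
  ensure
    dst.close
  end
end

def stream_response_body(body)
  IO.pipe(Encoding::BINARY, Encoding::BINARY) do |rd, wr|
    # The one-line change: mark the write end as binary as well.
    wr.set_encoding(Encoding::BINARY)
    t = copying_thread(body, wr)
    yield rd
  ensure
    # join re-raises any exception from the copying thread.
    t.join if t
  end
end

# Hypothetical stand-in for response.body: binary chunks, including the \xE5 byte.
fake_body = ["\xE5\x00\xFF".b, "caf\xC3\xA9".b]

received = stream_response_body(fake_body) { |rd| rd.read }
puts received.bytes.inspect
puts received.encoding
```

Calling wr.binmode instead of wr.set_encoding(Encoding::BINARY) would be another option to try, since binmode also switches the stream to ASCII-8BIT.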
