How does numpy.loadtxt() handle text encoding? (And how does it relate to the error below?)

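Problem description

I am trying to load a text file with numpy.loadtxt. This is something I have done many times before without problems. However, after generating a new set of text files to import, something about the encoding must be different, because I get an error when running the following code: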

import numpy as np 

asdf = np.loadtxt('data/asdf.txt', skiprows=28, max_rows=720, usecols=range(1,722))

The error message I get is:

---------------------------------------------------------------------------
UnicodeDecodeError                        Traceback (most recent call last)
/Users/iangullett/Desktop/coadd/coadd.py in <module>()
     61 
     62 
---> 63 test = np.loadtxt('data/asdf.txt')
     64 
     65 

/Users/iangullett/opt/anaconda2/lib/python2.7/site-packages/numpy/lib/npyio.pyc in loadtxt(fname, dtype, comments, delimiter, converters, skiprows, usecols, unpack, ndmin, encoding, max_rows)
   1091         try:
   1092             while not first_vals:
-> 1093                 first_line = next(fh)
   1094                 first_vals = split_line(first_line)
   1095         except StopIteration:

/Users/iangullett/opt/anaconda2/lib/python2.7/codecs.pyc in decode(self, input, final)
    312         # decode input (taking the buffer into account)
    313         data = self.buffer + input
--> 314         (result, consumed) = self._buffer_decode(data, self.errors, final)
    315         # keep undecoded input until the next call
    316         self.buffer = data[consumed:]

UnicodeDecodeError: 'utf8' codec can't decode byte 0xff in position 0: invalid start byte

For reference, here is the beginning of the text file I am trying to read (the actual file is very large):

Detector Viewer Listing

File : C:\file_path_hidden
Title: 
Date : 10/16/2019


Detector 6, NSCG Surface 1: 
Max polar angle: 90.00 deg, Total Hits = 224724030

Peak Intensity  : 3.957E+005 Watts/Steradian
Total Power     : 9.915E-001 Watts
Data Type       : Radiant Intensity
Maximum Angle   : 90.0000
Detector X      : 0.0000
Detector Y      : 0.0000
Detector Z      : 0.0000
Detector Tilt X : 0.0000
Detector Tilt Y : 180.0000
Detector Tilt Z : 0.0000
Units           : Watts/Steradian

Radial Pixels   : 721, increment 0.1250 degrees
Azimuthal Pixels: 720, increment 0.5000 degrees
Columns are radial angles, rows are azimuthal angles.

Power Values:
                 1           2           3           4           5           6           7           8           

Any help would be greatly appreciated.

Tags: python, numpy

Solution


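Since version 1.14.0, np.loadtxt accepts an encoding parameter that lets you set the encoding manually. A first byte of 0xFF suggests something like UTF-16, whose little-endian byte order mark is 0xFF 0xFE. Rather than guessing, though, it is better to determine the actual encoding by checking the program that created the file.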
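
As a rough sketch of how this could look in practice: the helper below reads the first few bytes of the file, compares them against the known byte order marks, and passes the resulting codec name to np.loadtxt through the encoding parameter. The names guess_encoding_from_bom and load_power_values are hypothetical and not part of NumPy. The sketch assumes Python 3 with NumPy 1.14.0 or later and a file that actually starts with a byte order mark; without one it falls back to utf-8, which may well be wrong. The skiprows, max_rows and usecols values are simply copied from the question.

```python
import codecs

# Byte order marks, longest first: the UTF-32 LE mark (FF FE 00 00) begins
# with the UTF-16 LE mark (FF FE), so it has to be tested before it.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def guess_encoding_from_bom(path, default="utf-8"):
    """Return a codec name based on the file's byte order mark, if any.

    Hypothetical helper: this only recognises a BOM, it does not
    analyse the rest of the file.
    """
    with open(path, "rb") as fh:
        head = fh.read(4)
    for bom, name in _BOMS:
        if head.startswith(bom):
            return name
    return default  # assumption: no BOM means the default codec applies


def load_power_values(path):
    """Load the numeric block using the encoding guessed from the BOM."""
    # Imported here so the BOM helper stays usable without NumPy installed.
    import numpy as np

    return np.loadtxt(
        path,
        encoding=guess_encoding_from_bom(path),
        skiprows=28,             # values taken from the question
        max_rows=720,
        usecols=range(1, 722),
    )
```

If the guess comes back as utf-16 for 'data/asdf.txt', that would be consistent with the 0xFF byte in the traceback, but confirming it against the exporting program's settings is still the safer route.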

