Why does a pandas DataFrame return 2 columns when only 1 column is selected?

Problem description

While creating some plots with matplotlib, I ran into strange pandas behavior: when I select only 1 column, it appears to return 2.

import pandas as pd
import io

data = io.StringIO("""time_0,1,time_1,2,time_2,0,time_3,3
-0.002,-0.1225,-0.002,-0.0904,-0.002,0.0331,-0.002,0.,
0.0,-0.1225,0.,-0.0904,0.,0.0331,0.,0.,
0.002,-0.1224,0.002,-0.0904,0.002,0.0331,0.002,0.,
0.004,-0.1225,0.004,-0.0904,0.004,0.0331,0.004,0.,""")

df = pd.read_csv(data)
print(df["time_0"])

Output:

-0.002 -0.1225
0.000 -0.1225
0.002 -0.1224
0.004 -0.1225
Name: time_0, dtype: float64

It shows values from both the "time_0" and "1" columns, although only "time_0" was selected. Is this a bug or a feature?

Tags: python, pandas

Solution


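Your selection returns only one column, but a Series is always printed together with its index. Here the index holds the time values from the first field of each row, and "time_0" holds the values that were meant for column "1". Printing the whole DataFrame makes the shift visible: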

df
Out[3]: 
        time_0      1  time_1      2  time_2      0  time_3   3
-0.002 -0.1225 -0.002 -0.0904 -0.002  0.0331 -0.002     0.0 NaN
 0.000 -0.1225  0.000 -0.0904  0.000  0.0331  0.000     0.0 NaN
 0.002 -0.1224  0.002 -0.0904  0.002  0.0331  0.002     0.0 NaN
 0.004 -0.1225  0.004 -0.0904  0.004  0.0331  0.004     0.0 NaN

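It looks like pandas unintentionally took the first column as the index. Because of the extra "," at the end of each row, every data row has nine fields while the header names only eight, so the first field becomes the index and the empty trailing field fills the last column with NaN.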
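A quick way to confirm this diagnosis is to inspect the index and the last column directly. The sketch below reuses the data from the question; the names `raw` and `df_bad` are only illustrative and do not appear in the original post.

```python
import io

import pandas as pd

# Same data as in the question: 8 header names, but 9 fields per data row,
# because the trailing comma adds an empty ninth field.
raw = """time_0,1,time_1,2,time_2,0,time_3,3
-0.002,-0.1225,-0.002,-0.0904,-0.002,0.0331,-0.002,0.,
0.0,-0.1225,0.,-0.0904,0.,0.0331,0.,0.,
0.002,-0.1224,0.002,-0.0904,0.002,0.0331,0.002,0.,
0.004,-0.1225,0.004,-0.0904,0.004,0.0331,0.004,0.,"""

df_bad = pd.read_csv(io.StringIO(raw))

# The first field of each row was consumed as the index,
# so only 8 columns remain for the 8 header names.
print(df_bad.shape)

# The index holds the time values, not 0, 1, 2, 3.
print(df_bad.index.tolist())

# The last column received the empty trailing field, so it is entirely NaN.
print(df_bad["3"].isna().all())
```

If the index is not a plain 0..n-1 range and the last column is all NaN, the header and the data rows most likely disagree on the number of fields.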

Try removing the trailing "," from each row:

import pandas as pd
import io

data = io.StringIO("""time_0,1,time_1,2,time_2,0,time_3,3
-0.002,-0.1225,-0.002,-0.0904,-0.002,0.0331,-0.002,0.
0.0,-0.1225,0.,-0.0904,0.,0.0331,0.,0.
0.002,-0.1224,0.002,-0.0904,0.002,0.0331,0.002,0.
0.004,-0.1225,0.004,-0.0904,0.004,0.0331,0.004,0.""")

df = pd.read_csv(data)
print(df["time_0"])

This code prints:

0   -0.002
1    0.000
2    0.002
3    0.004
Name: time_0, dtype: float64
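If editing the file is not practical, one possible alternative is to leave the trailing commas in place and pass `index_col=False` to `pd.read_csv`, which tells pandas not to use the first column as the index. This is a minimal sketch rather than part of the original answer; `raw` and `df_fixed` are illustrative names.

```python
import io

import pandas as pd

# The original data from the question, trailing commas included.
raw = """time_0,1,time_1,2,time_2,0,time_3,3
-0.002,-0.1225,-0.002,-0.0904,-0.002,0.0331,-0.002,0.,
0.0,-0.1225,0.,-0.0904,0.,0.0331,0.,0.,
0.002,-0.1224,0.002,-0.0904,0.002,0.0331,0.002,0.,
0.004,-0.1225,0.004,-0.0904,0.004,0.0331,0.004,0.,"""

# index_col=False keeps the first field as a regular column, so the header
# names line up with the data and the empty trailing field is discarded.
df_fixed = pd.read_csv(io.StringIO(raw), index_col=False)

# "time_0" now holds the time values and the index is the default 0..3.
print(df_fixed["time_0"])
```

Cleaning the file is the more robust fix when you control how it is produced; `index_col=False` is mainly useful when the trailing delimiter comes from an export you cannot change.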
