Pandas DataFrame puts all data columns into a single column

Problem description

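I'm trying to load some data into a pandas DataFrame, but the .txt file is a bit odd. The first few headers are wrapped in quotes and the rest are not. When I read the file into a DataFrame, all the data and column names end up in the first column, separated by "\t", which I believe means a tab in Python. Why is it being read this way?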

Here are a few lines of data copied from the .txt file:

"Notes" "Cancer Sites"  "Cancer Sites Code" Mortality-Incidence Age-Adjusted Rate Ratio Death Counts    Mortality Population    Mortality Age-Adjusted Rate Incidence Counts    Incidence Population    Incidence Age-Adjusted Rate
    "All Cancer Sites Combined" "0" 0.385   176256  96127579    181.476 469603  96127579    470.919
    "Oral Cavity and Pharynx"   "20010-20100"   0.242   2521    96127579    2.527   10717   96127579    10.437
    "Lip"   "20010" 0.046   16  96127579    0.016   352 96127579    0.358

Here is my code so far (FYI, it does the same thing whether or not I use headers):

df = pd.read_fwf("United States and Puerto Rico Cancer Statistics.txt", headers = None)

When I print df, I get this as the header:

"Notes" "Cancer Sites"  "Cancer Sites Code" Mortality-Incidence Age-Adjusted Rate Ratio Death Counts    Mortality   Population  Mortality   Age-Adjusted    Rate    Incidence   Counts  Incidence.1 Population.1    Incidence.2 Age-Adjusted.1  Rate.1

And these are the first two rows of data in df:

0   "All Cancer Sites Combined"\t"0"\t0.385\t17625...   NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
1   "Oral Cavity and Pharynx"\t"20010-20100"\t0.24...   NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN

Tags: python, pandas, dataframe

Solution
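The likely cause is that pd.read_fwf parses fixed-width files: it guesses column boundaries from runs of whitespace and does not treat "\t" as a field delimiter, so each tab-separated row ends up inside one column. Note also that the pandas keyword is header, not headers. Since the file appears to be tab-delimited, pd.read_csv with sep="\t" is probably the better fit. The sketch below is a minimal illustration, not a definitive fix: it assumes the real file uses a single tab between fields, and it uses an in-memory sample (the name sample and the trimmed set of five columns are stand-ins for the real file).

```python
import io

import pandas as pd

# Stand-in for the .txt file. Assumption: fields are separated by a single tab,
# and data rows start with an empty "Notes" field (hence the leading tab).
sample = (
    '"Notes"\t"Cancer Sites"\t"Cancer Sites Code"\t'
    'Mortality-Incidence Age-Adjusted Rate Ratio\tDeath Counts\n'
    '\t"All Cancer Sites Combined"\t"0"\t0.385\t176256\n'
    '\t"Oral Cavity and Pharynx"\t"20010-20100"\t0.242\t2521\n'
    '\t"Lip"\t"20010"\t0.046\t16\n'
)

# read_csv splits on the tab delimiter and strips the double quotes around fields.
# dtype keeps the site codes as strings, since values like "20010-20100" are not numbers.
df = pd.read_csv(
    io.StringIO(sample),
    sep="\t",
    dtype={"Cancer Sites Code": str},
)

print(df.columns.tolist())
print(df.head())
```

With the real file, the io.StringIO(sample) argument would be replaced by the file name, for example pd.read_csv("United States and Puerto Rico Cancer Statistics.txt", sep="\t"). The "Notes" column comes back as NaN for these rows because that field is empty in the data.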

