Pandas cannot correctly parse a comma-delimited file

Problem description

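I am trying to parse a CSV file, but for some reason pandas does not recognize the separator/delimiter. I have looked at answers to similar questions, but I still have not managed to parse my file correctly (only the header is parsed correctly).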

Each line of the file looks like this: https://drive.google.com/a/company.com/uc?export=download&id=10p-c0i2xtWBSvJ3OJV5pgEUarE1X,-1,"{""type"":""F03""}",0,0,"{}","{}"

The code I tried is as follows:

In  [0]: import pandas as pd

In  [1]: data = pd.read_csv('file.csv', sep=',')
         data.head()
Out [1]: 

    filename          file_size   file_attributes    region_count    region_id   region_shape_attributes  region_attributes
0   https://drive...        NaN               NaN             NaN          NaN                       NaN                NaN
1   https://drive...        NaN               NaN             NaN          NaN                       NaN                NaN
2   https://drive...        NaN               NaN             NaN          NaN                       NaN                NaN
3   https://drive...        NaN               NaN             NaN          NaN                       NaN                NaN
4   https://drive...        NaN               NaN             NaN          NaN                       NaN                NaN

In  [2]: data['filename'][0]
Out [2]: 

'https://drive.google.com/a/company.com/uc?export=download&id=10p-c0i2xtWBSvJ3OJV5pgEUarE1X,-1,"{""type"":""F03""}",0,0,"{}","{}"'

Tags: python-3.x, pandas, csv

Solution


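Sorry, I was not able to reproduce your problem. However, you can split the single column of your DataFrame data into the expected columns with the following code.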

cols_to_extract = [
    'filename', 'file_size', 'file_attributes', 'region_count',
    'region_id', 'region_shape_attributes', 'region_attributes']
# Each whole line ended up in 'filename'; split it on commas into separate columns.
# Caveat: a plain split also cuts on commas inside quoted fields.
df = data['filename'].str.split(',', expand=True)
df.columns = cols_to_extract
df.head()

The output should look like this:

    filename            file_size   file_attributes       region_count  region_id   region_shape_attributes  region_attributes
0   https://drive...          -1    "{""type"":""F03""}"             0          0   "{}"                     "{}"
1   https://drive...          -1    "{""type"":""F03""}"             0          0   "{}"                     "{}"
2   https://drive...          -1    "{""type"":""F03""}"             0          0   "{}"                     "{}"
3   https://drive...          -1    "{""type"":""F03""}"             0          0   "{}"                     "{}"
4   https://drive...          -1    "{""type"":""F03""}"             0          0   "{}"                     "{}"
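
Note that splitting on every comma leaves the CSV quoting in place (the values still read "{""type"":""F03""}") and would cut a quoted field in two if its JSON ever contained a comma. As a possible alternative, here is a minimal sketch that passes each raw line through Python's csv module instead. The helper name split_csv_lines and the sample line (a shortened example.com URL, plus an extra "n" key to put a comma inside a quoted field) are made up for illustration; in your case you would pass data['filename'].

```python
import csv

import pandas as pd

COLUMNS = [
    'filename', 'file_size', 'file_attributes', 'region_count',
    'region_id', 'region_shape_attributes', 'region_attributes']


def split_csv_lines(lines, columns=COLUMNS):
    """Split raw CSV lines into columns, honoring quoted fields.

    Hypothetical helper, not part of pandas.
    """
    # csv.reader accepts any iterable of strings. It keeps commas that sit
    # inside quoted fields and turns doubled quotes ("") back into single ones.
    rows = list(csv.reader(lines))
    # All values come back as strings; the index is a fresh 0..n-1 range.
    return pd.DataFrame(rows, columns=columns)


# Hypothetical stand-in for data['filename']; URL and extra key are invented.
raw = pd.Series([
    'https://example.com/a.jpg,-1,"{""type"":""F03"",""n"":1}",0,0,"{}","{}"',
])
parsed = split_csv_lines(raw)
print(parsed.loc[0, 'file_attributes'])
```

With this approach the attribute columns hold plain JSON text, so they can be handed to json.loads later if needed. Numeric columns such as file_size are still strings and would need an explicit conversion.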

I hope this helps.

