Convert a pandas DataFrame to CSV, selecting only certain columns

Question

I want to convert a pandas DataFrame to a CSV file that contains only certain columns. However, after a recent pandas update, I get this error:

KeyError: 'Passing list-likes to .loc or [] with any missing labels is no longer supported

Here is my code:

    #split into the correct columns
    split_data = df["Date,Country,City,Specie,count,min,max,median,variance"].str.split(",")
    data = split_data.to_list()
    names = ['Date', 'Country', 'City', 'Specie', 'count', 'min', 'max', 'median', 'variance']
    new_df = pd.DataFrame(data, columns=names)


    new_df.drop(['City', 'count', 'min', 'max', 'variance'], axis = 1)

    #calculating the mean
    mean_data = new_df.groupby(['Date', 'Country', 'Specie']).mean()


    clean_data = mean_data[(mean_data.T != 0).any()]

    bycountry_data = clean_data.groupby(['Date', 'Country', 'Specie']).mean()

    names = ['Date', 'Country', 'Specie', 'median']
    #convert to csv
    bycountry_data.to_csv('bycountry.csv',index=False, sep=";",columns = names)

These are the first rows of the DataFrame I want to convert to CSV:

Date       Country Specie    median    
2014-12-29 AT      co        0.10
                   no2      15.78
                   pm10     20.80
                   pm25     69.50
                   so2       2.00

(If you have any ideas for improving my code beyond this error, please don't hesitate to share them, as I am new to Python.)

Tags: python, pandas

Solution


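You can try the following. drop() returns a new DataFrame instead of modifying new_df, so its result has to be assigned back. str.split() produces strings, so median is converted to numbers before averaging. After groupby(...).mean(), Date, Country and Specie form the index rather than columns, which is most likely why to_csv(columns=names) raises the KeyError; reset_index() turns them back into ordinary columns.

    import pandas as pd

    # Split the single comma-joined column into separate columns
    # (df is the DataFrame already loaded in the question)
    names = ['Date', 'Country', 'City', 'Specie', 'count', 'min', 'max', 'median', 'variance']
    split_data = df["Date,Country,City,Specie,count,min,max,median,variance"].str.split(",")
    new_df = pd.DataFrame(split_data.to_list(), columns=names)

    # drop() returns a new DataFrame, so assign the result back
    new_df = new_df.drop(['City', 'count', 'min', 'max', 'variance'], axis=1)

    # str.split() yields strings; convert median to numbers before averaging
    new_df['median'] = pd.to_numeric(new_df['median'], errors='coerce')

    # Calculate the mean; the group keys become the index of the result
    mean_data = new_df.groupby(['Date', 'Country', 'Specie']).mean()

    # Keep only rows that have at least one non-zero value
    clean_data = mean_data[(mean_data != 0).any(axis=1)]

    # Turn the group keys back into columns so to_csv can select them
    bycountry_data = clean_data.reset_index()

    # Convert to csv
    names = ['Date', 'Country', 'Specie', 'median']
    bycountry_data.to_csv('bycountry.csv', index=False, sep=";", columns=names)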
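To see why the column selection in to_csv can fail after a groupby, here is a small self-contained sketch. The sample rows (including the city names) are made up for illustration and are not taken from the question's data; the CSV is written to an in-memory buffer rather than to bycountry.csv so the result can be inspected directly.

```python
import io

import pandas as pd

# Hypothetical sample rows in the same shape as the question's data
raw = pd.DataFrame({
    "Date": ["2014-12-29", "2014-12-29", "2014-12-29", "2014-12-29"],
    "Country": ["AT", "AT", "AT", "AT"],
    "City": ["Graz", "Linz", "Graz", "Linz"],
    "Specie": ["co", "co", "no2", "no2"],
    "median": ["0.5", "1.5", "15.0", "17.0"],  # strings, as str.split() would produce
})

# drop() is not in-place: without assignment the original frame is unchanged
raw.drop(["City"], axis=1)
still_has_city = "City" in raw.columns

# Assign the result, then make the values numeric so mean() can average them
slim = raw.drop(["City"], axis=1)
slim["median"] = pd.to_numeric(slim["median"])

# After groupby().mean(), the group keys live in the index, not in the columns
grouped = slim.groupby(["Date", "Country", "Specie"]).mean()
grouped_columns = list(grouped.columns)

# reset_index() restores the keys as columns, so to_csv can select them
flat = grouped.reset_index()
buffer = io.StringIO()
flat.to_csv(buffer, index=False, sep=";",
            columns=["Date", "Country", "Specie", "median"])
csv_text = buffer.getvalue()
print(csv_text)
```

An alternative that should give the same layout is to pass as_index=False to groupby, which keeps the group keys as columns from the start and makes the reset_index() call unnecessary.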
