Trying to delete some data in a column using Python

Problem description

I am trying to delete some of the data in my world_rank column. Here is how I went about it.

Here is a preview of the data:

     world_rank                        university_name  \
0             1                     Harvard University   
1             2     California Institute of Technology   
2             3  Massachusetts Institute of Technology   
3             4                    Stanford University   
4             5                   Princeton University   
...         ...                                    ...   
2598    601-800                    Yeungnam University   
2599    601-800            Yıldız Technical University   
2600    601-800               Yokohama City University   
2601    601-800           Yokohama National University   
2602    601-800                     Yuan Ze University   

                       country  teaching international  research  citations  \
0     United States of America      99.7          72.4      98.7       98.8   
1     United States of America      97.7          54.6      98.0       99.9   
2     United States of America      97.8          82.3      91.4       99.9   
3     United States of America      98.3          29.5      98.1       99.2   
4     United States of America      90.9          70.3      95.4       99.9   
...                        ...       ...           ...       ...        ...   
2598               South Korea      18.6          24.3      10.9       26.5   
2599                    Turkey      14.5          14.9       7.6       19.3   
2600                     Japan      24.0          16.1      10.2       36.4   
2601                     Japan      20.1          23.3      16.0       13.5   
2602                    Taiwan      16.2          17.7      18.3       28.6   

     income total_score  num_students  student_staff_ratio  \
0      34.5        96.1       20152.0                  8.9   
1      83.7          96        2243.0                  6.9   
2      87.5        95.6       11074.0                  9.0   
3      64.3        94.3       15596.0                  7.8   
4         -        94.2        7929.0                  8.4   
...     ...         ...           ...                  ...   
2598   35.4           -       21958.0                 15.3   
2599     44           -       31268.0                 28.7   
2600   37.9           -        4122.0                  3.7   
2601   40.4           -       10117.0                 12.1   
2602   39.8           -        8663.0                 20.6   

     international_students female_male_ratio  year  
0                       25%               NaN  2011  
1                       27%      33 : 67 : 00  2011  
2                       33%           37 : 63  2011  
3                       22%          42:58:00  2011  
4                       27%          45:55:00  2011  
...                     ...               ...   ...  
2598                     3%          48:52:00  2016  
2599                     2%           36 : 64  2016  
2600                     3%               NaN  2016  
2601                     8%           28 : 72  2016  
2602                     4%          43:57:00  2016  

[2603 rows x 14 columns]

Then I tried to delete the following values:

data.drop(["201-225","226-250","251-275","276-300","350-400","301-350","351-400","401-500","501-600","601-800"], inplace = True ) 

But it raises this error:

KeyError: "['201-225' '226-250' '251-275' '276-300' '350-400' '301-350' '351-400'\n '401-500' '501-600' '601-800'] not found in axis"

Could someone help me solve this problem? You can get the dataset here: https://drive.google.com/file/d/1ozF5tX-JAWyy3YQd6_MbgsrGYDmi_W5_/view?usp=sharing

P.S.: I am new to Python, and by the way, I am still a student.

Tags: python, pandas, dataframe

Solution


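By default, drop looks up the given labels in the index (the row labels), not in the values of a column, which is why the call raises a KeyError. You can set world_rank as the index first, and then it works: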

data.set_index('world_rank', inplace = True)
data.drop(["201-225","226-250","251-275","276-300","350-400","301-350","351-400","401-500","501-600","601-800"], inplace = True )

# If you want, you can reset the index to the default afterwards:
data.reset_index(inplace = True)

Note: I checked the original dataset, and it also contains two other ranges that you may want to exclude as well: ['201-250', '251-300']
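
As an alternative that avoids changing the index at all, the same rows can be filtered out with a boolean mask. This is only a minimal sketch: the small DataFrame below is made-up stand-in data rather than the real dataset, and the names ranges_to_drop and filtered are hypothetical. It assumes the ranks should be compared as strings, since the column mixes plain ranks with ranges such as "601-800".

```python
import pandas as pd

# Made-up stand-in for the real dataset (illustrative values only).
data = pd.DataFrame({
    "world_rank": ["1", "2", "201-225", "601-800", "3", "251-300"],
    "university_name": ["A", "B", "C", "D", "E", "F"],
})

# Rank ranges to exclude, including the two extra ones mentioned in the note.
ranges_to_drop = [
    "201-225", "226-250", "251-275", "276-300", "350-400", "301-350",
    "351-400", "401-500", "501-600", "601-800", "201-250", "251-300",
]

# True for every row whose world_rank is one of the unwanted ranges.
# astype(str) makes the comparison uniform even if some ranks were read as numbers.
mask = data["world_rank"].astype(str).isin(ranges_to_drop)

# Keep only the rows that are NOT in the list, and renumber the rows from 0.
filtered = data[~mask].reset_index(drop=True)
print(filtered)
```

Unlike drop, isin does not raise an error when a value in the list does not occur in the column, so the list can safely contain ranges that are absent from the data.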

