Create a new column based on the values of another column in pandas

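Problem description

In a DataFrame, I have a column with the names of different countries, and I want to create a new column containing each country's region, e.g. if the country is India, the region should be Asia, and so on. I tried using np.where, but it looks like I am doing something wrong. Here is the code I tried: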

Region = np.where(country_name == 'US' , "US", 
                 np.where(country_name == ('Brazil' or 'Canada' or 'Peru' or 'Chile') , "Rest of America", 
                 np.where(country_name == ('South Africa 'or 'Egypt' or 'Morocco' or 'Algeria' or 'Ghana'), "Africa", 
                 np.where(country_name == ('Afghanistan'or 'Armenia'or 'Azerbaijan' or 'Bahrain'or'Bangladesh'or 'Bhutan'or 
                                           'Brunei'or 'Burma'or 'Cambodia'or 'China'or 'East Timor' or
                                           'Georgia'or 'Hong Kong'or 'India' or 'Indonesia'or 'Iran' or 'Iraq'or 'Israel'or 'Japan'or
                                           'Jordan'or 'Kazakhstan'or 'Kuwait'or 'Kyrgyzstan'or 'Laos'or 
                                           'Lebanon'or 'Malaysia' or 'Mongolia'or 'Nepal'or 'North Korea'or 'Oman'or 'Pakistan'|
                                           'Papua New Guinea'or 'Philippines'or 'Qatar'or 'Russia'or 'Saudi Arabia'or 'Singapore'| 
                                           'South Korea'or 'Sri Lanka'or 'Syria'or 'Taiwan'or 'Tajikistan'or 'Thailand'or 'Turkey'or 'Turkmenistan'or
                                           'United Arab Emirates'or 'Uzbekistan'or 'Vietnam'or 'Yemen'), "Asia", 
                 np.where(country_name == ('Spain'or 'Italy' or 'Germany'or 'United Kingdom' or'France'), "Europe", "Unchange")))))

Below is the data:

     Entity        Region   Code       Date   Total confirmed deaths (deaths)   Total confirmed cases (cases)
0   Afghanistan     Asia    AFG     2019-12-31  0   0
1   Afghanistan     Asia    AFG     2020-01-01  0   0
2   Afghanistan     Asia    AFG     2020-01-02  0   0
3   Afghanistan     Asia    AFG     2020-01-03  0   0
4   Afghanistan     Asia    AFG     2020-01-04  0   0
5   Afghanistan     Asia    AFG     2020-01-05  0   0
6   Afghanistan     Asia    AFG     2020-01-06  0   0
7   Afghanistan     Asia    AFG     2020-01-07  0   0
8   Afghanistan     Asia    AFG     2020-01-08  0   0
9   Afghanistan     Asia    AFG     2020-01-09  0   0
10  Afghanistan     Asia    AFG     2020-01-10  0   0
11  Afghanistan     Asia    AFG     2020-01-11  0   0

But this code only works for the first country in each group, i.e. only for Brazil, South Africa, Afghanistan and Spain.

Tags: python, numpy

Solution
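
The attempt in the question fails because Python evaluates an expression like `('Brazil' or 'Canada' or 'Peru')` before the comparison runs: `or` returns its first truthy operand, so the whole group collapses to `'Brazil'`. The sketch below illustrates this on a small, hypothetical `country_name` Series (the sample values and the `collapsed`, `wrong` and `right` names are assumptions made for illustration, not part of the original post).

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, for illustration only
country_name = pd.Series(["Brazil", "Canada", "Peru", "India"])

# `or` returns the first truthy operand, so this is just the string 'Brazil'
collapsed = ('Brazil' or 'Canada' or 'Peru')

# Same as country_name == 'Brazil': only the first country of the group matches
wrong = np.where(country_name == collapsed, "Rest of America", "Unchanged")

# isin() tests membership against every name in the list
right = np.where(country_name.isin(['Brazil', 'Canada', 'Peru']),
                 "Rest of America", "Unchanged")
```

This is why only the first country in each group was labelled; testing membership with `isin` avoids the problem.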


import numpy as np

# Country names grouped by region (df is the DataFrame shown in the question)
list_1 = ["Iceland", "Norway", "Sweden", "Finland", "Denmark", "United Kingdom", "Ireland",
          "France", "Belgium", "Netherlands", "Luxembourg", "Monaco", "Portugal", "Spain",
          "Andorra", "Italy", "Malta", "San Marino", "Vatican City", "Germany",
          "Switzerland", "Liechtenstein", "Austria", "Poland", "Czech Republic", "Slovakia",
          "Hungary", "Slovenia", "Croatia", "Bosnia", "Herzegovina", "Serbia", "Montenegro",
          "Albania", "Macedonia", "Romania", "Bulgaria", "Greece", "Estonia", "Latvia",
          "Lithuania", "Belarus", "Ukraine", "Moldova"]
list_2 = ['Brazil', 'Canada', 'Peru', 'Chile', 'South America']
list_3 = ['Afghanistan', 'Armenia', 'Azerbaijan', 'Bahrain', 'Bangladesh', 'Bhutan',
          'Brunei', 'Burma', 'Cambodia', 'China', 'East Timor', 'Georgia', 'Hong Kong',
          'India', 'Indonesia', 'Iran', 'Iraq', 'Israel', 'Japan', 'Jordan', 'Kazakhstan',
          'Kuwait', 'Kyrgyzstan', 'Laos', 'Lebanon', 'Malaysia', 'Mongolia', 'Nepal',
          'North Korea', 'Oman', 'Pakistan', 'Papua New Guinea', 'Philippines', 'Qatar',
          'Saudi Arabia', 'Singapore', 'South Korea', 'Sri Lanka', 'Syria', 'Taiwan',
          'Tajikistan', 'Thailand', 'Turkey', 'Turkmenistan', 'United Arab Emirates',
          'Uzbekistan', 'Vietnam', 'Yemen']
list_4 = ['United States']
list_5 = ['South Africa', 'Egypt', 'Morocco', 'Algeria', 'Ghana', 'Africa']

# Conditions are checked in order; the first one that is True selects
# the choice at the same position
conditions = [
    df['Entity'].isin(list_4),
    df['Entity'].isin(list_2),
    df['Entity'].isin(list_5),
    df['Entity'].isin(list_3),
    df['Entity'].isin(list_1),
]
choices = ['US', "Rest of America", "Africa", "Asia", "Europe"]

# Rows that match none of the lists get the default label
df['Region'] = np.select(conditions, choices, default='Rest of the world')
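
As an alternative to `np.select`, the same lookup could be expressed as a country-to-region dictionary applied with `Series.map`. This is only a sketch and not part of the original answer: the sample `df`, the shortened country lists, and the `regions` and `country_to_region` names are all hypothetical.

```python
import pandas as pd

# Hypothetical sample data; in practice df would be the DataFrame from the question
df = pd.DataFrame({"Entity": ["United States", "Brazil", "Egypt",
                              "India", "Spain", "Australia"]})

# Region -> countries (lists shortened here for brevity)
regions = {
    "US": ["United States"],
    "Rest of America": ["Brazil", "Canada", "Peru", "Chile"],
    "Africa": ["South Africa", "Egypt", "Morocco", "Algeria", "Ghana"],
    "Asia": ["Afghanistan", "China", "India", "Japan"],
    "Europe": ["Spain", "Italy", "Germany", "United Kingdom", "France"],
}

# Invert to a flat country -> region lookup
country_to_region = {country: region
                     for region, countries in regions.items()
                     for country in countries}

# map() yields NaN for countries missing from the dict; fillna() supplies the default
df["Region"] = df["Entity"].map(country_to_region).fillna("Rest of the world")
```

With a dictionary, each country can belong to only one region, so the order of the conditions no longer matters.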
