Extract multiple columns from a DataFrame and return NaN for columns that do not exist

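Problem description

I am trying to extract multiple columns from a DataFrame like the one shown below. I want to select the columns I need by name, and get NaN back for any column that does not exist in the DataFrame.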

import pandas as pd

data_1 = {'host_identity_verified': ['t', 't', 't', 't', 't', 't', 't', 't', 't', 't'],
          'neighbourhood': ['q', 'q', 'q', 'q', 'q', 'q', 'q', 'q', 'q', 'q'],
          'neighbourhood_cleansed': ['Oostelijk Havengebied - Indische Buurt', 'Centrum-Oost', 'Centrum-West', 'Centrum-West', 'Centrum-West',
                                     'Oostelijk Havengebied - Indische Buurt', 'Centrum-Oost', 'Centrum-West', 'Centrum-West', 'Centrum-West'],
          'neighbourhood_group_cleansed': ['NaN', 'NaN', 'NaN', 'NaN', 'NaN', 'NaN', 'NaN', 'NaN', 'NaN', 'NaN'],
          'latitude': [52.36575, 52.36509, 52.37297, 52.38761, 52.36719, 52.36575, 52.36509, 52.37297, 52.38761, 52.36719]}

df_1 = pd.DataFrame(data_1)

I know this way of getting a single column:

x = df_1.get('neighbourhood_cleansed', pd.Series(index=df_1.index, name='neighbourhood_cleansed', dtype='object'))

However, this approach only gives me one column at a time.

I would like to do something like this:

columns_needed = ['host_identity_verified', 'neighbourhood', 'latitude', 'longitude', 'price']

# x = some code that returns the columns above, with NaN for missing columns such as 'longitude' and 'price'.

Tags: python, pandas, extract

Solution


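Use the reindex function: it extracts the columns you need and creates NaN-filled columns for any requested names that do not exist in the DataFrame: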

df_1.reindex(['host_identity_verified', 'neighbourhood', 'latitude', 'longitude', 'price'], axis=1)
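
To make the behaviour concrete, here is a minimal sketch on a smaller frame. The three-row data and the names df, result, and missing are made up for illustration and are not part of the question; the reindex call is the same as above, written with the equivalent columns= keyword instead of axis=1.

```python
import pandas as pd

# Small illustrative frame (hypothetical values, not the question's full data).
df = pd.DataFrame({
    "host_identity_verified": ["t", "t", "f"],
    "neighbourhood": ["q", "q", "q"],
    "latitude": [52.36575, 52.36509, 52.37297],
})

columns_needed = ["host_identity_verified", "neighbourhood", "latitude", "longitude", "price"]

# reindex keeps the requested columns that exist and adds an all-NaN
# column for every requested label that is not in the frame.
result = df.reindex(columns=columns_needed)

# Optional: list which requested columns were absent from the original frame.
missing = [c for c in columns_needed if c not in df.columns]

print(result)
print(missing)
```

reindex returns a new DataFrame and leaves the original unchanged, so the result has to be assigned to a variable. If a placeholder other than NaN is preferred for the missing columns, reindex also accepts a fill_value argument.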
