Creating a new column in pandas based on null values in other columns

Problem description

This is what my DataFrame looks like:

account_type  picture                                          video 
twitter       NULL                                             NULL
twitter       https://pbs.twimg.com/media/EPlqKxKUEAARR_x.jpg  NULL
twitter       https://pbs.twimg.com/media/EPlqKxKUEAARR_x.jpg  https://video.twimg.com/a
twitch        NULL                                             https://twitch.tv/
instagram     https://scontent-lga3-1.cdninstagram.com         NULL
instagram     https://video-iad3-1.xx.fbcdn.net                https://www.instagram.com/p
facebook      https://graph.facebook.com/2                     NULL
facebook      NULL                                             https://www.facebook.com/t
youtube       https://i.ytimg.com/vi                           https://www.youtube.com/w

This is what I want it to look like:

account_type  picture                                          video                          post_type
twitter       NULL                                             NULL                           text
twitter       https://pbs.twimg.com/media/EPlqKxKUEAARR_x.jpg  NULL                           picture
twitter       https://pbs.twimg.com/media/EPlqKxKUEAARR_x.jpg  https://video.twimg.com/a      video
twitch        NULL                                             https://twitch.tv/             video
instagram     https://scontent-lga3-1.cdninstagram.com         NULL                           picture
instagram     https://video-iad3-1.xx.fbcdn.net                https://www.instagram.com/p    video
facebook      https://graph.facebook.com/2                     NULL                           picture
facebook      NULL                                             https://www.facebook.com/t     video
youtube       https://i.ytimg.com/vi                           https://www.youtube.com/w      video

Basically, I am trying to classify each row as picture, video, or text.

For twitter, instagram
    > if columns 'picture' and 'video' are NULL, 'post_type' = text
    > if column 'picture' is NOT NULL and 'video' is NULL, 'post_type' = picture
    > if column 'picture' is NOT NULL and 'video' is NOT NULL, 'post_type' = video

For twitch, youtube
    > if 'video' is NOT NULL, 'post_type' = video

For facebook
    > if 'video' is NULL, 'post_type' = picture
    > if 'video' is NOT NULL, 'post_type' = video

I am trying to create this column based on null/not-null criteria. This is what I have tried:

df['newtype'] = np.where(df['picture'].isnull(), '', 'picture')
df['newtype2'] = np.where(df['video'].isnull(), '', 'video')

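But this creates two separate new columns. I want everything in a single column that follows the conditions above. Please let me know if there is a better way to do this.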

Tags: python, pandas, numpy, dataframe

Solution


You can use the df.loc[condition, 'column'] construct and write out your conditions one by one:

# twitter-instagram
df.loc[(df['account_type'].isin(['twitter', 'instagram'])) &
       df['video'].isnull() &
       df['picture'].isnull(), 'post_type'] = 'text'

df.loc[(df['account_type'].isin(['twitter', 'instagram'])) &
       df['video'].isnull() &
       ~df['picture'].isnull(), 'post_type'] = 'picture'

df.loc[(df['account_type'].isin(['twitter', 'instagram'])) &
       ~df['video'].isnull() &
       ~df['picture'].isnull(), 'post_type'] = 'video'

# twitch-youtube
df.loc[(df['account_type'].isin(['twitch', 'youtube'])) & ~df['video'].isnull(), 'post_type'] = 'video'

# facebook
df.loc[(df['account_type'] == 'facebook') & df['video'].isnull(), 'post_type'] = 'picture'
df.loc[(df['account_type'] == 'facebook') & ~df['video'].isnull(), 'post_type'] = 'video'
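
As an alternative to the repeated df.loc assignments, the same rules could be expressed in a single pass with np.select. The following is only a sketch under some assumptions: the NULL entries in the question are taken to be real missing values (None/NaN) rather than the literal string "NULL", the sample frame is rebuilt by hand from the question's table, and the fallback label "unknown" for rows that match no rule is a hypothetical choice that does not appear in the original post.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question; None stands in for NULL (an assumption).
df = pd.DataFrame({
    "account_type": ["twitter", "twitter", "twitter", "twitch", "instagram",
                     "instagram", "facebook", "facebook", "youtube"],
    "picture": [None,
                "https://pbs.twimg.com/media/EPlqKxKUEAARR_x.jpg",
                "https://pbs.twimg.com/media/EPlqKxKUEAARR_x.jpg",
                None,
                "https://scontent-lga3-1.cdninstagram.com",
                "https://video-iad3-1.xx.fbcdn.net",
                "https://graph.facebook.com/2",
                None,
                "https://i.ytimg.com/vi"],
    "video": [None,
              None,
              "https://video.twimg.com/a",
              "https://twitch.tv/",
              None,
              "https://www.instagram.com/p",
              None,
              "https://www.facebook.com/t",
              "https://www.youtube.com/w"],
})

# Reusable boolean masks keep the rule list readable.
has_picture = df["picture"].notna()
has_video = df["video"].notna()
tw_ig = df["account_type"].isin(["twitter", "instagram"])
tw_yt = df["account_type"].isin(["twitch", "youtube"])
fb = df["account_type"] == "facebook"

# Conditions are checked in order; the first match decides the label.
conditions = [
    tw_ig & ~has_picture & ~has_video,  # twitter/instagram: nothing attached
    tw_ig & has_picture & ~has_video,   # twitter/instagram: picture only
    tw_ig & has_picture & has_video,    # twitter/instagram: picture and video
    tw_yt & has_video,                  # twitch/youtube: video present
    fb & ~has_video,                    # facebook: no video
    fb & has_video,                     # facebook: video present
]
choices = ["text", "picture", "video", "video", "picture", "video"]

# "unknown" is a hypothetical fallback for rows that match none of the rules.
df["post_type"] = np.select(conditions, choices, default="unknown")
print(df[["account_type", "post_type"]])
```

One difference from the df.loc version: rows that match no rule (for example a twitch row with no video) get the fallback label here, whereas the df.loc assignments would leave them as NaN.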
