Concatenating DataFrame columns using a loop - Python

Problem description

I want to concatenate the values of several DataFrame columns using a loop.

Here is the actual DataFrame:

 Artist_1                Artist_2   Artist_3
Lady Antebellum              ?         ?
Reba McEntire                ?         ?
Wanda Jackson                ?         ?
Carrie Underwood             ?         ?
       ?                     ?         ?
The Bellamy Brothers         ?         ?
Keith Urban          Miranda Lambert   ?
Sam Hunt                     ?         ?
Johnny Cash                  ?         ?
Johnny Cash            June Carter     ?
Highwaymen                   ?         ?
Loretta Lynn                 ?         ?
Sissy Spacek                 ?         ?
Loretta Lynn         Sheryl Crow    Miranda Lambert
Charley Pride                ?         ?

And the expected result:

Artist
Lady Antebellum
Reba McEntire
Wanda Jackson
Carrie Underwood
?
The Bellamy Brothers
Keith Urban, Miranda Lambert
Sam Hunt
Johnny Cash
Johnny Cash, June Carter
Highwaymen
Loretta Lynn
Sissy Spacek
Loretta Lynn, Sheryl Crow, Miranda Lambert
Charley Pride

Tags: python, dataframe, concatenation

Solution


Here is one approach using pd.DataFrame.apply with str.join, followed by pd.Series.replace to handle rows where no name is present:

import pandas as pd

df = pd.DataFrame({'Artist_1': ['A', 'B', '?', 'D', '?', 'E'],
                   'Artist_2': ['?', '?', '?', 'G', '?', 'I'],
                   'Artist_3': ['J', '?', '?', '?', 'M', 'N']})

# Join the non-'?' values of each row; rows with no names yield '' and are reset to '?'
df['Artist_All'] = (df.apply(lambda x: ', '.join([i for i in x if i != '?']), axis=1)
                      .replace('', '?'))

print(df)

  Artist_1 Artist_2 Artist_3 Artist_All
0        A        ?        J       A, J
1        B        ?        ?          B
2        ?        ?        ?          ?
3        D        G        ?       D, G
4        ?        ?        M          M
5        E        I        N    E, I, N

Alternatively, you can use a list comprehension:

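# Select only the artist columns so an existing 'Artist_All' column is not joined in as well
cols = ['Artist_1', 'Artist_2', 'Artist_3']
df['Artist_All'] = [', '.join([i for i in x if i != '?']) for x in df[cols].values]
df['Artist_All'] = df['Artist_All'].replace('', '?')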
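
To tie this back to the question's data, the apply/str.join approach can be wrapped in a small helper that only looks at the artist columns. This is a sketch, not part of the original answer: join_artists and its parameters are hypothetical names, and it assumes every cell is a string with '?' as the placeholder for a missing name.

```python
import pandas as pd

# Hypothetical helper (not a pandas function): joins the non-placeholder
# values of the chosen columns, row by row.
def join_artists(df, cols, placeholder='?', sep=', '):
    joined = df[cols].apply(
        lambda row: sep.join(v for v in row if v != placeholder), axis=1
    )
    # Rows with no real names come out as empty strings; restore the placeholder.
    return joined.replace('', placeholder)

# A few rows taken from the question's DataFrame.
df = pd.DataFrame({
    'Artist_1': ['Lady Antebellum', '?', 'Keith Urban', 'Loretta Lynn'],
    'Artist_2': ['?', '?', 'Miranda Lambert', 'Sheryl Crow'],
    'Artist_3': ['?', '?', '?', 'Miranda Lambert'],
})

artist_cols = ['Artist_1', 'Artist_2', 'Artist_3']
df['Artist'] = join_artists(df, artist_cols)
print(df['Artist'].tolist())
```

Passing the column list explicitly means the result does not change if the DataFrame later gains other columns, such as a previously computed Artist_All.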
