Fill a column with the conditional mode of another column

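Problem description

Given the list below, I want to fill the "Color Guess" column with the mode of the "Color" column, conditional on "Type" and "Size", while ignoring NULL, #N/A, and similar missing values.

For example: what is the most common color for small cats, what is the most common color for medium dogs, and so on.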

Type  Size    Color   Color Guess
Cat   small   brown   
Dog   small   black   
Dog   large   black   
Cat   medium  white   
Cat   medium  #N/A    
Dog   large   brown   
Cat   large   white   
Cat   large   #N/A    
Dog   large   brown   
Dog   medium  #N/A    
Cat   small   #N/A    
Dog   small   white   
Dog   small   black   
Dog   small   brown   
Dog   medium  white   
Dog   medium  #N/A    
Cat   large   brown   
Dog   small   white   
Dog   large   #N/A

Tags: python, pandas

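Solution

As BarMar already pointed out in the comments, we can use pd.Series.mode from the linked answer here. The only trick is that we have to use groupby.transform, because we want the result to come back in the same shape as your DataFrame: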

df['Color Guess'] = df.groupby(['Type', 'Size'])['Color'].transform(lambda x: pd.Series.mode(x)[0])

   Type    Size  Color Color Guess
0   Cat   small  brown       brown
1   Dog   small  black       black
2   Dog   large  black       brown
3   Cat  medium  white       white
4   Cat  medium    NaN       white
5   Dog   large  brown       brown
6   Cat   large  white       brown
7   Cat   large    NaN       brown
8   Dog   large  brown       brown
9   Dog  medium    NaN       white
10  Cat   small    NaN       brown
11  Dog   small  white       black
12  Dog   small  black       black
13  Dog   small  brown       black
14  Dog  medium  white       white
15  Dog  medium    NaN       white
16  Cat   large  brown       brown
17  Dog   small  white       black
18  Dog   large    NaN       brown
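
For anyone who wants to reproduce the result above, here is a minimal self-contained sketch. It assumes the "#N/A" entries have already been read in as NaN (which is how the output above displays them). The helper name first_mode is hypothetical and not part of pandas; it only wraps Series.mode so that a group with no non-missing colors yields NaN instead of raising on the [0] lookup.

```python
import numpy as np
import pandas as pd

# Sample data from the question; "#N/A" cells are represented as NaN here
# (an assumption about how the data was loaded).
rows = [
    ("Cat", "small", "brown"), ("Dog", "small", "black"),
    ("Dog", "large", "black"), ("Cat", "medium", "white"),
    ("Cat", "medium", np.nan), ("Dog", "large", "brown"),
    ("Cat", "large", "white"), ("Cat", "large", np.nan),
    ("Dog", "large", "brown"), ("Dog", "medium", np.nan),
    ("Cat", "small", np.nan), ("Dog", "small", "white"),
    ("Dog", "small", "black"), ("Dog", "small", "brown"),
    ("Dog", "medium", "white"), ("Dog", "medium", np.nan),
    ("Cat", "large", "brown"), ("Dog", "small", "white"),
    ("Dog", "large", np.nan),
]
df = pd.DataFrame(rows, columns=["Type", "Size", "Color"])


def first_mode(values: pd.Series):
    """Hypothetical helper: first mode of a group, or NaN if the group is all missing."""
    modes = values.mode()  # drops NaN by default and returns the modes sorted
    return modes.iat[0] if not modes.empty else np.nan


# transform broadcasts each group's scalar result back onto the group's rows,
# so the new column has the same length and index as df.
df["Color Guess"] = df.groupby(["Type", "Size"])["Color"].transform(first_mode)
print(df)
```

Note that some groups are tied (small dogs have two black and two white, large cats have one white and one brown). Because Series.mode returns its values sorted, taking the first element picks the alphabetically first color in those cases, which is why the output shows black and brown for them.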
