How to award points to a row based on how its first column compares with its second column

Problem description

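I'm thinking about building an algorithm: if views_per_hour is at least 2 times average_views_per_hour, I give the channel 5 points; if it is at least 3 times as large, I give the row 10 points; and if it is at least 4 times as large, I give the row 20 points. I'm not sure how to go about this and would really appreciate some help.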

import numpy as np
import pandas as pd

df = pd.DataFrame({'channel':['channel1','channel2','channel3','channel4'], 'views_per_hour_today':[300,500,2000,100], 'average_views_per_hour':[100,200,200,50],'points': [0,0,0,0] })

df.loc[:, 'average_views_per_hour'] *= 2
df['n=2'] = np.where((df['views_per_hour'] >= df['average_views_per_hour']) , 5, 0)

df.loc[:, 'average_views_per_hour'] *= 3
df['n=3'] = np.where((df['views_per_hour'] >= df['average_views_per_hour']) , 5, 0)

df.loc[:, 'average_views_per_hour'] *= 4
df['n=4'] = np.where((df['views_per_hour'] >= df['average_views_per_hour']) , 10, 0)

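I'd like to add up the results of the n=2, n=3 and n=4 columns into the "points" column for each row, but those columns always show 5 or 10 and never 0 (the code seems to think views_per_hour is always greater than average_views_per_hour, even after average_views_per_hour has been multiplied by a large integer).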
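
Two things stand out in the snippet as posted, judging only from what is shown here. Each `*=` overwrites `average_views_per_hour` in place, so the multipliers compound (2x, then 6x, then 24x of the original average). The comparisons also reference `views_per_hour`, while the frame above names that column `views_per_hour_today`. A minimal sketch of the same cumulative idea that leaves the averages untouched, assuming `views_per_hour_today` is the intended column; the `increments` mapping is a hypothetical helper, not part of the original post:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'channel': ['channel1', 'channel2', 'channel3', 'channel4'],
    'views_per_hour_today': [300, 500, 2000, 100],
    'average_views_per_hour': [100, 200, 200, 50],
})

# Hypothetical mapping: multiplier -> points added once that multiple is reached.
# 5 + 5 + 10 accumulates to the 5 / 10 / 20 totals described in the question.
increments = {2: 5, 3: 5, 4: 10}

for n, pts in increments.items():
    # Compare against n times the average without modifying the average column.
    reached = df['views_per_hour_today'] >= n * df['average_views_per_hour']
    df[f'n={n}'] = np.where(reached, pts, 0)

# Sum the per-threshold columns into the final score for each row.
df['points'] = df[[f'n={n}' for n in increments]].sum(axis=1)
print(df[['channel', 'points']])
```

Because the thresholds are computed as `n * df['average_views_per_hour']` on the fly, the stored averages stay usable for later calculations.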

Tags: python, pandas, multiplication

Solution


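There are several ways to solve this kind of problem. You can use numpy select, which has a more concise syntax, or you can define a function and apply it to the DataFrame.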

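# Ratio of today's views to the average, computed once per row
div = df['views_per_hour_today'] / df['average_views_per_hour']
# The conditions are mutually exclusive; rows matching none of them (ratio below 2) get the default of 0
cond = [(div >= 2) & (div < 3), (div >= 3) & (div < 4), div >= 4]
choice = [5, 10, 20]
df['points'] = np.select(cond, choice, default=0)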


    channel     views_per_hour_today    average_views_per_hour  points
0   channel1    300                     100                     10
1   channel2    500                     200                     5
2   channel3    2000                    200                     20
3   channel4    100                     50                      5
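
The answer mentions defining a function and applying it to the DataFrame but only shows the numpy select version. A rough sketch of what the function-based variant could look like, using the same sample data; the function name `score` is a hypothetical choice, not something from the original answer:

```python
import pandas as pd

df = pd.DataFrame({
    'channel': ['channel1', 'channel2', 'channel3', 'channel4'],
    'views_per_hour_today': [300, 500, 2000, 100],
    'average_views_per_hour': [100, 200, 200, 50],
})

def score(row):
    """Return the points for one row from the ratio of today's views to the average."""
    ratio = row['views_per_hour_today'] / row['average_views_per_hour']
    # Check the highest threshold first so each row gets exactly one value.
    if ratio >= 4:
        return 20
    if ratio >= 3:
        return 10
    if ratio >= 2:
        return 5
    return 0

# axis=1 passes each row to score() as a Series.
df['points'] = df.apply(score, axis=1)
print(df)
```

This gives the same points as the numpy select version for the sample data. The `apply` form runs a Python function once per row, so on large frames the vectorized numpy select approach will generally be faster; the function form can be easier to read when the rules grow more complicated.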
