Python: Percentage of applications / any records analysed against their users

Problem description

Good afternoon all, I have the data frame below.

UserId  Application
    1       apple
    1       orange
    1       apple
    1       pear
    2       apple
    2       orange
    2       pear
    2       grapefruit
    3       apple
    3       grapefruit
    3       apple
    1       apple

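I am trying to create a list that gives, for each unique application, the percentage of user IDs that have it. As an example, the output should look like the table below: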

Application    Percentage
apple              100
orange             66
pear               66 
grapefruit         66

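This output tells me that, across all users, apple appears for 100% of them, orange appears for 66% of them, and so on. Somehow I cannot get it to work.

My code below runs, but it produces 3.0 as the value.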

dfsearch['Percentage'] = (len(dfsearch.Application.value_counts())/len(dfsearch.UserID.value_counts()))
dfsearch

This is probably incorrect because it is not a list, but that is why I need help :)

Tags: python, pandas, numpy, statistics, analysis

Solution

You can start by removing duplicate records with drop_duplicates, then call value_counts, divide by the number of unique users, and multiply by 100:

x = df.drop_duplicates()['Application'].value_counts() / len(df['UserId'].unique()) * 100
x

Output:

apple         100.000000
pear           66.666667
grapefruit     66.666667
orange         66.666667
Name: Application, dtype: float64
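
To reproduce this result, here is a minimal, self-contained sketch of the same calculation. The DataFrame is rebuilt by hand from the sample in the question. The names pairs, n_users and percentage are illustrative and do not come from the original answer. Passing subset to drop_duplicates is an added precaution, assuming the real data may contain columns other than these two.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "UserId": [1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 1],
    "Application": ["apple", "orange", "apple", "pear",
                    "apple", "orange", "pear", "grapefruit",
                    "apple", "grapefruit", "apple", "apple"],
})

# Keep one row per (UserId, Application) pair so that a user who
# opened the same application several times is counted only once.
pairs = df.drop_duplicates(subset=["UserId", "Application"])

# Distinct users per application, as a share of all distinct users.
n_users = df["UserId"].nunique()
percentage = pairs["Application"].value_counts() / n_users * 100

print(percentage.round(2))
```

The order of the tied applications (orange, pear, grapefruit) in the printed Series is not significant.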

Then convert it to a DataFrame:

x.astype(int).to_frame('Percentage').rename_axis('Application').reset_index()

Output:

  Application  Percentage
0       apple         100
1        pear          66
2  grapefruit          66
3      orange          66
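
Note that astype(int) truncates rather than rounds, which is why 66.666667 becomes 66. If rounded values are preferred, one possible variant (a sketch, not part of the original answer) counts distinct users per application with groupby and nunique, then rounds to one decimal place. The variable name result and the choice of one decimal are assumptions made for illustration.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "UserId": [1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 1],
    "Application": ["apple", "orange", "apple", "pear",
                    "apple", "orange", "pear", "grapefruit",
                    "apple", "grapefruit", "apple", "apple"],
})

result = (
    df.groupby("Application")["UserId"]
      .nunique()                        # distinct users per application
      .div(df["UserId"].nunique())      # share of all distinct users
      .mul(100)                         # express as a percentage
      .round(1)                         # round instead of truncating
      .sort_values(ascending=False)     # most widely used first
      .rename("Percentage")
      .reset_index()
)

print(result)
```

This version needs no explicit drop_duplicates step, because nunique already ignores repeated (UserId, Application) pairs.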
