r - ntile function does not work in the latest version of R
Problem description
My data is
my_basket <- data.frame(ITEM_GROUP = c("Fruit","Fruit","Fruit","Fruit","Fruit","Vegetable","Vegetable","Vegetable","Vegetable","Dairy","Dairy","Dairy","Dairy","Dairy"),
ITEM_NAME = c("Apple","Banana","Orange","Mango","Papaya","Carrot","Potato","Brinjal","Raddish","Milk","Curd","Cheese","Milk","Paneer"),
Price = c(100,80,80,90,65,70,60,70,25,60,40,35,50,120))
I want to compute a percentile column using the ntile function
library(dplyr)
df1 <- mutate(my_basket, percentile_rank = ntile(Price, 100))
It should give me a data frame that looks like correct_df
correct_df<- data.frame(ITEM_GROUP = c("Fruit","Fruit","Fruit","Fruit","Fruit","Vegetable","Vegetable","Vegetable","Vegetable","Dairy","Dairy","Dairy","Dairy","Dairy"),
ITEM_NAME = c("Apple","Banana","Orange","Mango","Papaya","Carrot","Potato","Brinjal","Raddish","Milk","Curd","Cheese","Milk","Paneer"),
Price = c(100,80,80,90,65,70,60,70,25,60,40,35,50,120),
percentile_rank=c(86,65,72,79,43,51,29,58,1,36,15,8,22,93))
But instead, I get a data frame that looks like wrong_df
wrong_df<- data.frame(ITEM_GROUP = c("Fruit","Fruit","Fruit","Fruit","Fruit","Vegetable","Vegetable","Vegetable","Vegetable","Dairy","Dairy","Dairy","Dairy","Dairy"),
ITEM_NAME = c("Apple","Banana","Orange","Mango","Papaya","Carrot","Potato","Brinjal","Raddish","Milk","Curd","Cheese","Milk","Paneer"),
Price = c(100,80,80,90,65,70,60,70,25,60,40,35,50,120),
percentile_rank=c(13,10,11,12,7,8,5,9,1,6,3,2,4,14))
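This problem only appeared after I updated R to version 4.0.2
Solution
I don't think this is an R problem; it appears to be a dplyr 1.0.0 issue, as mentioned in this open GitHub issue. Compare the output of the two functions taken from there: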
ntile_083(my_basket$Price,100)
#[1] 86 65 72 79 43 51 29 58 1 36 15 8 22 93
ntile_100(my_basket$Price,100)
#[1] 13 10 11 12 7 8 5 9 1 6 3 2 4 14
For now, you can use ntile_083 to get the previous behavior.
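Note that ntile_083 is not exported by dplyr; it is a copy of the older implementation posted in the GitHub issue. If you cannot pull it from there, the following is a minimal base-R sketch of the same idea, assuming the older rule was floor(n * (rank - 1) / len + 1) with ties broken by order of appearance. The name ntile_legacy is made up for this example.

```r
# Hypothetical helper (not part of dplyr): a sketch of the pre-1.0.0
# bucketing rule, assumed to be floor(n * (rank - 1) / len + 1).
ntile_legacy <- function(x, n) {
  # Rank the values, breaking ties by order of appearance; NAs stay NA.
  r <- rank(x, ties.method = "first", na.last = "keep")
  # Number of non-missing values.
  len <- sum(!is.na(x))
  if (len == 0L) {
    return(rep(NA_integer_, length(x)))
  }
  # Spread ranks 1..len over bucket numbers 1..n.
  as.integer(floor(n * (r - 1) / len + 1))
}

# Price column from the question.
price <- c(100, 80, 80, 90, 65, 70, 60, 70, 25, 60, 40, 35, 50, 120)
ntile_legacy(price, 100)
```

On the Price vector from the question, this returns the same values as the ntile_083 output shown above. Inside mutate it can be used like ntile, for example mutate(my_basket, percentile_rank = ntile_legacy(Price, 100)).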