
Problem description

Return 1 for the top 20% highest values in each column and set the remaining values to 0

DF

dat1 = data.frame(a = c(0.1,0.2,0.3,0.4,0.5), b = c(0.6,0.7,0.8,0.9,0.10), c = c(0.12,0.13,0.14,0.15,0.16), d = c(0.6,0.7,0.8,0.5,0.9), ID=c("Albert", "Bia", "Carla", "Duda", "Elisa"))

Desired DF

dat1 = data.frame(a = c(0,0,0,0,1), b = c(0,0,0,1,0), c = c(0,0,0,0,1), d = c(0,0,0,0,1), ID=c("Albert", "Bia", "Carla", "Duda", "Elisa"))

Tags: r, dataframe, subset, data-cleaning

Solution


Using apply with quantile

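# Flag values at or above each column's 80th percentile with 1, everything else with 0.
# (Indexing probs = c(0.8, 1) with [2] would select the 100% quantile, i.e. only the column maximum.)
dat1[,1:4] <- apply(dat1[,1:4], 2, function(x) ifelse(x >= quantile(x, probs = 0.8), 1, 0))
Output: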
> dat1
  a b c d     ID
1 0 0 0 0 Albert
2 0 0 0 0    Bia
3 0 0 0 0  Carla
4 0 1 0 0   Duda
5 1 0 1 1  Elisa
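
The same idea can be sketched so that it does not depend on column positions. This is a minimal sketch, not the original answer's code: the helper name `top_share_flag` and its `frac` parameter are hypothetical, and it assumes that "top 20%" means "at or above the 80th percentile as computed by `quantile()` with its default settings". It picks numeric columns by type and uses `lapply`, so the `ID` column is left alone and the result stays a data frame.

```r
# Hypothetical helper (not from the original answer): returns 1L for values
# at or above the (1 - frac) quantile of x, and 0L otherwise.
top_share_flag <- function(x, frac = 0.2) {
  threshold <- quantile(x, probs = 1 - frac, na.rm = TRUE)
  as.integer(x >= threshold)
}

dat1 <- data.frame(
  a  = c(0.1, 0.2, 0.3, 0.4, 0.5),
  b  = c(0.6, 0.7, 0.8, 0.9, 0.10),
  c  = c(0.12, 0.13, 0.14, 0.15, 0.16),
  d  = c(0.6, 0.7, 0.8, 0.5, 0.9),
  ID = c("Albert", "Bia", "Carla", "Duda", "Elisa")
)

# Select numeric columns by type instead of hard-coding 1:4.
num_cols <- names(dat1)[sapply(dat1, is.numeric)]

# lapply works column by column and keeps dat1 a data frame.
dat1[num_cols] <- lapply(dat1[num_cols], top_share_flag)

print(dat1)
```

With tied values at the threshold, more than 20% of the rows can be flagged, because every value equal to the threshold passes the `>=` comparison.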
