Count the frequency of values across a row (number of first, second, and third places in a ranking table)

Problem description

I have a data frame for a sports league (the real data has about 70 teams and about 10 games):

data <- data.frame(team = c("a","b","c","d"),
                   g1_placement = c(1,2,3,4), 
                   g2_placement = c(1,3,4,2),
                   g3_placement = c(2,1,4,3))

where team is the team name and g*_placement is the team's placement in games 1-3:

  team g1_placement g2_placement g3_placement
1    a            1            1            2
2    b            2            3            1
3    c            3            4            4
4    d            4            2            3

I want to count how many times each team finished first, second, and third. The final result should be:

  team g1_placement g2_placement g3_placement first second third
1    a            1            1            2     2      1     0
2    b            2            3            1     1      1     1
3    c            3            4            4     0      0     1
4    d            4            2            3     0      1     1

Tags: r, frequency

Solution


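You can reshape the data to long format, count how often each placement occurs for each team, keep only placements 1 to 3, reshape the counts back to wide format, and join them onto the original data.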

library(dplyr)
library(tidyr)

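# Count top-three finishes per team in long format, then widen again
top3 <- data %>%
  pivot_longer(cols = ends_with('_placement')) %>%
  count(team, value) %>%
  filter(value %in% 1:3) %>%
  mutate(value = c('first', 'second', 'third')[value]) %>%
  pivot_wider(names_from = value, values_from = n, values_fill = 0)

# left_join keeps teams with no top-three finish; their NA counts become 0
data %>%
  left_join(top3, by = 'team') %>%
  mutate(across(any_of(c('first', 'second', 'third')), ~ replace_na(.x, 0L)))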

#  team g1_placement g2_placement g3_placement first second third
#1    a            1            1            2     2      1     0
#2    b            2            3            1     1      1     1
#3    c            3            4            4     0      0     1
#4    d            4            2            3     0      1     1
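
As a possible alternative, here is a minimal base R sketch of the same counts, using the example data from the question. It needs no packages and keeps every team, including ones that never finish in the top three. The names placement_cols, placements, and result are illustrative, and na.rm = TRUE assumes a missing placement should simply not be counted.

```r
# Same example data as in the question
data <- data.frame(team = c("a", "b", "c", "d"),
                   g1_placement = c(1, 2, 3, 4),
                   g2_placement = c(1, 3, 4, 2),
                   g3_placement = c(2, 1, 4, 3))

# Select the placement columns by their name suffix
placement_cols <- grep("_placement$", names(data), value = TRUE)
placements <- data[placement_cols]

# placements == k is a logical matrix; rowSums counts the TRUEs per team.
# Assumption: an NA placement (missed game) is ignored rather than counted.
result <- data
result$first  <- rowSums(placements == 1, na.rm = TRUE)
result$second <- rowSums(placements == 2, na.rm = TRUE)
result$third  <- rowSums(placements == 3, na.rm = TRUE)

result
```

Because each count is a single vectorized comparison over all placement columns, this should scale to roughly 70 teams and 10 games without any reshaping.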
