Select top value within group return all

Problem description

I have the following data frame:

df <- data.frame(id = c('A', 'A', 'A', 'B', 'B', 'B', 'B'),
                 gender = c('M', 'M', 'M', 'F', 'F', 'F', 'F'),
                 index = c(1, 2, 3, 1, 2, 3, 4))

I need to take the max value of index within each id. I was thinking the top_n function would work, but I actually need the max value to be repeated on every row of each id. This is the result I need:

df_result <- data.frame(id = c('A', 'A', 'A', 'B', 'B', 'B', 'B'),
                        gender = c('M', 'M', 'M', 'F', 'F', 'F', 'F'),
                        index = c(1, 2, 3, 1, 2, 3, 4),
                        max_index = c(3, 3, 3, 4, 4, 4, 4))

Is there something other than top_n that I can use, or a way to use it so that the max value repeats?

Tags: r

Solution


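I hope this is what you are looking for. Instead of filtering rows with top_n, group by id and use mutate, which keeps every row and repeats the group maximum across each group: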

library(dplyr)

df %>%
  group_by(id) %>%
  mutate(max_index = max(index))

# A tibble: 7 x 4
# Groups:   id [2]
  id    gender index max_index
  <chr> <chr>  <dbl>     <dbl>
1 A     M          1         3
2 A     M          2         3
3 A     M          3         3
4 B     F          1         4
5 B     F          2         4
6 B     F          3         4
7 B     F          4         4
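
If you would rather not load dplyr, the same column can probably be built in base R as well. This is only a sketch of an alternative, not part of the answer above; it assumes the df from the question and uses ave(), which applies a function within each level of a grouping vector and returns a result of the same length as its input:

```r
# Same data frame as in the question
df <- data.frame(id = c('A', 'A', 'A', 'B', 'B', 'B', 'B'),
                 gender = c('M', 'M', 'M', 'F', 'F', 'F', 'F'),
                 index = c(1, 2, 3, 1, 2, 3, 4))

# ave() computes max(index) within each id and repeats it
# on every row belonging to that id
df$max_index <- ave(df$index, df$id, FUN = max)

print(df)
```

Unlike the dplyr version, this returns a plain data.frame with no grouping attached, so there is nothing to ungroup afterwards.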
