首页 > 解决方案 > 我是否以正确的方式解决这个问题,如果在 group_by 之后

问题描述

我的df是这样的

Month<-c("May","May","May","May","May","May","May","May","June","June","June")
Manager<-c("Eric","Eric","Eric","Eric","Eric","Eric","Charlie","Charlie","Adam","Charlie","Adam")
`Order Type`<-c("Direct","Direct","Direct","Team Personal Use","Charity","Charity","Direct","Fatima","Fatima","Fatima","Direct")
`SVS ORD #`<-as.numeric(c("112231","112232","112233","112234","112235","112235","112236","112237","112238","112239","112240"))
Qty<-as.numeric(c("1","2","3","3","3","1","4","2","3","2","3"))

df <- data.frame(Month, Manager, `Order Type`, `SVS ORD #`, Qty)

我想在每个月为每个经理做些什么来检查他们订购了多少。但是,如果订单类型是DirectTeam Personal Use,请删除重复的 SVS ORD # 并计算唯一 SVS ORD # 的数量。其余的,它应该只取 SVS Ord # 的第一个 QTY 值。

我的代码是这样的。

df %>%
  group_by(Month,Manager) %>%
  mutate(`QTYMonthManager` = ifelse(`Order Type` == 'Direct' | `Order Type` == 'Team Personal Use',
                                         n_distinct(`SVS ORD #`),
                                         sum(Qty[!duplicated(`SVS ORD #`)])))

但是,这根本没有给我正确的答案。任何帮助表示赞赏!

期望的输出

Month<-c("May","May","May","May","May","May","May","May","June","June","June")
Manager<-c("Eric","Eric","Eric","Eric","Eric","Eric","Charlie","Charlie","Adam","Charlie","Adam")
`Order Type`<-c("Direct","Direct","Direct","Team Personal Use","Charity","Charity","Direct","Fatima","Fatima","Fatima","Direct")
`SVS ORD #`<-as.numeric(c("112231","112232","112233","112234","112235","112235","112236","112237","112238","112239","112240"))
Qty<-as.numeric(c("1","2","3","3","3","3","4","2","3","2","3"))
**Answer** <-as.numeric(c("7","7","7","7","7","7","3","3","4","2","4"))

标签: r

解决方案


您可以使用以下功能:

library(dplyr)
calculate_answer <- function(Order.Type, SVS.ORD.., Qty) {
   i <- Order.Type %in% c('Direct','Team Personal Use')
   n_distinct(SVS.ORD..[i]) + sum(Qty[!i][!duplicated(SVS.ORD..[!i])])
}

并将其应用于每个组。

df %>%
   group_by(Month, Manager) %>%
    mutate(Answer = calculate_answer(Order.Type, SVS.ORD.., Qty))

#  Month Manager Order.Type        SVS.ORD..   Qty Answer
#   <chr> <chr>   <chr>                 <dbl> <dbl>  <dbl>
# 1 May   Eric    Direct               112231     1      7
# 2 May   Eric    Direct               112232     2      7
# 3 May   Eric    Direct               112233     3      7
# 4 May   Eric    Team Personal Use    112234     3      7
# 5 May   Eric    Charity              112235     3      7
# 6 May   Eric    Charity              112235     1      7
# 7 May   Charlie Direct               112236     4      3
# 8 May   Charlie Fatima               112237     2      3
# 9 June  Adam    Fatima               112238     3      4
#10 June  Charlie Fatima               112239     2      2
#11 June  Adam    Direct               112240     3      4

推荐阅读