R: Analyzing whether receivables recur over time

Problem description

I am trying to find the accounts/customers that issue recurring invoices in a time-series dataset. Sample input:

yearmonth <- c("2019-11", "2019-11", "2019-12", "2020-01", "2020-02", "2020-02", "2020-03", "2020-03", "2020-04", "2020-05")
receivables <- c("Cust A", "Cust B", "Cust A", "Cust A", "Cust B", "Cust C", "Cust A", "Cust B", "Cust D", "Cust E")
category_group_name <- c("Expense", "Expense", "Expense", "Expense", "Expense", "Expense","Expense", "Expense","Expense","Expense")

What I want to create is a mutated category_group_name in which recurring invoices are classified as "Fixed Expense" and one-off invoices as "Variable Expense".

I'm a bit stuck here. Can anyone help?

Thanks a lot!

Tags: r

Solution


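Does this answer your question? It counts the rows per customer with dplyr and labels any customer that appears more than once as "Fixed Expense":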

> library(dplyr)
> dat %>%
+   group_by(receivables) %>%
+   mutate(recurring = n()) %>%  # number of invoices per customer
+   mutate(category_group_name = case_when(recurring > 1 ~ "Fixed Expense", TRUE ~ "Variable Expense")) %>%
+   select(-recurring)
# A tibble: 10 x 3
# Groups:   receivables [5]
   yearmonth receivables category_group_name
   <chr>     <chr>       <chr>              
 1 2019-11   Cust A      Fixed Expense      
 2 2019-11   Cust B      Fixed Expense      
 3 2019-12   Cust A      Fixed Expense      
 4 2020-01   Cust A      Fixed Expense      
 5 2020-02   Cust B      Fixed Expense      
 6 2020-02   Cust C      Variable Expense   
 7 2020-03   Cust A      Fixed Expense      
 8 2020-03   Cust B      Fixed Expense      
 9 2020-04   Cust D      Variable Expense   
10 2020-05   Cust E      Variable Expense   
> 
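
The answer above counts rows per customer, so two invoices from the same customer in a single month would also be labelled as recurring. If "recurring" should mean "invoiced in more than one distinct month", one possible variant is sketched below. This is only a sketch under that assumption: the `min_months` threshold and the `result` name are hypothetical and not part of the original answer. It also calls `ungroup()` so the result is no longer a grouped tibble.

```r
library(dplyr)

# Same sample data as in the question
dat <- data.frame(
  yearmonth = c("2019-11", "2019-11", "2019-12", "2020-01", "2020-02",
                "2020-02", "2020-03", "2020-03", "2020-04", "2020-05"),
  receivables = c("Cust A", "Cust B", "Cust A", "Cust A", "Cust B",
                  "Cust C", "Cust A", "Cust B", "Cust D", "Cust E"),
  category_group_name = "Expense",
  stringsAsFactors = FALSE
)

# Assumption: a customer is "recurring" when invoiced in at least
# min_months distinct months (hypothetical threshold, adjust as needed).
min_months <- 2

result <- dat %>%
  group_by(receivables) %>%
  mutate(category_group_name = if_else(
    n_distinct(yearmonth) >= min_months,  # distinct months, not row count
    "Fixed Expense",
    "Variable Expense"
  )) %>%
  ungroup()  # drop the grouping so later steps see a plain tibble

print(result)
```

On the sample data this gives the same labels as the row-count version, because no customer has more than one invoice in the same month. The two approaches only differ once such duplicates appear.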

Data used:

> dat
   yearmonth receivables category_group_name
1    2019-11      Cust A             Expense
2    2019-11      Cust B             Expense
3    2019-12      Cust A             Expense
4    2020-01      Cust A             Expense
5    2020-02      Cust B             Expense
6    2020-02      Cust C             Expense
7    2020-03      Cust A             Expense
8    2020-03      Cust B             Expense
9    2020-04      Cust D             Expense
10   2020-05      Cust E             Expense
> dput(dat)
structure(list(yearmonth = c("2019-11", "2019-11", "2019-12", 
"2020-01", "2020-02", "2020-02", "2020-03", "2020-03", "2020-04", 
"2020-05"), receivables = c("Cust A", "Cust B", "Cust A", "Cust A", 
"Cust B", "Cust C", "Cust A", "Cust B", "Cust D", "Cust E"), 
    category_group_name = c("Expense", "Expense", "Expense", 
    "Expense", "Expense", "Expense", "Expense", "Expense", "Expense", 
    "Expense")), class = "data.frame", row.names = c(NA, -10L
))
> 
