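r - Group by category, then find the differences between categories [r]
Problem description
I am calculating the mean employment rate of different groups from 1995 to 2015, and then the differences in mean employment rate between those groups.
The result should be ordered by year.
I have mostly tried to do this with the summarise function in dplyr, but without success.
Below is the code I have so far.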
library(dplyr)

# Restrict to ages 19-44 and label each person with a group
diff_in_diff <- Cps_total %>%
  filter(age >= 19 & age <= 44) %>%
  mutate(women_and_black_men =
    ifelse(female == 1 & marstat != 1 & nfchild == 0, "Single without children",
    ifelse(female == 1 & marstat != 1 & nfchild > 0, "Single with children",
    ifelse(female == 1 & marstat == 1 & nfchild == 0, "Married without children",
    ifelse(female == 1 & marstat == 1 & nfchild > 0, "Married with children",
    ifelse(female == 0 & wbhao == 2, "Black Men", "Otherwise Men"))))))

# Mean employment rate per year and group, skipping rows with missing empl
diff_in_diff_2 <- diff_in_diff %>%
  filter(!is.na(empl)) %>%
  group_by(year, women_and_black_men) %>%
  summarize(mean_empl = mean(empl))
year | women_and_black_men | mean_empl
1995 | Black Men | 0.8772406
1995 | Married with children | 0.6810999
1995 | Married without children | 0.8227718
1995 | Otherwise Men | 0.9048232
1995 | Single with children | 0.8330486
1995 | Single without children | 0.8927759
1996 | Black Men | 0.8415265
1996 | Married with children | 0.6800505
1996 | Married without children | 0.8188101
1996 | Otherwise Men | 0.9035344
This is what I get.
However, for each year I want the following differences:
Single with children minus Black Men,
Single with children minus Single without children,
Single with children minus Married with children,
Single with children minus Married without children,
and Single with children minus Otherwise Men.
So my expected output is:
year | Single_with_children_vs | diff_in_diff
1995 | vs_Married with children | 0.031230201
1995 | vs Married without children | -0.130002012
1995 | vs Single_without_children | -0.190230201
1995 | vs Black Men | 0.002030210
1996 |
.
.
.
Something like this.
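To make the target concrete, here is a minimal sketch that applies the "Single with children minus each other group" definition to the 1995 means printed in the summary table above. The names `empl_1995`, `reference`, `others` and `diff_1995` are introduced here only for illustration.

```r
# Mean employment rates for 1995, copied from the summary table above
empl_1995 <- c(
  "Black Men"                = 0.8772406,
  "Married with children"    = 0.6810999,
  "Married without children" = 0.8227718,
  "Otherwise Men"            = 0.9048232,
  "Single with children"     = 0.8330486,
  "Single without children"  = 0.8927759
)

# Group that every other group is compared against
reference <- "Single with children"

# Reference group's rate minus each other group's rate
others <- setdiff(names(empl_1995), reference)
diff_1995 <- empl_1995[[reference]] - empl_1995[others]

round(diff_1995, 4)
```

With these inputs the difference against Black Men comes out at about -0.0442 and against Married with children at about 0.1519. These do not match the figures in the expected-output sketch, which look like placeholders rather than values computed from the table.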
Solution
This may not be the most elegant solution, but here is a quick fix:
library(dplyr)
library(tidyr)  # for drop_na()

# I created a basic dataset similar to yours (one row per year and group)
diff_in_diff <- data.frame(
  year = rep(1995:1996, each = 6),
  women_and_black_men = rep(c("married with children", "married without children",
                              "otherwise men", "single with children",
                              "single without children", "black men"), 2),
  empl = abs(rnorm(12, 0, 0.5))
) %>% arrange(year)

# create a dataframe that is just single with children
diff_in_diff_single <- diff_in_diff %>%
  filter(women_and_black_men == "single with children") %>%
  dplyr::rename("single.emp" = empl)

# join with our original dataframe and take the difference
# (single with children minus each group, as the question asks)
diff_in_diff %>%
  full_join(diff_in_diff_single, by = c("year")) %>%
  drop_na() %>%
  group_by(year, women_and_black_men.x) %>%
  mutate(diff = single.emp - empl)
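As a possible alternative to the join, the same differences can be computed with a grouped mutate that looks up the reference group's value within each year. This is only a sketch: `toy_means`, `reference` and `result` are hypothetical names, and the means are random numbers shaped like `diff_in_diff_2`, not real CPS figures. It assumes each year contains exactly one "Single with children" row.

```r
library(dplyr)

# Made-up yearly means in the same shape as diff_in_diff_2 (not real data)
set.seed(1)
toy_means <- expand.grid(
  year = 1995:1996,
  women_and_black_men = c("Black Men", "Married with children",
                          "Married without children", "Otherwise Men",
                          "Single with children", "Single without children"),
  stringsAsFactors = FALSE
) %>%
  mutate(mean_empl = runif(n(), 0.6, 0.95))

# Group that every other group is compared against
reference <- "Single with children"

result <- toy_means %>%
  group_by(year) %>%
  # within each year: reference group's mean minus this row's mean
  mutate(diff_in_diff = mean_empl[women_and_black_men == reference] - mean_empl) %>%
  ungroup() %>%
  # drop the reference row itself, whose difference is always zero
  filter(women_and_black_men != reference) %>%
  mutate(Single_with_children_vs = paste("vs", women_and_black_men)) %>%
  select(year, Single_with_children_vs, diff_in_diff) %>%
  arrange(year, Single_with_children_vs)

result
```

The result has one row per year and comparison group, in the layout sketched in the question. If a year had no reference row, or more than one, the subtraction would fail or recycle, so it is worth checking that assumption on the real data first.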