Keeping only common rows in all groups

Problem description

I have a dataset that contains ten groups. Some observations (rows) are missing from some groups. I want to keep only those observations that are common to every group. Here is a minimal example with three groups. In the first group, one observation is missing (its value is NA). The output should therefore contain two observations in each group.

library(tidyverse)
## data_set
test_df <- data.frame(
  groups = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
  date = as.Date(c("2000-01-01", "2000-01-02", "2000-01-03",
                   "2000-01-01", "2000-01-02", "2000-01-03",
                   "2000-01-01", "2000-01-02", "2000-01-03")),
  data = c(1, 2, NA, 3, 4, 5, 6, 7, 8)
)

## required_output
## keeping data only with common dates
test_df_new<-test_df[c(1,2,4,5,7,8),]   

## groups 
test_df_new<-test_df%>%
        group_by()%>%

Tags: r, dplyr

Solution


First, I find the dates that have an NA in the data column:

test_df$date[is.na(test_df$data)]

Then I filter those dates out with dplyr:

test_df %>% filter(!date %in% test_df$date[is.na(test_df$data)])
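
The filter above only catches observations that are present but NA. The question also mentions rows that are absent from some groups altogether, so a more general approach may be useful. The following is a minimal sketch, not the original answerer's method: it assumes an observation counts as "common" when its date has a non-NA value in every group. The names total_groups and common_df are introduced here for illustration only.

```r
library(dplyr)

# Example data from the question
test_df <- data.frame(
  groups = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
  date = as.Date(rep(c("2000-01-01", "2000-01-02", "2000-01-03"), times = 3)),
  data = c(1, 2, NA, 3, 4, 5, 6, 7, 8)
)

# Number of distinct groups in the whole data set (hypothetical helper name)
total_groups <- n_distinct(test_df$groups)

common_df <- test_df %>%
  filter(!is.na(data)) %>%                         # treat NA as a missing observation
  group_by(date) %>%                               # look at each date across groups
  filter(n_distinct(groups) == total_groups) %>%   # keep dates seen in every group
  ungroup() %>%
  arrange(groups, date)

print(common_df)
```

Because the count is taken per date, this should handle several incomplete dates at once, and it does not matter whether a row is NA or simply not there.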
