首页 > 解决方案 > 通过R中的多个条件将一个变量转换为两个变量

问题描述

我的数据框如下所示:

df:
  patID   D1   D2   D3  Code Day Month Year
1    11 V581   G8 V581  5468   9     7 2017
2    12 V581 V588 V588 0905F   8     8 2017
3    13 V582 B678 B678 J8907   9     5 2016
4    14  G67 A568 A568 J8907   7     9 2016
5    15 V583 V584 V584 J9056   8     5 2016
6    16 V584 V586 V586 G0890   5     4 2017
7    17  G69 G789 A567 A0891   4    10 2017

如果从 D1、D2、D3 组合匹配以“V58”开头,我必须处理这些情况。此外,我需要从代码中获取值,但根据某些条件将它们拆分为变量,即 CodeA if ("J8907","J9056") 和 CodeB if ("0905F", "G0890")。

输出应如下所示:

  patID Diags CodeA CodeB     Date
1    11  V581  <NA>  <NA> 7/9/2017
2    12  V581 0905F  <NA> 8/8/2017
3    13  V582  <NA> J8907 5/9/2016
4    14  <NA>  <NA> J8907 9/7/2016
5    15  V583  <NA> J9056 5/8/2016
6    16  V584 G0890  <NA> 4/5/2017
7    12  V588 0905F  <NA> 8/8/2017
8    13  <NA>  <NA> J8907 5/9/2016
9    14  <NA>  <NA> J8907 9/7/2016
10   15  V584  <NA> J9056 5/8/2016
11   16  V586 G0890  <NA> 4/5/2017
12   11  V581  <NA>  <NA> 7/9/2017
13   12  V588 0905F  <NA> 8/8/2017
14   13  <NA> J8907 J8907 5/9/2016
15   14  <NA>  <NA> J8907 9/7/2016
16   15  V584  <NA> J9056 5/8/2016
17   16  V586 G0890  <NA> 4/5/2017

我试过了,但这不起作用(我采用了这种长格式方法,因为在原始数据中我有类似的变量集,如 D1:D3 并且需要类似地处理)

datnew <- bind_rows(
  dat %>% select(patID,Code:Year,ends_with("1")) %>% rename_all(str_remove, "1"),
  dat %>% select(patID,Code:Year,ends_with("2")) %>% rename_all(str_remove, "2"),
  dat %>% select(patID,Code:Year,ends_with("3")) %>% rename_all(str_remove, "3")
) %>%
  transmute(
    patID,
    diags = D,
    CodeA =Code,
    CodeB = Code,
    Dates = as.Date(paste0(Day,"/",Month, "/", Year),format="%d/%m/%Y")
  )%>%
  filter(
     grepl("V58",diags)|
     CodeA  %in%    c("0905F","G0890")|
     CodeB %in% c("J8907","J9056")
  )

Thanks in Advance!!

标签: rfilterdplyrtidyverse

解决方案


你可以试试 :

library(dplyr)
library(tidyr)

df %>%
  unite(Date, Day, Month, Year, sep = '/') %>%
  pivot_longer(cols = D1:D3, values_to = 'Diags') %>%
  mutate(Diags = replace(Diags, -grep('V58', Diags), NA), 
         CodeA = ifelse(Code %in% c("J8907","J9056"), Code, NA),
         CodeB = ifelse(Code %in% c("0905F", "G0890"), Code, NA)) %>%
  filter(!(is.na(Diags) & is.na(CodeA) & is.na(CodeB))) %>%
  select(-Code, -name)

# A tibble: 17 x 5
#   patID Date     Diags CodeA CodeB
#   <int> <chr>    <chr> <chr> <chr>
# 1    11 9/7/2017 V581  NA    NA   
# 2    11 9/7/2017 V581  NA    NA   
# 3    12 8/8/2017 V581  NA    0905F
# 4    12 8/8/2017 V588  NA    0905F
# 5    12 8/8/2017 V588  NA    0905F
# 6    13 9/5/2016 V582  J8907 NA   
# 7    13 9/5/2016 NA    J8907 NA   
# 8    13 9/5/2016 NA    J8907 NA   
# 9    14 7/9/2016 NA    J8907 NA   
#10    14 7/9/2016 NA    J8907 NA   
#11    14 7/9/2016 NA    J8907 NA   
#12    15 8/5/2016 V583  J9056 NA   
#13    15 8/5/2016 V584  J9056 NA   
#14    15 8/5/2016 V584  J9056 NA   
#15    16 5/4/2017 V584  NA    G0890
#16    16 5/4/2017 V586  NA    G0890
#17    16 5/4/2017 V586  NA    G0890

推荐阅读