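r - Merging tables in R using multiple fields from the left table and one field from the right table
Problem description
I have two data frames that I want to merge. The first data frame (hospital_df) contains a case ID followed by the diagnosis codes for diagnosis 1, 2, 3, 4, and so on. The second data frame holds a description for each unique diagnosis code, mapping codes to descriptions: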
id <- c(1, 1, 5, 5, 5, 40, 40, 99, 99, 99, 99)
principal_diagnosis <- c('AA', 'CS', 'AA', 'EG', 'SD', 'DF', 'DD', 'AA', 'DF', 'HG', 'SD')
other_diagnosis <- c('AB', 'CF', 'AD', 'AA', 'SS', 'DS', 'DF', 'AG', 'AB', 'FG', 'VV')
other_diagnosis2 <- c('XX', 'DD', 'SQ', 'SD', 'DF', 'CV', 'DA', 'FD', 'SS', 'WV', 'AA')
hospital_df <- data.frame(id,principal_diagnosis, other_diagnosis, other_diagnosis2)
principal_diagnosis_codes <- c('AA', 'CS', 'AA', 'EG', 'SD', 'DF', 'DD', 'AA', 'DF', 'HG', 'SD', 'AB', 'CF', 'AD', 'SS', 'DS', 'AG', 'FG', 'VV', 'XX', 'SQ', 'CV', 'DA', 'FD', 'WV')
length(principal_diagnosis_codes)
description <- c('disease1',
'disease2',
'disease3',
'disease4',
'disease5',
'disease6',
'disease7',
'disease8',
'disease9',
'disease10',
'disease11',
'disease12',
'disease13',
'disease14',
'disease15',
'disease16',
'disease17',
'disease18',
'disease19',
'disease20',
'disease21',
'disease22',
'disease23',
'disease24',
'disease25')
diagnosis_codes_df <- data.frame(principal_diagnosis_codes, description)
I would like the output to look like:
ID principal_diagnosis other_diagnosis other_diagnosis2
1 disease1 disease12 disease20
1 disease2 disease13 disease7
5 disease1 disease14 disease21
I tried:
mergedData <- merge(hospital_df, diagnosis_codes_df, by.x=c("principal_diagnosis"),
by.y=c("principal_diagnosis_codes"))
mergedData <- merge(mergedData, diagnosis_codes_df, by.x=c("other_diagnosis"),
by.y=c("principal_diagnosis_codes"))
mergedData <- merge(mergedData, diagnosis_codes_df, by.x=c("other_diagnosis2"),
by.y=c("principal_diagnosis_codes"))
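But this seems inefficient, and it also leaves me with columns named description, description.y, description.x, and so on, which is a bit confusing. Can anyone suggest a better approach? I am new to R, so a detailed explanation would be very helpful. Thanks in advance!
Solution
This base R code looks up every diagnosis code in the lookup table and uses the first matching entry, which matters because some codes (for example 'AA') appear more than once in diagnosis_codes_df.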
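# Compare each code against the code column only, and keep the first matching description.
hospital_df[-1] <- sapply(unlist(hospital_df[-1]), function(x) diagnosis_codes_df$description[x == diagnosis_codes_df$principal_diagnosis_codes][1])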
id principal_diagnosis other_diagnosis other_diagnosis2
1 1 disease1 disease12 disease20
2 1 disease2 disease13 disease7
3 5 disease1 disease14 disease21
4 5 disease4 disease1 disease5
5 5 disease5 disease15 disease6
6 40 disease6 disease16 disease22
7 40 disease7 disease6 disease23
8 99 disease1 disease17 disease24
9 99 disease6 disease12 disease15
10 99 disease10 disease18 disease25
11 99 disease5 disease19 disease1
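The same "first match wins" lookup can also be sketched with match(), which returns the position of the first occurrence of each value in the lookup column. This is an alternative sketch rather than the answer's own method; the names code_cols and labelled_df are hypothetical, and the block rebuilds the question's data so that it runs on its own.

```r
# Same example data as in the question, rebuilt so this block is self-contained.
hospital_df <- data.frame(
  id = c(1, 1, 5, 5, 5, 40, 40, 99, 99, 99, 99),
  principal_diagnosis = c("AA", "CS", "AA", "EG", "SD", "DF", "DD", "AA", "DF", "HG", "SD"),
  other_diagnosis     = c("AB", "CF", "AD", "AA", "SS", "DS", "DF", "AG", "AB", "FG", "VV"),
  other_diagnosis2    = c("XX", "DD", "SQ", "SD", "DF", "CV", "DA", "FD", "SS", "WV", "AA"),
  stringsAsFactors = FALSE
)
diagnosis_codes_df <- data.frame(
  principal_diagnosis_codes = c("AA", "CS", "AA", "EG", "SD", "DF", "DD", "AA", "DF",
                                "HG", "SD", "AB", "CF", "AD", "SS", "DS", "AG", "FG",
                                "VV", "XX", "SQ", "CV", "DA", "FD", "WV"),
  description = paste0("disease", 1:25),
  stringsAsFactors = FALSE
)

# Hypothetical helper name: every column except the id holds diagnosis codes.
code_cols <- setdiff(names(hospital_df), "id")

# match() gives the index of the FIRST occurrence of each code in the lookup
# column, so a duplicated code such as "AA" resolves to its first description.
# Work on a copy (hypothetical name) so the original codes stay available.
labelled_df <- hospital_df
labelled_df[code_cols] <- lapply(hospital_df[code_cols], function(codes) {
  diagnosis_codes_df$description[match(codes, diagnosis_codes_df$principal_diagnosis_codes)]
})

head(labelled_df, 3)
```

Because the lookup is applied one column at a time, no extra description columns are created and the row order of hospital_df is preserved. A code that is missing from the lookup table would come back as NA, since match() returns NA when there is no match.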