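r - Merging on partial matches in R
Problem description
There are many answers on this topic, but I have not found one that covers the problem I am dealing with.
I have two data frames: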
df1:
df2:
setA <- read.table("df1.txt",sep="\t", header=TRUE)
setB <- read.table("df2.txt",sep="\t", header=TRUE)
So I want to match rows by column value:
library(data.table)
setC <- merge(setA, setB, by.x = "name", by.y = "name", all.x = FALSE)
I get this output:
df3:
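The problem is that df1 also contains the value 1, but in a cell where it is separated by ";". How can I get the desired output?
Thanks!!
Solution
In the future, please run dput(df1) and dput(df2) and paste the console output into your question.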
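As a brief illustration of that advice: dput() prints R code that rebuilds an object, so readers can paste it straight into their own session. The sketch below uses a small made-up data frame (example_df, dput_code and rebuilt are hypothetical names, not from the question) and checks that the printed code recreates the same object.

```r
# Hypothetical two-row data frame, used only to illustrate dput().
example_df <- data.frame(name = c("1;7", "3"),
                         last = c("p", "q"),
                         stringsAsFactors = FALSE)

# Print the code that recreates the object; this is the text to paste
# into a question.
dput(example_df)

# Round-trip check: capture the printed code and evaluate it again.
dput_code <- paste(capture.output(dput(example_df)), collapse = "\n")
rebuilt <- eval(parse(text = dput_code))
stopifnot(isTRUE(all.equal(rebuilt, example_df)))
```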
A base R solution to the two-part problem:
# First unstack the "1;7" row into two separate rows
# (as.character() guards against name having been read in as a factor):
name_split <- strsplit(as.character(df1$name), ";", fixed = TRUE)
# If the values of the last vector uniquely identify each row in the data frame:
df_ro <- data.frame(name = unlist(name_split),
                    last = rep(df1$last, lengths(name_split)),
                    stringsAsFactors = FALSE)
# Left join back onto df1 to bring in its remaining columns
# without specifically naming each vector:
df1_ro <- merge(df1[, names(df1) != "name"], df_ro, by = "last", all.x = TRUE)
# Then perform an inner join, renaming the df2 columns to prevent a
# name collision:
df3 <- merge(df1_ro, setNames(df2, paste0(names(df2), ".x")),
             by.x = "name", by.y = "name.x")
# If you instead wanted an inner join on all intersecting columns (which returns
# no rows here, because the values in last and colour differ between the two
# data frames):
df3_all <- merge(df1_ro, df2, by = intersect(names(df1_ro), names(df2)))
Data:
df1 <- data.frame(name = c("1;7", "3", "4", "5"),
                  last = c("p", "q", "r", "s"),
                  colour = c("a", "s", "d", "f"), stringsAsFactors = FALSE)
df2 <- data.frame(name = c("1", "2", "3", "4"),
                  last = c("a", "b", "c", "d"),
                  colour = c("p", "q", "r", "s"), stringsAsFactors = FALSE)
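The answer's code assumes that last uniquely identifies each row of df1. As a possible alternative that does not need that assumption, the rows of df1 can be repeated by index, once per ";"-separated piece, before merging. This is only a sketch on the example data; name_parts, row_idx, df1_long, matched and the ".df1"/".df2" suffixes are illustrative names, not part of the original answer.

```r
# Same example data as above, repeated so the block runs on its own.
df1 <- data.frame(name = c("1;7", "3", "4", "5"),
                  last = c("p", "q", "r", "s"),
                  colour = c("a", "s", "d", "f"), stringsAsFactors = FALSE)
df2 <- data.frame(name = c("1", "2", "3", "4"),
                  last = c("a", "b", "c", "d"),
                  colour = c("p", "q", "r", "s"), stringsAsFactors = FALSE)

# Split each name on ";" and repeat every df1 row once per piece.
name_parts <- strsplit(df1$name, ";", fixed = TRUE)
row_idx <- rep(seq_len(nrow(df1)), lengths(name_parts))
df1_long <- df1[row_idx, , drop = FALSE]
df1_long$name <- unlist(name_parts)
rownames(df1_long) <- NULL

# Inner join on name; the suffixes keep the clashing last/colour
# columns from the two data frames apart.
matched <- merge(df1_long, df2, by = "name", suffixes = c(".df1", ".df2"))
print(matched)
```

With this data the "1;7" row contributes a match for name 1, while 7 and 5 have no partner in df2 and are dropped by the inner join.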