r - Automatic matching of right column
问题描述
In R, having the following two dataframes:
df1 <- structure(list(id = 1:8, value = c(100L, 100L, 100L, 100L, 100L,
100L, 100L, 100L)), class = "data.frame", row.names = c(NA, -8L))
df2 <- structure(list(randomColumnName = c(100L, 2L, 3L, 40L, 5L, 6L, 73L, 8L), OtherRandomColumn = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = "'test'", class = "factor"), value = c(100L, 100L, 100L, 100L, 100L, 100L, 100L, 100L)), class = "data.frame", row.names = c(NA, -8L))
print(df1)
id value
1 1 100
2 2 100
3 3 100
4 4 100
5 5 100
6 6 100
7 7 100
8 8 100
print(df2)
randomColumnName OtherRandomColumn value
1 100 'test' 100
2 2 'test' 100
3 3 'test' 100
4 40 'test' 100
5 5 'test' 100
6 6 'test' 100
7 73 'test' 100
8 8 'test' 100
Dataframe 1 has a unique id column called 'id'. I know that df2 also has a column which contains some but not all id's. Upfront, I do not know the column name of the corresponding column in dataframe 2.
Question:
Is there any handy function or mechanism to check the first few entries (e.g. 50) of each column in df2, compare them to the id column of df, and return the column name of df2 that is most likely to match the ID column?
In this case, it should return that column 'randomColumnName' is the most likely column to match df1's ID column.
解决方案
You can try to loop over df2 and check each column against df1$id
. The column with the biggest sum is your column to merge on. Note that the output of that is a named vector so you can use either the name or its index.
i1 <- which.max(sapply(df2, function(i) sum(df1$id == i)))
names(i1)
#[1] "randomColumnName"
i1
#randomColumnName
# 1
推荐阅读
- laravel - 测试 Nuxt.js 前端和 Laravel 后端的方法
- mysql - 如何在一个语句中显示所有内容
- node.js - 为什么我在 Speech to text 代码中收到此错误?
- jenkins - 詹金斯:构建后删除回购,只保留日志
- c++ - 在 Visual Studio 2017 中使用 high_resolution_clock 获取错误的日期和时间/从纪元时间获取毫秒数(1970 年 1 月 1 日)
- python - 将未来的日期时间添加到 pandas df
- c++ - 如何让.exe程序通过c ++启动
- bash - 连接 $ 和 var --> $var 时不显示 var 的值
- react-redux - ag-grid 修改 Row Group 的 Header 名称
- javascript - 一页中有多个 XMLHttpRequest