r - Add a column by matching an id from another data frame
Problem description
I would like to ask for help!
I am trying to add a new column, "TotalSasaran", to my first data frame, "dataset", which has 5000 rows.
Tanggal | Divaksin1 | Divaksin2 | TotalDivaksin | Provinsi |
---|---|---|---|---|
2-1-2021 | 1 | 2 | 3 | Aceh |
5-1-2021 | 1 | 2 | 3 | Aceh |
2-1-2021 | 2 | 2 | 3 | Bali |
4-1-2021 | 2 | 2 | 3 | Bali |
3-1-201 | 3 | 1 | 4 | Jakarta |
6-1-201 | 3 | 1 | 4 | Jakarta |
structure(list(tanggal = structure(1:6, .Label = c("2021-01-15",
"2021-01-17", "2021-01-18", "2021-01-19", "2021-01-20", "2021-01-21",
"2021-01-22", "2021-01-23", "2021-01-24", "2021-01-25", "2021-01-26",
"2021-01-27", "2021-01-28", "2021-01-29", "2021-01-30", "2021-01-31",
"2021-02-01", "2021-02-02", "2021-02-03", "2021-02-04", "2021-02-05",
"2021-02-06", "2021-02-07", "2021-02-08", "2021-02-09", "2021-02-10",
"2021-02-11", "2021-02-12", "2021-02-13", "2021-02-14", "2021-02-15",
"2021-02-16", "2021-02-17", "2021-02-18", "2021-02-19", "2021-02-20",
"2021-02-21", "2021-02-22", "2021-02-23", "2021-02-24", "2021-02-25",
"2021-02-26", "2021-02-27", "2021-02-28", "2021-03-01", "2021-03-02",
"2021-03-03", "2021-03-04", "2021-03-05", "2021-03-06", "2021-03-07",
"2021-03-08", "2021-03-09", "2021-03-10", "2021-03-11", "2021-03-12",
"2021-03-13", "2021-03-14", "2021-03-15", "2021-03-16", "2021-03-17",
"2021-03-18", "2021-03-19", "2021-03-20", "2021-03-21", "2021-03-22",
"2021-03-23", "2021-03-24", "2021-03-25", "2021-03-26", "2021-03-27",
"2021-03-28", "2021-03-29", "2021-03-30", "2021-03-31", "2021-04-01",
"2021-04-02", "2021-04-03", "2021-04-04", "2021-04-05", "2021-04-06",
"2021-04-07", "2021-04-08", "2021-04-09", "2021-04-10", "2021-04-11",
"2021-04-12", "2021-04-13", "2021-04-14", "2021-04-15", "2021-04-16",
"2021-04-17", "2021-04-18", "2021-04-19", "2021-04-20", "2021-04-21",
"2021-04-22", "2021-04-23", "2021-04-24", "2021-04-25", "2021-04-26",
"2021-04-27", "2021-04-28", "2021-04-29", "2021-04-30", "2021-05-01",
"2021-05-02", "2021-05-03", "2021-05-04", "2021-05-05", "2021-05-06",
"2021-05-07", "2021-05-08", "2021-05-09", "2021-05-10", "2021-05-11",
"2021-05-12", "2021-05-13", "2021-05-14", "2021-05-17", "2021-05-18",
"2021-05-19", "2021-05-20", "2021-05-21", "2021-05-22", "2021-05-23",
"2021-05-24", "2021-05-25", "2021-05-26", "2021-05-27", "2021-05-28",
"2021-05-29", "2021-05-30", "2021-05-31", "2021-06-01", "2021-06-02",
"2021-06-03", "2021-06-04", "2021-06-05", "2021-06-06", "2021-06-07",
"2021-06-08", "2021-06-09", "2021-06-10", "2021-01-13", "2021-01-14",
"2021-01-16", "2021-05-15", "2021-05-16"), class = "factor"),
divaksin_1 = c(35, 1, 13, 16, 36, 42), divaksin_2 = c(0,
0, 0, 0, 0, 0), total_divaksin = c(35, 1, 13, 16, 36, 42),
Provinsi = c("Aceh", "Aceh", "Aceh", "Aceh", "Aceh", "Aceh"
)), row.names = c(NA, 6L), class = "data.frame")
The values for the new column should come from my second data frame, "Total_Sasaran", which has only 34 rows.
Provinsi | TotalSasaran |
---|---|
Aceh | 1000 |
Bali | 1500 |
Jakarta | 2000 |
structure(list(Provinsi = c("ACEH", "BALI", "BANTEN", "BENGKULU",
"DKI JAKARTA", "GORONTALO"), TotalSasaran = c(3898726, 2860037,
8838393, 1327824, 8815157, 784727)), row.names = c(NA, -6L), class = c("tbl_df",
"tbl", "data.frame"))
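Both data frames have an id column, "Provinsi". I want R to take dataset$Provinsi, match it against Total_Sasaran$Provinsi, and fetch the TotalSasaran value for every row in dataset.
I tried several approaches, but none returned the expected output. First attempt: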
dataset$TotalSasaran<-Total_Sasaran$TotalSasaran[match(dataset$Provinsi,Total_Sasaran$Provinsi)]
Second attempt:
dataset$TotalSasaran <- Total_Sasaran$TotalSasaran[Total_Sasaran$Provinsi %in% dataset$Provinsi]
Third attempt:
dataset2<-inner_join(dataset,Total_Sasaran, by="Provinsi")
This returns 0 observations, a data frame with no data. I made sure the classes match: "Provinsi" as factor and "TotalSasaran" as numeric.
Fourth attempt:
dataset2<-merge(dataset, Total_Sasaran, by="Provinsi", all.x=TRUE)
This returns the entire "TotalSasaran" column as NA.
Solution
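The case differs between the two data frames: one stores Provinsi in upper case ("ACEH"), the other in mixed case ("Aceh"). R is case-sensitive, so convert one of the data frames to match the other, then perform the merge.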
# df1 is the first data frame (dataset), df2 the second (Total_Sasaran)
df1$Provinsi <- toupper(df1$Provinsi)
result <- merge(df1, df2)
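If the original row order of dataset should be kept (merge sorts the result by the key column by default), the same idea can be expressed with match(), as in the first attempt from the question. The sketch below is only an illustration under stated assumptions: the two small data frames are stand-ins built from a few of the values shown in the question (the "Bali" row is added for illustration), and key_left, key_right, idx and unmatched are hypothetical helper names. It also lists any province names that still fail to match after normalization, which helps spot differences other than case.

```r
# Small stand-ins for the two data frames in the question (illustrative values).
dataset <- data.frame(
  tanggal = c("2021-01-15", "2021-01-17", "2021-01-18"),
  total_divaksin = c(35, 1, 13),
  Provinsi = c("Aceh", "Aceh", "Bali"),
  stringsAsFactors = FALSE
)
Total_Sasaran <- data.frame(
  Provinsi = c("ACEH", "BALI", "BANTEN"),
  TotalSasaran = c(3898726, 2860037, 8838393),
  stringsAsFactors = FALSE
)

# Build a normalized key on both sides: drop surrounding whitespace and
# convert to upper case. as.character() guards against factor columns.
key_left  <- toupper(trimws(as.character(dataset$Provinsi)))
key_right <- toupper(trimws(as.character(Total_Sasaran$Provinsi)))

# For each row of dataset, match() gives the row index of the same
# province in Total_Sasaran, or NA when there is no partner.
idx <- match(key_left, key_right)
dataset$TotalSasaran <- Total_Sasaran$TotalSasaran[idx]

# Province names that still have no partner after normalization.
unmatched <- unique(dataset$Provinsi[is.na(idx)])

print(dataset)
print(unmatched)
```

An empty unmatched vector means every row found its TotalSasaran. Any names listed there differ by more than case or surrounding whitespace and need to be reconciled by hand before the lookup.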