Create a new data frame by matching a list of values against a data frame column

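Problem description

I am trying to create a new data frame from a list of values that match the values in a column of an old data frame. The new data frame should also keep the order of the list of values used for matching. Here is an example of what I want to achieve: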

# A list of values used for matching
time.new <- c(2, 3, 4, 3, 4, 5, 4, 5, 6)
# The old data frame, to be matched on the time.old column
old <- data.frame(time.old=1:10, y=rnorm(10))
   time.old        y
          1  0.20320
          2 -0.74696
          3 -0.73716
          4 -0.61959
          5  1.12733
          6  2.58322
          7 -0.08138
          8 -0.10436
          9 -0.13081
         10 -1.20050
#Here is the expected new data frame
       time        y
          2 -0.74696
          3 -0.73716
          4 -0.61959
          3 -0.73716
          4 -0.61959
          5  1.12733
          4 -0.61959
          5  1.12733
          6  2.58322

Tags: r, extract, data-manipulation

Solution


Try left_join from dplyr, which keeps the rows in the order of its first argument. First, convert time.new into a column of a data frame:

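library(tidyverse)  # loads dplyr, which provides left_join()
time.new <- c(2, 3, 4, 3, 4, 5, 4, 5, 6)
# The old data frame, to be matched on the time.old column
old <- data.frame(time.old=1:10, y=rnorm(10))

# Wrap the vector in a one-column data frame so it can be joined
time.new <- data.frame(time=time.new)
# Keep every row of time.new and pull in the matching y from old
new_dataframe <- left_join(time.new, old, by=c("time"="time.old"))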

In base R, use merge:

merge(x = time.new, y = old, by.x = "time", by.y="time.old", all.x = TRUE)

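By default, merge sorts the result by the key column, so the order of time.new is lost. To keep that order, add a helper row-number column to the data, merge, sort by the row number, and then drop the id column: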

time.new <- c(2, 3, 4, 3, 4, 5, 4, 5, 6)
old <- data.frame(time.old=1:10, y=rnorm(10))

# Add a helper id column that records the original row order
time.new <- data.frame(id = seq_along(time.new), time=time.new)

new_dataframe <- merge(x = time.new, y = old, by.x = "time", by.y="time.old", all.x = TRUE)
# Restore the original order, then drop the helper column
new_dataframe <- new_dataframe[order(new_dataframe$id), ]
new_dataframe$id <- NULL
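
As an alternative that is not part of the answer above, the same lookup can be sketched in base R with match(), which avoids the helper id column. This is a minimal sketch under two assumptions: the values in time.old are unique, and a fixed seed (set.seed(1), added here only for reproducibility) stands in for the random data.

```r
# Sketch: look up rows by position with match(), base R only.
set.seed(1)  # assumption: fixed seed so the example is reproducible
time.new <- c(2, 3, 4, 3, 4, 5, 4, 5, 6)
old <- data.frame(time.old = 1:10, y = rnorm(10))

# For each element of time.new, match() returns the position of its
# first occurrence in old$time.old (NA when there is no match).
idx <- match(time.new, old$time.old)

# Pick y by those positions: the order of time.new is kept, and
# repeated values in time.new produce repeated rows.
new_dataframe <- data.frame(time = time.new, y = old$y[idx])
print(new_dataframe)
```

Because match() only reports the first matching position, this sketch is not equivalent to merge or left_join when time.old contains duplicates; in that case the join-based approaches return every matching row.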
