How to join/merge multiple data frames by date without duplicates

Problem description

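When I try to merge 3 datasets using time as the ID, I get duplicated values because the same id (timestamp) occurs more than once. I want to get the data without duplicates.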

My data: https://pastebin.com/5HAhQQG5

I have tried merging by id and aggregating, but nothing seems to work and I keep getting duplicates.

# Dati is the data frame that holds all the data
Dati[, "...8"]
head(Dati,3)
bi       <- Dati[,1:3] 
bi_date  <- Dati[,1]
as       <- Dati[,5:7] 
as_date  <- Dati[,5]
tr       <- Dati[,9:11] 
tr_date  <- Dati[,9]
# I split the data frame into 3 different ones (above) and label each one
bi$class <- "bid" 
as$class <- "ask" 
tr$class <- "trade" 
data.frame(bi)
data.frame(as)
data.frame(tr)
#rename the columns
colnames(bi)      <-  c("time", "price", "volume", "class") #Bid
colnames(as)      <-  c("time", "price", "volume", "class") #Ask
colnames(tr)      <-  c("time", "price", "volume", "class") #Trade


# Currently I am trying this command, but it does not work
mymergedata1 <- merge(x = bi, y = as, by = "time", all = TRUE)
mymergedata1 <- merge(x = mymergedata1, y = tr, by = "time", all = TRUE)

I would like it to look like this: https://pastebin.com/pMt49yq4

I always get something like this: Does anyone know how to do this, and if so, could you help me?

Tags: r, database, dataframe

Solution


Replacing my answer, because I mistakenly assumed the time was the same across the three columns...

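library(plyr); library(dplyr)   # load plyr before dplyr
# Split into the bid, ask, and trade blocks, name each time column "time",
# then stack them; columns missing from a block are filled with NA
Dati <- list(Dati[,1:3], Dati[,5:7], Dati[,9:11])
Dati <- ldply(Dati, function(x){
  names(x)[1] <- "time" 
  return(x)})

library(reshape2)
# Melt to long format with time as the id, drop the NA rows, then cast back
# to wide, averaging any values that share a timestamp
dm <- melt(Dati, id.vars = "time")
dm <- dm %>% na.exclude %>% dcast(time ~ variable, fun.aggregate = mean)
head(dm, 3)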

#                  time Price_bid Volume_bid Price_ask Volume_ask Price_trade Volume_trade
# 1 05.07.2019 18:58:46     26.41         15     26.42          2         NaN          NaN
# 2 05.07.2019 18:58:50     26.41         15     26.43         14       26.42            2
# 3 05.07.2019 18:58:54     26.40          2     26.42          2         NaN          NaN
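
For context on why the original merge() calls produce duplicates, here is a minimal base R sketch on toy data. The timestamps and prices are made up for illustration and are not taken from the asker's file, and aggregating with mean is only one possible choice, used here because it mirrors the dcast call above. When a timestamp occurs more than once on both sides, merge() returns every pairing of the matching rows; reducing each frame to one row per timestamp first keeps the join one-to-one.

```r
# Hypothetical toy data: timestamp "t1" is repeated in both frames
bi <- data.frame(time = c("t1", "t1", "t2"), price = c(26.41, 26.40, 26.39))
as <- data.frame(time = c("t1", "t1", "t3"), price = c(26.42, 26.43, 26.44))

# merge() pairs every matching row of x with every matching row of y,
# so the two "t1" rows on each side turn into 2 * 2 = 4 rows
dup <- merge(bi, as, by = "time", all = TRUE, suffixes = c("_bid", "_ask"))
nrow(dup)

# Collapse each frame to one row per timestamp before joining
bi1 <- aggregate(price ~ time, data = bi, FUN = mean)
as1 <- aggregate(price ~ time, data = as, FUN = mean)
wide <- merge(bi1, as1, by = "time", all = TRUE, suffixes = c("_bid", "_ask"))
nrow(wide)
```

In this sketch dup has six rows (four of them for "t1"), while wide has one row per distinct timestamp. If averaging is not appropriate for the data, another summary such as the last value per timestamp could be passed as FUN instead.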
