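r - Comparing CSVs line by line with if statements in R
Question
I'm using R/RStudio to compare two csv files. I want to compare them line by line, but checking their columns in a specific order. Suppose my data looks like this: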
first <-read.csv(text="
name, number, description, version, manufacturer
A123, 12345, first piece, 1.0, fakemanufacturer
B107, 00001, second, 1.0, abcde parts
C203, 20000, third, NA, efgh parts
D123, 12000, another, 2.0, NA")
The second csv:
second <- read.csv(text="
name, number, description, version, manufacturer
A123, 12345, first piece, 1.0, fakemanufacturer
B107, 00001, second, 1.0, abcde parts
C203, 20000, third, NA, efgh parts
E456, 45678, third, 2.0, ")
I'd like a for loop that looks something like this:
for line in csv1:
    if number exists in csv2:
        if csv1$name == csv2$name:
            if csv1$description == csv2$description:
                if csv1$manufacturer == csv2$manufacturer:
                    break
                else:
                    add line to csv called changed, append a value for "changed" column to manufacturer
            else:
                add line to csv called changed, append a value for "changed" column to description
And so on, with output that looks like:
name  number  description  version  manufacturer      changed
A123  12345   first piece  1.0      fakemanufacturer  number
B107  00001   second       1.0      abcde parts       no change
C204  20000   third                 newmanufacturer   number, manufacturer
D123  12000   another      2.0                        removed
E456  45678   third        2.0                        added
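If there is a mismatch at any point in this loop, I'd like to know where the mismatch is. Rows can be matched by either number or description. For example, given the 2 rows above, I would be able to tell that the number changed between the two csv files. Thanks in advance for your help!!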
Answer
It should look something like this, but since you didn't provide any data to test it with, I can't guarantee that my code works:
cmpDF <- function(DF1, DF2){
  # one result slot per row of DF1, named by number so it can be updated later
  retChar <- character(nrow(DF1))
  names(retChar) <- DF1$number
  # keep only the rows whose number occurs in both data frames
  # (this assumes number is unique within each data frame)
  DF2 <- DF2[DF2$number %in% DF1$number, ]
  DF1 <- DF1[DF1$number %in% DF2$number, ]
  # sort rows to make sure that equal rows have the same row position
  DF1 <- DF1[order(DF1$number), ]
  DF2 <- DF2[order(DF2$number), ]
  # cell-wise comparison; two NAs count as equal, a single NA as different
  equals <- DF1 == DF2
  equals[is.na(equals)] <- FALSE
  equals <- equals | (is.na(DF1) & is.na(DF2))
  same <- rowSums(equals) == ncol(DF1)  # rows where all elements are the same
  retChar[as.character(DF1$number[same])] <- "no change"
  for(i in seq_len(ncol(DF1))){
    if(colnames(DF1)[i] == "number") next
    key <- as.character(DF1$number[!equals[, i]])
    # append the column name, comma-separated if something is already recorded
    retChar[key] <- ifelse(nchar(retChar[key]) > 0,
                           paste(retChar[key], colnames(DF1)[i], sep = ", "),
                           colnames(DF1)[i])
  }
  retChar[nchar(retChar) == 0] <- "number not in DF2"
  return(retChar)
}
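The function above only reports on rows of DF1, so rows that exist only in the second file (the "added" case in the question) are not flagged. Here is a minimal sketch of a different approach that covers that case too, using a full outer join with base R's merge(). It is not the method above. The name compare_csv and the helper columns .in_old and .in_new are made up for illustration, and it assumes that number is unique within each file and that both files have the same columns. Reading with strip.white = TRUE and colClasses = "character" is also my assumption, chosen so that the spaces after the commas are dropped and leading zeros such as 00001 survive.

```r
# Sample data from the question, read as character so 00001 keeps its zeros.
first <- read.csv(text = "name, number, description, version, manufacturer
A123, 12345, first piece, 1.0, fakemanufacturer
B107, 00001, second, 1.0, abcde parts
C203, 20000, third, NA, efgh parts
D123, 12000, another, 2.0, NA",
  strip.white = TRUE, colClasses = "character")

second <- read.csv(text = "name, number, description, version, manufacturer
A123, 12345, first piece, 1.0, fakemanufacturer
B107, 00001, second, 1.0, abcde parts
C203, 20000, third, NA, efgh parts
E456, 45678, third, 2.0, ",
  strip.white = TRUE, colClasses = "character")

# Hypothetical helper: match rows on `key` and report which columns differ.
compare_csv <- function(old, new, key = "number") {
  cols <- setdiff(names(old), key)
  # Marker columns (illustrative names) record which file a row came from.
  old[[".in_old"]] <- rep(TRUE, nrow(old))
  new[[".in_new"]] <- rep(TRUE, nrow(new))
  # all = TRUE makes this a full outer join, so unmatched rows are kept.
  both <- merge(old, new, by = key, all = TRUE, suffixes = c(".old", ".new"))
  in_old <- !is.na(both[[".in_old"]])
  in_new <- !is.na(both[[".in_new"]])
  changed <- character(nrow(both))
  for (col in cols) {
    a <- both[[paste0(col, ".old")]]
    b <- both[[paste0(col, ".new")]]
    # Two NAs count as equal; an NA against a value counts as a change.
    differs <- !((is.na(a) & is.na(b)) | (!is.na(a) & !is.na(b) & a == b))
    hit <- in_old & in_new & differs
    changed[hit] <- ifelse(changed[hit] == "", col,
                           paste(changed[hit], col, sep = ", "))
  }
  changed[in_old & in_new & changed == ""] <- "no change"
  changed[in_old & !in_new] <- "removed"
  changed[!in_old & in_new] <- "added"
  out <- both[key]          # one-column data frame holding the key
  out$changed <- changed
  out
}

print(compare_csv(first, second))
```

On the sample data this should label 12000 as removed, 45678 as added, and the other three numbers as no change. Matching on description instead would mean passing key = "description", which only makes sense if descriptions are unique, and in the sample data "third" appears twice in the second file.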