Compare multiple vectors of different lengths

Problem description

I would like to extend the working solution from "compare multiple vectors of different lengths, count the elements that are the same, and print out those that are the same and different". I'd like to write a loop that does pairwise comparisons of ten different vectors and reports what each pair has in common, for all possible pairs. The main comparison part of the pseudo-code below works and comes from the previous post, but it only compares A and B. I would also like to compare A to C, A to D, B to C, etc.

vectors to be compared: A, B, C, D, E, F, G, H, I, J
set global variable for first vector to be compared
set global variable for second vector to be compared

# vectors -- these are subsets of my real vectors, which are more like 50 - 200 elements long
A <- c("866_78", "1137_78", "1721_78", "1745_79", "1910_79", "1972_76", 
        "2776_77",  "3049_79", "3084_15",  "3995_78", "4066_33", "4431_15")
B <- c("866_78", "1137_78", "1721_78", "1745_79", "1910_79", "1972_76", 
       "2776_77",  
       "3049_79", "3084_15",  "3995_78")
C <- c("866_78", "1137_78", "1910_79", "1972_76", "2776_77",  
       "3049_79", "3084_14",  "3995_78", "4066_36", "4431_19", "4885_78")
D <- c("866_78", "1137_78", "1721_78", "1745_79", "1910_79", "1972_76", 
      "2773_77",  
       "3049_79", "3084_12",  "3995_78", "4066_36", "4431_19", "4885_78")
E <- c("866_78", "1137_78", "1721_78", "1745_79", "1910_79", "1972_76", 
      "2776_77",  
       "3049_79", "3084_17", "4431_19", "4885_78")
F <- c("868_78", "1137_78", "1721_78", "1745_79", "1910_79", "1972_76", 
       "2776_77",  
       "3049_79", "3084_15",  "3995_78", "4066_36", "4431_19", "4885_78")
G <- c("866_78", "1837_78", "1721_78", "1972_76", "2776_77",  
       "3049_79", "3084_15",  "3995_78", "4066_36", "4431_19", "4885_78")
H <- c("866_78", "1137_78", "1721_78", "1745_79", "1910_79", "1972_76", 
        "2776_77",  
       "3049_79", "3084_15",  "3995_78", "4066_36", "4431_19", "4885_78")
I <- c("866_78", "1137_28", "1721_78", "1745_79", "1910_79", "1972_76", 
       "2776_77",  
       "3995_78", "4066_36", "4431_19", "4885_78")
J <- c("866_78", "1137_78", "1721_78", "1745_79", "1910_79", "1972_76", 
       "2776_77",  
       "3049_79", "3084_18",  "3995_78", "4066_36", "4431_19", "4885_78")

for(i ???)
{
compare.SNPs <- function(A, B) {
  # consider only unique names
  A.u <- unique(A)
  B.u <- unique(B)
  common.A.B <- intersect(A.u, B.u)
  diff.A.B <- setdiff(A.u, B.u)
  diff.B.A <- setdiff(B.u, A.u)
  uncommon.A.B <- union(diff.A.B, diff.B.A)
  cat(paste0("The sets have ", length(common.A.B), " SNPs in common:"))
  print(common.A.B)
  print(paste0("The sets have ", length(uncommon.A.B), " SNPs not in common:"))
  print(paste0("In the first set, but not in the second set:"))
  print(diff.A.B)
  print(paste0("Not in the first set, but in the second set:"))
  print(diff.B.A)

}
compare.SNPs(A,B)
}

Any guidance for example code to look at would be much appreciated.

Sincerely, Ella

Tags: r, bioinformatics

Solution


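# All 45 unordered pairs of the ten vector names, one pair per column
xx <- combn(LETTERS[1:10], 2)
for (i in seq_len(ncol(xx))) {
  cat(paste0("Comparing ", xx[1, i], " and ", xx[2, i], ": "))
  # get() looks up each vector by its name
  compare.SNPs(get(xx[1, i]), get(xx[2, i]))
}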

(Also, there is no reason to put the function() call inside the for loop.)
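If you would rather keep the results than just print them, one possible variation is to store the vectors in a named list and have the comparison return its pieces. This is only a sketch: `compare_sets`, `snps`, and the `_vs_` naming are made-up names for illustration, and the three short vectors stand in for the real data.

```r
# Hypothetical helper (not from the original post): returns the pieces
# of the comparison instead of printing them.
compare_sets <- function(x, y) {
  x.u <- unique(x)
  y.u <- unique(y)
  list(common = intersect(x.u, y.u),  # in both sets
       only.x = setdiff(x.u, y.u),    # in the first set only
       only.y = setdiff(y.u, x.u))    # in the second set only
}

# Small illustrative vectors kept in a named list, so get() is not needed
snps <- list(
  A = c("866_78", "1137_78", "1721_78"),
  B = c("866_78", "1137_78"),
  C = c("866_78", "4885_78")
)

# Every unordered pair of names, as a list of length-2 character vectors
pairs <- combn(names(snps), 2, simplify = FALSE)

# One comparison result per pair, named after the pair
results <- lapply(pairs, function(p) compare_sets(snps[[p[1]]], snps[[p[2]]]))
names(results) <- sapply(pairs, paste, collapse = "_vs_")

# Number of shared elements for each pair
n.common <- vapply(results, function(r) length(r$common), integer(1))
print(n.common)
```

Each result can then be looked up by name, e.g. `results$A_vs_B$only.x`, and the same code should work unchanged for ten vectors once they are placed in the list.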

