Divide the median of some rows by the median of other rows within each group

Question

group_ID <- c("a","a","a","a","a","b","b","b","b","b","b","b","b")
class <- c("p","q","q","q","q","p","p","p","q","q","q","q","q")
var1 <- c(3,1,1,1,1,3,2,1,1,2,2,4,1)
my_table <- data.frame(group_ID,class,var1)

I have the following table.

group_ID class var1
a     p    3
a     q    1
a     q    1
a     q    1
a     q    1
b     p    3
b     p    2
b     p    1
b     q    1
b     q    2
b     q    2
b     q    4
b     q    1

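I want to create a new column by dividing, within each group, the median of var1 for class p by the median of var1 for class q. The expected output is shown below.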

group_ID    class   var1    var1_ratio
a   p   3   3
a   q   1   3
a   q   1   3
a   q   1   3
a   q   1   3
b   p   3   1
b   p   2   1
b   p   1   1
b   q   1   1
b   q   2   1
b   q   2   1
b   q   4   1
b   q   1   1

Link: this question seems the most similar to mine. I tried using group_by() and mutate_each() as follows, but I could not get it to work.

my_table <- my_table %>%
  group_by(group_ID,class) %>%
  mutate_each(funs(./median(.[class == "p"])), var1)

I also tried: Link1 Link2 Link3

Thanks!

Tags: r, dplyr, bioinformatics

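Solution

We don't need mutate_each here. Because the attempt groups by both group_ID and class, each group contains only one class, so the p and q medians can never be compared inside it. Instead, compute the median per group_ID and class, then regroup by group_ID alone and take the ratio.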

library(dplyr)
my_table %>%
   # group by group_ID and class
   group_by(group_ID, class) %>%
   # add a temporary column holding the median of var1 for each group_ID/class pair
   mutate(Median = median(var1)) %>%
   # regroup by group_ID only, dropping class from the grouping
   group_by(group_ID) %>%
   # divide the p-class median by the q-class median; first() picks a single
   # value because Median is constant within a class. Then drop the helper column.
   mutate(var1_ratio = first(Median[class == 'p'])/first(Median[class == 'q']),
         Median = NULL)
# A tibble: 13 x 4
# Groups:   group_ID [2]
#   group_ID class  var1 var1_ratio
#   <chr>    <chr> <dbl>      <dbl>
# 1 a        p         3          3
# 2 a        q         1          3
# 3 a        q         1          3
# 4 a        q         1          3
# 5 a        q         1          3
# 6 b        p         3          1
# 7 b        p         2          1
# 8 b        p         1          1
# 9 b        q         1          1
#10 b        q         2          1
#11 b        q         2          1
#12 b        q         4          1
#13 b        q         1          1
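
As a possible shortcut, the intermediate Median column can be skipped by subsetting var1 by class directly inside a single mutate(). The following is a minimal sketch under the assumption that every group_ID contains at least one p row and one q row. The name `result` is introduced here for illustration only, and the data frame is rebuilt so the snippet runs on its own.

```r
library(dplyr)

# Same data as in the question, rebuilt so this snippet is self-contained
my_table <- data.frame(
  group_ID = c("a","a","a","a","a","b","b","b","b","b","b","b","b"),
  class    = c("p","q","q","q","q","p","p","p","q","q","q","q","q"),
  var1     = c(3,1,1,1,1,3,2,1,1,2,2,4,1)
)

# `result` is an illustrative name, not part of the original answer
result <- my_table %>%
  # group by group_ID only, so both classes are visible within each group
  group_by(group_ID) %>%
  # median of the p rows divided by the median of the q rows, recycled to every row
  mutate(var1_ratio = median(var1[class == "p"]) / median(var1[class == "q"])) %>%
  # drop the grouping so later operations are not silently grouped
  ungroup()

result
```

If a group has no p rows or no q rows, the corresponding median is taken over an empty vector and should come out as NA, so var1_ratio would be NA for that group; whether that is acceptable depends on the data.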
