ggraph edge connections wrong?

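Question

I am generating a hierarchical edge bundling plot in which the colour/alpha/width of each edge varies with a column (pvalue) in my data frame connect, but in the plot I get, the colour/alpha/width of the edges does not always map to the values in that column. For example, subgroup1 and subgroup4 should have the strongest, thickest connection (p-value of 9.32e-280), but they don't; instead, the connection between subgroup3 and subgroup4 looks strongest.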


This data gives a reproducible example:

> dput(vertices)
structure(list(name = structure(c(3L, 1L, 2L, 4L, 5L, 6L, 7L), .Label = c("gp1", 
"gp2", "origin", "subgroup1", "subgroup2", "subgroup3", "subgroup4"
), class = "factor"), id = c(NA, NA, NA, 1L, 2L, 3L, 4L), angle = c(NA, 
NA, NA, 0, -90, 0, -90), hjust = c(NA, NA, NA, 1, 1, 1, 1)), row.names = c(NA, 
-7L), class = "data.frame")
> dput(hierarchy)
structure(list(from = structure(c(3L, 3L, 1L, 1L, 2L, 2L), .Label = c("gp1", 
"gp2", "origin"), class = "factor"), to = structure(1:6, .Label = c("gp1", 
"gp2", "subgroup1", "subgroup2", "subgroup3", "subgroup4"), class = "factor")), class = "data.frame", row.names = c(NA, 
-6L))
> dput(connect)
structure(list(from = structure(c(1L, 1L, 2L, 3L, 1L, 2L, 3L, 
1L), .Label = c("subgroup1", "subgroup2", "subgroup3"), class = "factor"), 
    to = structure(c(1L, 2L, 2L, 1L, 3L, 3L, 3L, 3L), .Label = c("subgroup2", 
    "subgroup3", "subgroup4"), class = "factor"), pvalue = c(1.68e-204, 
    1.59e-121, 9.32e-73, 9.32e-73, 1.59e-21, 9.32e-50, 9.32e-40, 
    9.32e-280)), class = "data.frame", row.names = c(NA, -8L))

This is the code I used to make this example plot:

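library(igraph)   # graph_from_data_frame()
library(ggraph)   # ggraph(), geom_conn_bundle(), get_con()

# node indices of the connection endpoints, and the value mapped onto the edges
from <- match( connect$from, vertices$name)
to <- match( connect$to, vertices$name)
col <- connect$pvalue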


# Let's add information about the labels we are going to add: angle, horizontal adjustment and potential flip
# calculate the ANGLE of the labels
vertices$id <- NA
myleaves <- which(is.na( match(vertices$name, hierarchy$from) ))
nleaves <- length(myleaves)
vertices$id[ myleaves ] <- seq_len(nleaves)
vertices$angle <- 90 - 360 * vertices$id / nleaves
# calculate the alignment of labels: right or left
# If I am on the left part of the plot, my labels have currently an angle < -90
vertices$hjust <- ifelse( vertices$id < 41, 1, 0)
# flip the angle by 180 degrees where needed so the labels stay readable
vertices$angle <- ifelse(vertices$angle < -90, vertices$angle+180, vertices$angle)

mygraph <- graph_from_data_frame( hierarchy, vertices=vertices )

ggraph(mygraph, layout = 'dendrogram', circular = TRUE) +
  geom_node_point(aes(filter = leaf, x = x*1.05, y = y*1.05), size = 2, alpha = 0.8) +
  geom_conn_bundle(data = get_con(from = from, to = to, col = col),
                   aes(colour = col, alpha = col, width = col)) +
  geom_node_text(aes(x = x*1.1, y = y*1.1, filter = leaf, label = name, angle = angle, hjust = hjust),
                 size = 3.5, alpha = 0.6) +
  scale_edge_color_continuous(trans = "log", low = "red", high = "yellow") +
  scale_edge_alpha_continuous(trans = "log", range = c(1, 0.1)) +
  scale_edge_width_continuous(trans = "log", range = c(4, 1)) +
  theme_void()

I think there is a wrong mapping somewhere, but I can't figure out where. Any input is much appreciated!

Tags: r, ggplot2, ggraph

Answer


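I believe there is a bug in this library. Rearranging the input data in ascending order of the chosen column (pvalue in my case) helped but did not solve the problem.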

library(dplyr)  # for arrange()
connect_new <- arrange(connect, pvalue)

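I found the solution in a GitHub issue submitted by another user. The subgroups within each group need to be in alphabetical order in both the hierarchy and the vertices data. In addition, in the connect data frame the subgroups need to be sorted in the same order as in the hierarchy and vertices data. Thanks to zhuxr11.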
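The reordering can be sketched in base R roughly as follows. This is only a sketch under a few assumptions: the data frames are re-typed from the question's dput() output as plain character columns instead of factors, the *_sorted names and the helper variables (parent_rank, node_order, from_pos, to_pos) are made up for illustration, and sorting connect by the position of its endpoints in the vertex table is my reading of "the same order as in the hierarchy and vertices data".

```r
# Data re-typed from the dput() output in the question (character columns, not factors).
hierarchy <- data.frame(
  from = c("origin", "origin", "gp1", "gp1", "gp2", "gp2"),
  to   = c("gp1", "gp2", "subgroup1", "subgroup2", "subgroup3", "subgroup4"),
  stringsAsFactors = FALSE
)
vertices <- data.frame(
  name = c("origin", "gp1", "gp2", "subgroup1", "subgroup2", "subgroup3", "subgroup4"),
  stringsAsFactors = FALSE
)
connect <- data.frame(
  from   = c("subgroup1", "subgroup1", "subgroup2", "subgroup3",
             "subgroup1", "subgroup2", "subgroup3", "subgroup1"),
  to     = c("subgroup2", "subgroup3", "subgroup3", "subgroup2",
             "subgroup4", "subgroup4", "subgroup4", "subgroup4"),
  pvalue = c(1.68e-204, 1.59e-121, 9.32e-73, 9.32e-73,
             1.59e-21, 9.32e-50, 9.32e-40, 9.32e-280),
  stringsAsFactors = FALSE
)

# 1. Keep the parents in their original order and sort the children
#    alphabetically within each parent.
parent_rank <- match(hierarchy$from, unique(hierarchy$from))
hierarchy_sorted <- hierarchy[order(parent_rank, hierarchy$to), ]

# 2. Put the vertices in the order in which they appear in the sorted hierarchy.
node_order <- unique(c(hierarchy_sorted$from, hierarchy_sorted$to))
vertices_sorted <- vertices[match(node_order, vertices$name), , drop = FALSE]

# 3. Sort the connections by the position of their endpoints in the vertex table.
from_pos <- match(connect$from, vertices_sorted$name)
to_pos   <- match(connect$to, vertices_sorted$name)
connect_sorted <- connect[order(from_pos, to_pos), ]

# Recompute the inputs that the question's code passes to get_con().
from <- match(connect_sorted$from, vertices_sorted$name)
to   <- match(connect_sorted$to, vertices_sorted$name)
col  <- connect_sorted$pvalue
```

With hierarchy_sorted and vertices_sorted passed to graph_from_data_frame(), the plotting code from the question should run unchanged on the recomputed from, to and col.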

