r - How to sort bars according to 1 of 2 groups in a facet wrap?
问题描述
I hope someone can help me with the following problem: I would like to display the value (avg) of different laboratory parameters (parameter) of 2 different groups (gruppe). Additionally, I want to plot this information according to the change over time (performance) in 3 different facets. Here a tibble of the dataset:
# A tibble: 402 x 4
# Groups: gruppe, parameter [134]
gruppe parameter performance avg
<chr> <chr> <chr> <dbl>
1 DGE ACPA(citrull. Prot.-Ak) EIA/Se change_t1t0 NaN
2 DGE ACPA(citrull. Prot.-Ak) EIA/Se change_t2t0 37.6
3 DGE ACPA(citrull. Prot.-Ak) EIA/Se change_t3t0 NaN
4 Fasten Apolipoprot. A1 HP change_t1t0 41.2
5 DGE Apolipoprot. A1 HP change_t2t0 NaN
6 DGE Apolipoprot. A1 HP change_t3t0 NaN
7 DGE Apolipoprotein B change_t1t0 NaN
8 DGE Apolipoprotein B change_t2t0 NaN
9 Fasten Apolipoprotein B change_t3t0 NaN
10 DGE aPTT Pathromtin SL change_t1t0 0.571
# … with 392 more rows
This worked totally fine using this code:
#Create labels for 3 facets
lab_labels <- c("Change from Baseline to Day 7 [%]",
"Change from Baseline to Week 6 [%]",
"Change from Baseline to Week 12 [%]")
names(lab_labels) <- c("change_t1t0",
"change_t2t0",
"change_t3t0")
labor_summ_long %>%
filter(parameter %in% c("Hämatokrit (l/l)","Hämoglobin", "Leukozyten","MCV", "MCH", "MCHC", "RDW-CV", "Thromobzyten","MPV")) %>%
arrange(desc(avg))%>%
group_by(gruppe, performance)%>%
ggplot(aes(x=reorder(parameter,avg), y=avg, group=gruppe, fill = gruppe))+
geom_col(position = position_dodge())+
facet_wrap(~performance,
scales ="free_y",
dir="v",
labeller = labeller(performance = lab_labels))+
ylab("") +
xlab("") +
labs(color="", linetype="")+
theme_pubclean()+
theme(strip.background=element_rect(fill="lightgrey"),
strip.text = element_text(face="bold"),
legend.position = "bottom",
legend.title=element_blank())+
theme(axis.text.x = element_text(angle=45, hjust=1, vjust = 1))+
scale_x_discrete(labels = c("Hämoglobin"="Hemoglobin", "Leukozyten" = "Leucocytes",
"MCV", "MCH", "MCHC", "RDW-CV", "Thromobzyten"="Thrombocytes",
"MPV", "Hämatokrit (l/l)"="Hematocrite"))+
scale_fill_discrete(labels=c('DGE', "Fasten"='Fasting'))
This is how the plot looks like
What I am missing and am failing to find the solution to: I would like to order the bars...
- According to the avg-value from high to low
- of the Fasting-Group (blue bars)
- in the performance from baseline to day 7 (change_t1t0), aka the first facet.
I tricked around with arrange, sort, etc. but couldn't get all the conditions above together.
Do you have any ideas? Thanks a lot in advance!
解决方案
The issue is that reorder
reorders by taking the mean of all values for each parameter
without taking account of any grouping.
Adapting this answer to your case and making use of some random example data to mimic your real data this could be achieved like so:
The helper function reorder_where
allows to order the categories by an additional condition, e.g. in your case where gruppe == "Fasten" & performance == "change_t1t0"
is TRUE
library(dplyr)
library(ggplot2)
reorder_where <- function (x, by, where, fun = mean, ...) {
xx <- x[where]
byby <- by[where]
byby <- tapply(byby, xx, FUN = fun, ...)[x]
reorder(x, byby)
}
labor_summ_long %>%
filter(parameter %in% c("Hämatokrit (l/l)","Hämoglobin", "Leukozyten","MCV", "MCH", "MCHC", "RDW-CV", "Thromobzyten","MPV")) %>%
ggplot(aes(x=reorder_where(parameter, -avg, gruppe == "Fasten" & performance == "change_t1t0"), y=avg, group=gruppe, fill = gruppe))+
geom_col(position = position_dodge())+
facet_wrap(~performance,
scales ="free_y",
dir="v",
labeller = labeller(performance = lab_labels))+
ylab("") +
xlab("") +
labs(color="", linetype="")+
#theme_pubclean()+
theme(strip.background=element_rect(fill="lightgrey"),
strip.text = element_text(face="bold"),
legend.position = "bottom",
legend.title=element_blank())+
theme(axis.text.x = element_text(angle=45, hjust=1, vjust = 1))+
scale_x_discrete(labels = c("Hämoglobin"="Hemoglobin", "Leukozyten" = "Leucocytes",
"MCV", "MCH", "MCHC", "RDW-CV", "Thromobzyten"="Thrombocytes",
"MPV", "Hämatokrit (l/l)"="Hematocrite"))+
scale_fill_discrete(labels=c('DGE', "Fasten"='Fasting'))
DATA
set.seed(42)
labor_summ_long <- data.frame(
parameter = sample(c("Hämatokrit (l/l)","Hämoglobin", "Leukozyten","MCV", "MCH", "MCHC", "RDW-CV", "Thromobzyten","MPV"), 100, replace = TRUE),
gruppe = sample(c("DGE", "Fasten"), 100, replace = TRUE),
performance = sample(c("change_t1t0",
"change_t2t0",
"change_t3t0"), 100, replace = TRUE),
avg = runif(100, 0, 50)
)
labor_summ_long <- dplyr::distinct(labor_summ_long, parameter, gruppe, performance, .keep_all = TRUE)
推荐阅读
- apache-spark - Spark SQL:MovingAverage 不包括前 10% 和后 10%
- mysql - Symfony3-Doctrine : 按情况排序
- batch-file - 使用批处理文件创建目录时出现问题
- google-cloud-platform - 如何将文件从谷歌云存储桶复制到谷歌云虚拟机实例
- java - 在 cassandra 中为访问器准备查询时出错
- php - 如果键在文本中,则替换
- azure-devops - 无法从 Azure Devops 管道版本创建可执行文件 (.exe)
- symfony - 学说覆盖嵌入的列名
- python - 在 M 次实验中,我最后一次被选中的频率是多少
- xamarin.forms - Visual State Manager 和 Entry && 触发键盘 xamarin 表单的问题