r - R geom_histogram position="identity" inconsistent
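Problem description
I am currently working in R and trying to create a panel of plots, each containing two overlapping histograms: a red histogram beneath a blue histogram. The red histogram contains the same dataset in every plot, so it should be displayed consistently across the panel. I am finding that this is not the case. Although the data are exactly the same in each plot, the red histograms differ. Is there a way to fix this? Am I missing something in my code that causes this inconsistency?
Here is the code I am using to create the plots: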
library(data.table)
library(ggplot2)
library(cowplot)  # provides plot_grid()
test<-rnorm(1000)
test<-as.data.table(test)
test[, type:="Sample"]
setnames(test, old="test", new="value")
test_2<-rnorm(750)
test_2<-as.data.table(test_2)
test_2[, type:="Sub Sample"]
setnames(test_2, old="test_2", new="value")
test_2_final<-rbind(test, test_2, fill=TRUE)
test_3<-rnorm(500)
test_3<-as.data.table(test_3)
test_3[, type:="Sub Sample"]
setnames(test_3, old="test_3", new="value")
test_3_final<-rbind(test, test_3, fill=TRUE)
test_4<-rnorm(250)
test_4<-as.data.table(test_4)
test_4[, type:="Sub Sample"]
setnames(test_4, old="test_4", new="value")
test_4_final<-rbind(test, test_4, fill=TRUE)
test_5<-rnorm(100)
test_5<-as.data.table(test_5)
test_5[, type:="Sub Sample"]
setnames(test_5, old="test_5", new="value")
test_5_final<-rbind(test, test_5, fill=TRUE)
test_6<-rnorm(50)
test_6<-as.data.table(test_6)
test_6[, type:="Sub Sample"]
setnames(test_6, old="test_6", new="value")
test_6_final<-rbind(test, test_6, fill=TRUE)
draws_750_p<-ggplot(data = test_2_final, aes(x=value, fill=type, color=type)) + geom_histogram(position="identity", alpha = 0.2, bins=30) + theme(plot.title = element_text(hjust = 0.5, size=10, face="plain"))
draws_500_p<-ggplot(data = test_3_final, aes(x=value, fill=type, color=type)) + geom_histogram(position="identity", alpha = 0.2, bins=30) + theme(plot.title = element_text(hjust = 0.5, size=10, face="plain"))
draws_250_p<-ggplot(data = test_4_final, aes(x=value, fill=type, color=type)) + geom_histogram(position="identity", alpha = 0.2, bins=30) + theme(plot.title = element_text(hjust = 0.5, size=10, face="plain"))
draws_100_p<-ggplot(data = test_5_final, aes(x=value, fill=type, color=type)) + geom_histogram(position="identity", alpha = 0.2, bins=30) + theme(plot.title = element_text(hjust = 0.5, size=10, face="plain"))
draws_50_p<-ggplot(data = test_6_final, aes(x=value, fill=type, color=type)) + geom_histogram(position="identity", alpha = 0.2, bins=30) + theme(plot.title = element_text(hjust = 0.5, size=10, face="plain"))
full_plot<-plot_grid(draws_750_p, draws_500_p, draws_250_p, draws_100_p, draws_50_p, ncol = 3, nrow = 2)
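Here is a picture of the strange result I am describing. Notice how the distribution of the red histograms differs even though the dataset is exactly the same in each set (in this example, it is most obvious in the draws_250_p plot on the right).
Solution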
As I mentioned in a comment, the issue is that the bins being used are different for each plot. This means the same value can end up in a different bin. The default is to guess at reasonable bin boundaries based on the number of bins specified and the range of the data, but since the sub samples are different in each plot (and may start earlier or later than the main sample), the resulting boundaries will be different.
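To make the effect concrete, here is a minimal sketch that asks ggplot2 which bin edges it picked for two plots sharing the same main sample. The helper name `bin_edges` and the two artificial sub samples are assumptions made for illustration; the sub samples are deliberately given very different ranges so the shift in bin edges is easy to see.

```r
library(ggplot2)

set.seed(1)
sample_values <- rnorm(1000)

# Hypothetical helper: combine the main sample with a sub sample, build the
# histogram, and return the left bin edges that ggplot2 actually used.
bin_edges <- function(sub_sample, ...) {
  df <- rbind(
    data.frame(type = "Sample", value = sample_values),
    data.frame(type = "Sub Sample", value = sub_sample)
  )
  p <- ggplot(df, aes(x = value, fill = type)) +
    geom_histogram(position = "identity", ...)
  sort(unique(layer_data(p)$xmin))
}

# Same main sample, same number of bins, but sub samples with different ranges
# (artificial values, chosen only to exaggerate the difference).
edges_narrow <- bin_edges(c(-1, 0, 1), bins = 30)
edges_wide <- bin_edges(c(-10, 0, 10), bins = 30)

range(edges_narrow)
range(edges_wide)
```

Because the edges are derived from the full range of the data in each plot, the wider sub sample stretches the bins, and the main sample's values are counted into different bins even though the values themselves have not changed.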
The solution is to specify the bin boundaries directly so they are the same in every plot. Here is an example of specifying the bin boundaries implicitly using a combination of binwidth and boundary. I have also taken the liberty of combining all of the values into a single data frame so that they can be plotted at once using facet_wrap, which has the advantage of aligning the axes of the individual facets and labelling them with the size of the subsample. The crucial point is in the call to geom_histogram, though. You can hopefully see that the red distributions are the same in each facet now.
library(tidyverse)
test <- tibble(type = "Sample", value = rnorm(1000))
add_sub_sample <- function(n, df) {
  sub_sample <- tibble(type = "Sub Sample", value = rnorm(n))
  df %>%
    rbind(sub_sample) %>%
    mutate(sub_sample_n = n)
}
test_final <- c(750, 500, 250, 100, 50) %>%
  map(add_sub_sample, test) %>%
  bind_rows()
ggplot(test_final, aes(x = value, fill = type, colour = type)) +
  geom_histogram(position = "identity", alpha = 0.2, binwidth = 0.2, boundary = 0) +
  facet_wrap(~sub_sample_n) +
  theme(plot.title = element_text(hjust = 0.5, size=10, face="plain"))
Created on 2021-07-14 by the reprex package (v1.0.0)
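If you would rather keep separate plots (for example to arrange them with plot_grid as in the question), a possible variation is to pass one explicit vector of bin edges through the `breaks` argument of geom_histogram. The sketch below is not part of the original answer: the names `shared_breaks`, `make_plot`, and `sample_counts` are hypothetical, and the range of -6 to 6 is an assumption that is comfortably wide for standard normal draws. It also checks that the main sample's counts come out identical in two different plots.

```r
library(ggplot2)

set.seed(1)
sample_values <- rnorm(1000)

# One fixed set of bin edges reused by every plot (assumed range: -6 to 6).
shared_breaks <- seq(-6, 6, by = 0.2)

# Hypothetical helper: one plot per sub sample size, all sharing the same edges.
make_plot <- function(n) {
  df <- rbind(
    data.frame(type = "Sample", value = sample_values),
    data.frame(type = "Sub Sample", value = rnorm(n))
  )
  ggplot(df, aes(x = value, fill = type, colour = type)) +
    geom_histogram(position = "identity", alpha = 0.2, breaks = shared_breaks)
}

# Hypothetical helper: per-bin counts of the main sample. Groups are numbered
# in sorted order of `type`, so "Sample" is group 1.
sample_counts <- function(p) {
  bins <- layer_data(p)
  bins$count[bins$group == 1]
}

counts_750 <- sample_counts(make_plot(750))
counts_50 <- sample_counts(make_plot(50))

# TRUE when the main sample is binned identically in both plots
isTRUE(all.equal(counts_750, counts_50))
```

The plots returned by `make_plot()` can be passed to plot_grid in the same way as the original draws_*_p objects. Note that values falling outside `shared_breaks` would be dropped from the histogram, so the range should be chosen to cover all of the data.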