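r - Compute the density at a point within a group
Problem description
I am plotting some density curves and want to add a point at each group's mean. However, I want these points to sit on top of the density curve rather than at 0. Is there a way to get the density value at the mean within each group? The code is below: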
# make df
df <- data.frame(
  group = c("a", "b", "c"),
  value = rnorm(
    3000,
    mean = c(1, 2, 3),
    sd = c(1, 1.5, 1)
  )
)
library(tidyverse)
library(ggridges)
library(ggdist)
Approach 1: density ridges from the ggridges package
df %>%
  # calculate the mean value per group to use later
  group_by(group) %>%
  mutate(mean_value = mean(value)) %>%
  ggplot() +
  aes(x = value, y = group) +
  geom_density_ridges() +
  # could do with stat summary - blue points
  stat_summary(
    orientation = "y",
    fun = mean,
    geom = "point",
    color = "blue"
  ) +
  # or could do with geom_point using precalculated value (red points)
  # nudged so we can see both.
  geom_point(aes(x = mean_value, y = group),
    color = "red",
    position = position_nudge(x = .1)
  )
Approach 2: stat_halfeye from the ggdist package
df %>%
  group_by(group) %>%
  mutate(mean_value = mean(value)) %>%
  # mutate(mean_density = density(mean_value,value))
  ggplot() +
  aes(x = value, y = group) +
  stat_halfeye() +
  # could do with stat summary
  stat_summary(
    orientation = "y",
    fun = mean,
    geom = "point",
    color = "blue",
    alpha = .8
  ) +
  # or could do with geom_point using precalculated value
  # nudged so we can see both.
  geom_point(aes(x = mean_value, y = group),
    color = "red",
    position = position_nudge(x = .1)
  )
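Desired output: the blue or red points should sit on top of the density curve. So I need something like a "group + density value" aesthetic.
I would prefer to use approach 2 (ggdist) rather than geom_density_ridges.
Thanks
Solution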
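I'm not sure whether there is a way to compute the height of the density curve at the mean inside a ggplot geom/stat function, so I wrote a couple of helper functions to do it. dens_at_mean computes the height of the density curve at the mean of the data. get_mean_coords runs dens_at_mean by group, scales the height values to match the y-values generated by stat_halfeye, and returns a data frame that can be passed to geom_point.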
# Packages used below (also loaded in the question)
library(tidyverse)
library(ggdist)

# Reproducible data
set.seed(394)
df <- data.frame(
  group = c("a", "b", "c"),
  value = rnorm(
    3000,
    mean = c(1, 2, 3),
    sd = c(1, 1.5, 1)
  )
)

# Function to get height of density curve at mean value
dens_at_mean = function(x) {
  d = density(x)
  mean.x = mean(x)
  data.frame(mean.x = mean.x,
             max.y = max(d$y),
             # linear interpolation of the density grid at the mean
             mean.y = approx(d$x, d$y, xout = mean.x)$y)
}

# Function to return data frame with properly scaled heights
# to plot mean points
get_mean_coords = function(data, value.var, group.var) {
  data %>%
    group_by({{group.var}}) %>%
    summarise(vals = list(dens_at_mean({{value.var}}))) %>%
    ungroup() %>%
    unnest_wider(vals) %>%
    # Scale y-value to work properly with stat_halfeye:
    # height relative to the tallest curve, times 0.9, plus the group's row index
    mutate(mean.y = (mean.y / max(max.y) * 0.9 + 1:n())) %>%
    select(-max.y)
}

df %>%
  ggplot() +
  aes(x = value, y = group) +
  stat_halfeye() +
  geom_point(data = get_mean_coords(df, value, group),
             aes(x = mean.x, y = mean.y),
             color = "red", size = 2) +
  theme_bw() +
  scale_y_discrete(expand = c(0.08, 0.05))
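As a cross-check on the scaling step, the same coordinates can be computed in base R alone. This is only a minimal sketch and not part of the answer's code: the names per_group, coords, slab_scale and plot.y are made up for illustration. It assumes stat_halfeye is used with its default settings (scale = 0.9, heights normalized across all groups) and that the groups sit at y = 1, 2, 3 in alphabetical order. Depending on the ggdist version, its density estimate may use a different bandwidth than base density(), in which case the points can land slightly off the drawn curve.

```r
# Base-R sketch of the same computation (no tidyverse needed).
set.seed(394)
df <- data.frame(
  group = c("a", "b", "c"),
  value = rnorm(3000, mean = c(1, 2, 3), sd = c(1, 1.5, 1))
)

# Per group: the mean, the density height at the mean, and the peak height
per_group <- lapply(split(df$value, df$group), function(x) {
  d <- density(x)
  c(mean.x = mean(x),
    mean.y = approx(d$x, d$y, xout = mean(x))$y,
    max.y  = max(d$y))
})
coords <- as.data.frame(do.call(rbind, per_group))
coords$group <- rownames(coords)

# Assumption: the tallest slab spans 0.9 of the gap between groups,
# and group k has its baseline at y = k.
slab_scale <- 0.9
coords$plot.y <- coords$mean.y / max(coords$max.y) * slab_scale +
  seq_len(nrow(coords))

print(coords[, c("group", "mean.x", "plot.y")])
```

Each plot.y should fall between its group's baseline k and k + 0.9, which is a quick way to confirm the points stay inside their own slab.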