r - Change scale in geom_qq
Problem description
I'd like the x-axis of a Q-Q plot made with ggplot and geom_qq to show the numeric values of the variable rather than z-scores.
library("ggplot2")
coin_prob <- 0.5 # this is a fair coin
tosses_per_test <- 5000 # we want to flip a coin 5000 times
no_of_tests <- 1000
outcomes <- rbinom(n = no_of_tests,
                   size = tosses_per_test,
                   prob = coin_prob) / tosses_per_test
outcomes.df <- data.frame("results" = outcomes)
ggplot(outcomes.df, aes(sample = results)) +
  geom_qq() +
  geom_qq_line(color = "red") +
  labs(x = "Theoretical Data", title = "Simulated Coin toss", subtitle = "5000 tosses repeated 1000 times", y = "Sample Outcomes")
By default, ggplot puts z-scores (standard normal quantiles) on the x-axis rather than theoretical values on the scale of the data. I can hack around this by relabeling the axis breaks to get the "real" x-axis:
p <- ggplot(outcomes.df, aes(sample = results)) + geom_qq()
g <- ggplot_build(p)
raw_qs <- g$data[[1]]$theoretical*sd(outcomes.df$results) + mean(outcomes.df$results)
ggplot(outcomes.df, aes(sample = results)) +
  geom_qq() +
  geom_qq_line(color = "red") +
  labs(x = "Theoretical Data", title = "Simulated Coin toss", subtitle = "5000 tosses repeated 1000 times", y = "Sample Outcomes") +
  scale_x_continuous(
    breaks = seq(-3, 3, 1),
    labels = round(seq(-3, 3, 1) * sd(outcomes.df$results) + mean(outcomes.df$results), 2)
  )
But there's got to be something simpler.
Solution
Set the parameters of the distribution such that the theoretical quantiles match the distribution to which you're comparing.
library("ggplot2")
coin_prob <- 0.5 # this is a fair coin
tosses_per_test <- 5000 # we want to flip a coin 5000 times
no_of_tests <- 1000
outcomes <- rbinom(
  n = no_of_tests,
  size = tosses_per_test,
  prob = coin_prob
) / tosses_per_test
## Set dparams in both the geom_qq() and geom_qq_line() calls
## so that we're not comparing against the standard normal distribution.
ggplot(mapping = aes(sample = outcomes)) +
  geom_qq(dparams = list(mean = mean(outcomes), sd = sd(outcomes))) +
  geom_qq_line(
    dparams = list(mean = mean(outcomes), sd = sd(outcomes)),
    color = "red"
  ) +
  labs(
    x = "Theoretical Data",
    title = "Simulated Coin toss",
    subtitle = "5000 tosses repeated 1000 times",
    y = "Sample Outcomes"
  )
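To see why this gives the same axis as the manual rescaling in the question, here is a minimal sketch. It assumes the theoretical quantiles are evaluated at probability points like those returned by ppoints() (an assumption about how the stat picks its points), and x is a hypothetical stand-in for outcomes.

```r
# Hypothetical sample standing in for `outcomes`
set.seed(1)
x <- rbinom(n = 1000, size = 5000, prob = 0.5) / 5000

# Probability points at which the theoretical quantiles are evaluated
# (assumed here to come from ppoints())
p <- ppoints(length(x))

# Manual approach: standard normal quantiles rescaled to the data
raw_manual <- qnorm(p) * sd(x) + mean(x)

# dparams approach: quantiles of a normal with the sample mean and sd
raw_dparams <- qnorm(p, mean = mean(x), sd = sd(x))

# The two agree up to floating-point tolerance
same <- isTRUE(all.equal(raw_manual, raw_dparams))
print(same)
```

Because the normal family is a location-scale family, passing mean and sd through dparams only shifts and stretches the x-axis; the shape of the points relative to the reference line is unchanged.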
You can also change the distribution entirely. For example, to compare against uniform quantiles (e.g., for p-values):
pvals <- replicate(1000, cor.test(rnorm(100), rnorm(100))$p.value)
ggplot(mapping = aes(sample = pvals)) +
  geom_qq(distribution = stats::qunif) +
  geom_qq_line(
    distribution = stats::qunif,
    color = "red"
  ) +
  labs(
    x = "Uniform quantiles",
    title = "p-values under the null",
    subtitle = "1,000 null correlation tests",
    y = "Observed p-value"
  )