ggplot syntax for data distribution

Problem description

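I am trying to plot the data distribution of the beforeMinWageLaw and afterMinWageLaw variables, but when I store the data in df instead of seattleData, R says "Error: Aesthetics must be either length 1 or the same as the data (43): x". How can I fix this? Also, how can I make a normal probability plot to get a view of the normality of the data? Thanks.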

#Import Data
#seattleData <- read.table(file = file.choose(),
#                          header = TRUE, sep = ",")

library(ggplot2)

#Define Variables
 food_drink_workers <- seattleData$food_drink_workers
 MinWage <- seattleData$washington_state_minwage
 afterMinWageLaw <- food_drink_workers[304:346]
 beforeMinWageLaw <- food_drink_workers[1:303]
 df <- data.frame(seattleData)

#Display Data Distribution with ggplot
#after_stat(density) replaces the deprecated ..density.. notation (ggplot2 >= 3.3.0)
 x <- ggplot(df, aes(x = food_drink_workers)) +
   geom_histogram(mapping = aes(y = after_stat(density)), color = "black", fill = "white") +
   geom_density(alpha = .2, fill = "blue")
 x + geom_vline(xintercept = c(108.8636), linetype = "dashed", color = "red") +
   ggtitle("Distribution of the Data") + xlab("Seattle MSA Food and Drink Workers") + ylab("Density")

#Conduct Two Sample t-test
 options(scipen = 100)
 tTest <- t.test(beforeMinWageLaw, afterMinWageLaw, mu = 0, alternative = "less",
                 conf.level = .95, var.equal = FALSE, paired = FALSE)

You can download the data here: https://fred.stlouisfed.org/series/SMU53426607072200001SA


Tags: r, ggplot2, histogram, data-visualization, distribution

Solution


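You get the error message "Error: Aesthetics must be either length 1 or the same as the data (43): x" because every aesthetic mapped in aes() must either have length 1 or supply exactly one value per row of the layer's data. The vector afterMinWageLaw has 43 values and beforeMinWageLaw has 303 values, so I guess that is why you cannot reference both of them within the same aes().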
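The length rule can be checked with a small sketch. It uses synthetic data as a stand-in for seattleData (an assumption; only the row count of 346 is taken from the question) and captures the error instead of stopping the script:

```r
library(ggplot2)

# Synthetic stand-in for seattleData (assumption: 346 rows, as in the question).
set.seed(1)
seattleData <- data.frame(food_drink_workers = rnorm(346, mean = 110, sd = 10))
afterMinWageLaw <- seattleData$food_drink_workers[304:346]

# The data frame has 346 rows, but the vector mapped to x has only 43 values.
nrow(seattleData)        # 346
length(afterMinWageLaw)  # 43

p <- ggplot(seattleData, aes(x = afterMinWageLaw)) +
  geom_histogram(bins = 30)

# The length check runs when the plot is built, so capture the message there.
result <- tryCatch(ggplot_build(p), error = function(e) conditionMessage(e))
print(result)
```

The exact wording of the message depends on the installed ggplot2 version, but the cause is the same: the mapped vector and the data frame disagree in length.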

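I would use separate layers in one plot, so that each layer can have its own aesthetics with a different data length or number of rows. First, I would split your data into two data frames, one for the period before the law and one for the period after it. With ggplot you can reference different data frames within one plot, in your case for example: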

#set row index ranges for before and after law
row_index_range_before <- 1:303
row_index_range_after <- 304:346

#define two data frames
df_before <- data.frame(seattleData)[row_index_range_before, ]
df_after <- data.frame(seattleData)[row_index_range_after, ]

#display data distributions of both data frames with ggplot
#after_stat(density) replaces the deprecated ..density.. notation (ggplot2 >= 3.3.0)
x <- ggplot() + 
  geom_histogram(
    data = df_before
    ,mapping = aes(
      x = food_drink_workers
      ,y = after_stat(density)
      ,color = "blue")
    ,fill = "white") +
  geom_histogram(
    data = df_after
    ,mapping = aes(
      x = food_drink_workers
      ,y = after_stat(density)
      ,color = "red")
    ,fill = "white") +
  geom_density(
    data = df_before
    ,mapping = aes(
      x = food_drink_workers
      ,y = after_stat(density)
      ,fill = "blue")
    ,alpha = .2) +
  geom_density(
    data = df_after
    ,mapping = aes(
      x = food_drink_workers
      ,y = after_stat(density)
      ,fill = "red")
    ,alpha = .2) +
  scale_colour_manual(
    name = "Color"
    ,values = c("blue" = "blue", "red" = "red")
    ,labels = c("blue" = "Before Law", "red" = "After Law")) +
  scale_fill_manual(
    name = "Fill"
    ,values = c("blue" = "blue", "red" = "red")
    ,labels = c("blue" = "Before Law", "red" = "After Law"))

x + geom_vline(
  xintercept = c(108.8636)
  ,linetype = "dashed"
  ,color = "red") + 
  ggtitle("Distribution of the Data") + 
  xlab("Seattle MSA Food and Drink Workers") + 
  ylab("Density")

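But this way, I think you could also reference afterMinWageLaw and beforeMinWageLaw directly as x in aes() and remove the data arguments that reference the data frames.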
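A minimal sketch of that variant, assuming synthetic stand-ins for the two vectors (the means and standard deviations below are made up, not taken from the FRED series). Each layer maps its own vector to x and no data frame is passed to ggplot():

```r
library(ggplot2)

set.seed(1)
# Synthetic stand-ins for the two vectors (assumption, not the real data).
beforeMinWageLaw <- rnorm(303, mean = 105, sd = 8)
afterMinWageLaw  <- rnorm(43, mean = 125, sd = 5)

# Each layer gets its own vector, so the differing lengths never meet
# inside one aes() call.
p <- ggplot() +
  geom_histogram(
    aes(x = beforeMinWageLaw, y = after_stat(density), fill = "before"),
    bins = 30, alpha = 0.4) +
  geom_histogram(
    aes(x = afterMinWageLaw, y = after_stat(density), fill = "after"),
    bins = 30, alpha = 0.4) +
  scale_fill_manual(
    name = "Period",
    values = c(before = "blue", after = "red"),
    labels = c(before = "Before Law", after = "After Law")) +
  labs(x = "Seattle MSA Food and Drink Workers", y = "Density")

print(p)
```

The keys "before" and "after" are arbitrary labels chosen for this sketch; they only have to match between aes() and scale_fill_manual().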

To draw a legend, you need to set color or fill inside aes() and then add scale_colour_manual() or scale_fill_manual() to your plot.
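The question also asks about a normal probability plot, which the code above does not cover. One possible sketch uses ggplot2's stat_qq() and stat_qq_line(); the data frame below is a synthetic stand-in for df_before (an assumption), so swap in the real data frame to inspect the actual series:

```r
library(ggplot2)

set.seed(1)
# Synthetic stand-in for df_before (assumption, not the real data).
df_before <- data.frame(food_drink_workers = rnorm(303, mean = 105, sd = 8))

# Q-Q plot: sample quantiles against theoretical normal quantiles.
# Points close to the reference line suggest approximate normality.
qq <- ggplot(df_before, aes(sample = food_drink_workers)) +
  stat_qq() +
  stat_qq_line(color = "red") +
  labs(
    title = "Normal Q-Q Plot",
    x = "Theoretical Quantiles",
    y = "Sample Quantiles")

print(qq)
```

The same call with df_after would give the plot for the period after the law; with only 43 observations there, some scatter around the line is to be expected.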

