How to plot the conclusions of different tests




    Month District   Age Gender Education Disability Religion                          Occupation JobSeekers GMI
1 2020-01      Dan   U17   Male      None       None   Jewish              Unprofessional workers          2   0
2 2020-01      Dan   U17   Male      None       None  Muslims          Sales and costumer service          1   0
3 2020-01      Dan   U17 Female      None       None    Other                           Undefined          1   0
4 2020-01      Dan 18-24   Male      None       None   Jewish         Production and construction          1   0
5 2020-01      Dan 18-24   Male      None       None   Jewish                     Academic degree          1   0
6 2020-01      Dan 18-24   Male      None       None   Jewish Practical engineers and technicians          1   0
  ACU NACU NewSeekers NewFiredSeekers
1   0    2          0               0
2   0    1          0               0
3   0    1          0               0
4   0    1          0               0
5   0    1          0               0
6   0    1          1               1

I reduced it to what each test needs; for example, for the t-test I did:

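library(dplyr)

dist.newseek <- Cdata %>% 
  group_by(Month,District) %>% 
  summarise(NewSeekers = sum(NewSeekers)) # assumed: this step was cut off; a per-group sum matches the integer NewSeekers column in the output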

  Month   District  NewSeekers
  <chr>   <chr>          <int>
1 2020-01 Dan             6551
2 2020-01 Jerusalem       3589
3 2020-01 North           6154
4 2020-01 Sharon          4131
5 2020-01 South           4469
6 2020-02 Dan             5529

Then I ran the t-test:

t.test(NewSeekers ~ District,data=subset(dist.newseek,District %in% c("Dan","South")))

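Below are all the tests I ran for each grouping (a t-test of new seekers by district, a Wilcoxon test of new seekers by age, and an ANOVA of new seekers by occupation). I am looking for a graphical way to display the result of each test. If you have any ideas, please help.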

# t test for district vs new seekers

# sorting

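dist.newseek <- Cdata %>% 
  group_by(Month,District) %>% 
  summarise(NewSeekers = sum(NewSeekers)) # assumed: the aggregation step was cut off; summing per group is a guess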

# performing a t test on the mini table we created

t.test(NewSeekers ~ District,data=subset(dist.newseek,District %in% c("Dan","South")))

# results

Welch Two Sample t-test

data:  NewSeekers by District
t = 0.68883, df = 4.1617, p-value = 0.5274
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
  -119952.3  200737.3
sample estimates:
  mean in group Dan mean in group South 
           74608.25            34215.75 
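
One possible way to show a two-group result like this graphically is a box plot of the two districts with the test statistic and p-value written into the subtitle. This is only a minimal sketch, assuming ggplot2 is installed. The data frame `toy` is simulated purely for illustration (it is not the Cdata from the post), and the names `toy`, `tt` and `p_ttest` are made up here.

```r
library(ggplot2)

# Simulated stand-in for dist.newseek (hypothetical values, not the real data)
set.seed(1)
toy <- data.frame(
  Month      = rep(sprintf("2020-%02d", 1:4), times = 2),
  District   = rep(c("Dan", "South"), each = 4),
  NewSeekers = c(rpois(4, 6000), rpois(4, 4500))
)

# Same call shape as in the post: Welch two-sample t-test
tt <- t.test(NewSeekers ~ District, data = toy)

# Box plot of the two groups with the test result in the subtitle
p_ttest <- ggplot(toy, aes(x = District, y = NewSeekers)) +
  geom_boxplot(outlier.shape = NA) +
  geom_jitter(width = 0.1) +
  labs(
    title    = "New seekers by district",
    subtitle = sprintf("Welch t-test: t = %.2f, df = %.2f, p = %.4f",
                       tt$statistic, tt$parameter, tt$p.value)
  )
print(p_ttest)
```

With the real data, `toy` would be replaced by the `subset(dist.newseek, ...)` expression already used in the `t.test()` call.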

#wilcoxon test 

# filtering Cdata to New seekers based on month and age

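age.newseek <- Cdata %>% 
  group_by(Month,Age) %>% 
  summarise(NewSeekers = sum(NewSeekers)) # assumed: the aggregation step was cut off; summing per group is a guess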

#performing a wilcoxon test on the subset 

wilcox.test(NewSeekers ~ Age,data=subset(age.newseek,Age %in% c("25-34","45-54")))

# Results

Wilcoxon rank sum exact test

data:  NewSeekers by Age
W = 11, p-value = 0.4857
alternative hypothesis: true location shift is not equal to 0
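
To put several test conclusions side by side, one option is a small dot plot of the p-values against a significance threshold. The sketch below uses only the two p-values printed in the outputs above. The 0.05 cut-off is an assumption (the post does not state a significance level), and the names `res` and `p_summary` are invented for this example.

```r
library(ggplot2)

# p-values copied from the t-test and Wilcoxon outputs shown in the post
res <- data.frame(
  test    = c("Welch t-test (Dan vs South)", "Wilcoxon (25-34 vs 45-54)"),
  p_value = c(0.5274, 0.4857)
)

# One dot per test; the dashed line marks the assumed 5% significance level
p_summary <- ggplot(res, aes(x = p_value, y = test)) +
  geom_point(size = 3) +
  geom_vline(xintercept = 0.05, linetype = "dashed") +
  xlim(0, 1) +
  labs(x = "p-value", y = NULL, title = "Test results at a glance")
print(p_summary)
```

Both dots fall to the right of the dashed line, which matches the two non-significant results reported above.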


# Sorting occupation and month by new seekers

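occu.newseek <- Cdata %>% 
  group_by(Month,Occupation) %>% 
  summarise(NewSeekers = sum(NewSeekers)) # assumed: the aggregation step was cut off; summing per group is a guess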

## Convert Occupation to a factor

occu.newseek$Occupation <- as.factor(occu.newseek$Occupation)

## Get the occupation group means and standard deviations

group.mean.sd <- aggregate(
  x = occu.newseek$NewSeekers, # Specify data column
  by = list(occu.newseek$Occupation), # Specify group indicator
  FUN = function(x) c('mean'=mean(x),'sd'= sd(x))
)

## Run one way ANOVA test
anova_one_way <- aov(NewSeekers~ Occupation, data = occu.newseek)

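## Run the Tukey Test to compare the groups 
TukeyHSD(anova_one_way) # assumed: the call was missing here; TukeyHSD() is the base R function for this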

## Check the mean differences across the groups 

ggplot(occu.newseek, aes(x = Occupation, y = NewSeekers, fill = Occupation)) +
  geom_boxplot() +
  geom_jitter(shape = 15,
              color = "steelblue",
              position = position_jitter(0.21))
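
For the ANOVA follow-up, the pairwise Tukey comparisons can be drawn as confidence intervals around the mean differences: an interval that crosses zero corresponds to a pair that does not differ significantly. This is a rough sketch on simulated data, not the poster's result. The groups "A", "B", "C", their values, and the names `toy_occ`, `fit`, `tukey_df` and `p_tukey` are all hypothetical.

```r
library(ggplot2)

# Simulated stand-in for occu.newseek (hypothetical groups and values)
set.seed(2)
toy_occ <- data.frame(
  Occupation = factor(rep(c("A", "B", "C"), each = 6)),
  NewSeekers = c(rnorm(6, 100, 10), rnorm(6, 120, 10), rnorm(6, 105, 10))
)

# One-way ANOVA followed by Tukey's honest significant differences
fit   <- aov(NewSeekers ~ Occupation, data = toy_occ)
tukey <- TukeyHSD(fit)

# One row per pairwise comparison: diff, lwr, upr, p adj
tukey_df      <- as.data.frame(tukey$Occupation)
tukey_df$pair <- rownames(tukey_df)

# Point = estimated difference, segment = 95% family-wise confidence interval
p_tukey <- ggplot(tukey_df, aes(y = pair)) +
  geom_segment(aes(x = lwr, xend = upr, yend = pair)) +
  geom_point(aes(x = diff), size = 3) +
  geom_vline(xintercept = 0, linetype = "dashed") +
  labs(x = "Difference in mean NewSeekers", y = NULL)
print(p_tukey)
```

Base R offers a similar picture without ggplot2 via `plot(tukey)`.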



Tags: r, plot, statistics, anova

