Baseball players' salaries and leagues

Question

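I'm learning R, and I'm using the Lahman dataset to determine whether players' salaries affect how they or their teams play. While digging into the data, I became curious whether a player's salary varies with the league he plays in (AL or NL). I wrote this program to test whether salary and league are dependent, and I was surprised to find that they are. Is this the right way to go about answering this question?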

library(tidyverse)
library(Lahman)

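# Bring salary information together with batting, joining on the columns
# the two tables share (stated explicitly instead of relying on a natural join)
bat_salaries <- left_join(Batting, Salaries, by = c("playerID", "yearID", "teamID", "lgID"))
# Add team-level columns; overlapping column names get a suffix
bat_salaries <- left_join(bat_salaries, Teams, by = c("yearID", "teamID", "lgID"), suffix = c("_individual", "_team"))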

# bat_salaries$salary has a very heavy right tail above the third quartile (Q3),
# so I only look at positive salaries below that cutoff

bat_salaries_iqr3 <- bat_salaries %>%
  filter(salary < 2350000 & salary > 0)

# Keep salary plus a 0/1 league indicator (1 = NL, 0 = otherwise)
bat_salaries_chi <- bat_salaries_iqr3 %>%
  transmute(salary, leagID = ifelse(lgID == "NL", 1, 0))

chisq.test(table(bat_salaries_chi), correct = FALSE)


Pearson's Chi-squared test

data:  table(bat_salaries_chi)
X-squared = 2462.6, df = 2139, p-value = 1.13e-06

Tags: r, statistics

Solution
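
One thing the output in the question shows: df = 2139 on a two-column table means salary entered the test as a category with 2140 distinct levels. With that many levels most cells hold very few observations, and the chi-squared approximation is generally considered unreliable when expected counts are small. Below is a minimal sketch of two alternatives, not a definitive analysis: a Wilcoxon rank-sum test that treats salary as a number, and a chi-squared test on salary grouped into quartile bands. The helper name `compare_salary_by_league` and the choice of four bands are assumptions made for illustration and do not come from the original post. The sketch also pools all seasons, so salary growth over time is not accounted for.

```r
# Hypothetical helper (not from the original post): compares salaries
# between leagues for a data frame with `salary` and `lgID` columns.
compare_salary_by_league <- function(df) {
  # Keep rows with a known, positive salary in one of the two leagues
  df <- df[!is.na(df$salary) & df$salary > 0 & df$lgID %in% c("AL", "NL"), ]
  df$lgID <- factor(df$lgID)  # drops any unused league levels

  # Rank-based test on salary as a number; exact = FALSE requests the
  # normal approximation, which is the sensible choice when salaries tie
  rank_test <- wilcox.test(salary ~ lgID, data = df, exact = FALSE)

  # Chi-squared test on a small table: salary quartile band x league
  breaks <- unique(quantile(df$salary, probs = seq(0, 1, 0.25)))
  band <- cut(df$salary, breaks = breaks, include.lowest = TRUE)
  chi_test <- chisq.test(table(band, df$lgID))

  list(wilcoxon = rank_test, chi_squared = chi_test)
}

# Usage with the Lahman salary table, run only if the package is installed
if (requireNamespace("Lahman", quietly = TRUE)) {
  result <- compare_salary_by_league(Lahman::Salaries)
  print(result$wilcoxon)
  print(result$chi_squared)
}
```

With samples this large, even a tiny difference can produce a small p-value, so it is worth comparing medians per league alongside the test result before concluding that league matters in practice.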

