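r - Factor error when aggregating a table, but no column is a factor
Problem description
I loaded my data from a csv file. I tried reading it with stringsAsFactors = FALSE
but I still get the error. The first 13 columns are strings, and the remaining columns (from column 14 onward) are all numeric. Here is the core code: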
library("readxl")  # loaded in the original post; read.csv() does not need it
# Read the data with stringsAsFactors = FALSE
data <- read.csv("PFR csvData.csv", stringsAsFactors = FALSE)
# Convert all numeric columns (14 onward) to numeric
data[, 14:ncol(data)] <- as.numeric(as.character(unlist(data[, 14:ncol(data)])))
# Convert all string columns (1 to 13) to character
data[, 1:13] <- as.character(unlist(data[, 1:13]))
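The same conversion can also be done one column at a time with lapply(), which avoids flattening the whole block with unlist(). A minimal sketch, assuming a small made-up data frame `toy` (hypothetical, not the poster's file) with two string columns followed by numeric-looking columns:

```r
# Hypothetical toy data standing in for the CSV; the column names are made up.
toy <- data.frame(
  Player = c("A", "B", "A"),
  Tm     = c("X", "Y", "X"),
  Yds    = c("10", "20", "30"),
  TD     = c("1", "0", "2"),
  stringsAsFactors = FALSE
)

chr_cols <- 1:2           # string columns
num_cols <- 3:ncol(toy)   # numeric-looking columns

# Convert each column on its own; as.character() first guards against factors.
toy[num_cols] <- lapply(toy[num_cols], function(x) as.numeric(as.character(x)))
toy[chr_cols] <- lapply(toy[chr_cols], as.character)

sapply(toy, class)
```

Working per column keeps each column's length explicit, so a mismatch shows up as an error instead of values silently landing in the wrong place.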
When I check the class of each column with sapply(data, class)
I get:
Rk               Player           Pos              Age              Date             Lg               Tm
"character"      "character"      "character"      "character"      "character"      "character"      "character"
H.A              Opp              Result           G.               Week             Day              Receiving_Tgt
"character"      "character"      "character"      "character"      "character"      "character"      "numeric"
Receiving_Rec    Receiving_Yds    Receiving_Y.R    Receiving_TD     Receiving_Ctch.  Receiving_Y.Tgt  Receiving_PPR
"numeric"        "numeric"        "numeric"        "numeric"        "numeric"        "numeric"        "numeric"
Passing_Cmp      Passing_Att      Passing_Cmp.     Passing_Yds      Passing_TD       Passing_Int      Passing_Rate
"numeric"        "numeric"        "numeric"        "numeric"        "numeric"        "numeric"        "numeric"
Passing_Sk       Passing_Sk_Yds   Passing_Y.A      Passing_AY.A     Passing_PPR      Rushing_Att      Rushing_Yds
"numeric"        "numeric"        "numeric"        "numeric"        "numeric"        "numeric"        "numeric"
Rushing_Y.A      Rushing_TD       Rushing_Half_PPR Total_Half_PPR
"numeric"        "numeric"        "numeric"        "numeric"
I also checked for NAs with apply(data, 2, function(x) any(is.na(x)))
and found none:
Rk               Player           Pos              Age              Date             Lg               Tm
FALSE            FALSE            FALSE            FALSE            FALSE            FALSE            FALSE
H.A              Opp              Result           G.               Week             Day              Receiving_Tgt
FALSE            FALSE            FALSE            FALSE            FALSE            FALSE            FALSE
Receiving_Rec    Receiving_Yds    Receiving_Y.R    Receiving_TD     Receiving_Ctch.  Receiving_Y.Tgt  Receiving_PPR
FALSE            FALSE            FALSE            FALSE            FALSE            FALSE            FALSE
Passing_Cmp      Passing_Att      Passing_Cmp.     Passing_Yds      Passing_TD       Passing_Int      Passing_Rate
FALSE            FALSE            FALSE            FALSE            FALSE            FALSE            FALSE
Passing_Sk       Passing_Sk_Yds   Passing_Y.A      Passing_AY.A     Passing_PPR      Rushing_Att      Rushing_Yds
FALSE            FALSE            FALSE            FALSE            FALSE            FALSE            FALSE
Rushing_Y.A      Rushing_TD       Rushing_Half_PPR Total_Half_PPR
FALSE            FALSE            FALSE            FALSE
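Note that apply() converts a data frame to a matrix before looping, so the per-column types are lost along the way. A minimal sketch of the same checks with sapply(), which visits each column as it is stored; `toy_na` is a hypothetical data frame with one NA planted on purpose:

```r
# Hypothetical toy data; the NA in Yds is planted on purpose.
toy_na <- data.frame(
  Player = c("A", "B", "C"),
  Yds    = c(10, NA, 30),
  stringsAsFactors = FALSE
)

has_na   <- sapply(toy_na, anyNA)       # one TRUE/FALSE per column
na_count <- colSums(is.na(toy_na))      # number of NAs per column
is_fct   <- sapply(toy_na, is.factor)   # which columns are factors

has_na
na_count
is_fct
```

On a data frame like the one in the question, all three checks would come back clean, which suggests the factor in the error message is created later, inside the aggregate() call.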
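So at this point I believe I have loaded the data without factors, made sure no column is a factor by forcing the types, and double-checked by looking at the class of every column. I also made sure there are no NAs.
However, when I call aggregate, I get an error related to factors: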
aggregate(data$Player, by = list(data$Total_Half_PPR), FUN = sum)
Error in Summary.factor(291L, na.rm = FALSE) :
‘sum’ not meaningful for factors
I don't know what else to try. Any help is appreciated!
Solution
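The call passes "Player" as the values to be summed, and by the time sum sees it, it is a factor (hence Summary.factor in the error message). Summing it only makes sense if the column actually holds numbers, in which case we need to convert it to numeric:
data$Player <- as.numeric(as.character(data$Player))
If what we need is the sum of "Total_Half_PPR" for each player, swap the arguments so that the numeric column is summed and "Player" defines the groups:
aggregate(data$Total_Half_PPR, by = list(data$Player), FUN = sum)
or use the formula method:
aggregate(Total_Half_PPR ~ Player, data, FUN = sum)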
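As far as I can tell, the factor appears because aggregate() passes its first argument through as.data.frame(), which converted character vectors to factors by default in R versions before 4.0.0; treat that explanation as an assumption about the poster's setup. The grouped sum itself can be sketched on a small made-up data frame (`toy_agg` and its values are hypothetical):

```r
# Hypothetical toy data: two players, two games each.
toy_agg <- data.frame(
  Player         = c("A", "B", "A", "B"),
  Total_Half_PPR = c(1.5, 2.0, 3.0, 4.5),
  stringsAsFactors = FALSE
)

# The numeric column is summed; the character column only defines the groups.
by_player <- aggregate(Total_Half_PPR ~ Player, data = toy_agg, FUN = sum)

# The same totals with tapply(), which returns them named by player.
totals <- tapply(toy_agg$Total_Half_PPR, toy_agg$Player, sum)

by_player
totals
```

The formula form reads as "Total_Half_PPR by Player", which makes it harder to swap the summed column and the grouping column by accident.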