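r - R function/loop computes but returns unwanted results; suspected syntax/class/subsetting problem
Problem description
I know there are simpler ways to compute a mean by factor (e.g., tapply/table), but I am keen to learn loops and their syntax, along with the related class/subsetting issues.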
# b) use a loop and conditionals to sum and then divide (long way)
# test calc of total length by species
sum(iris$Petal.Length[iris$Species == "setosa"])
sum(iris$Petal.Length[iris$Species == "versicolor"])
# test calc of count rows of a species
nrow(subset(iris, Species == "setosa"))
nrow(subset(iris, Species == "versicolor" ))
# test calc of mean (long way)
test1 <- sum(iris$Petal.Length[iris$Species == "setosa"]) / nrow(subset(iris, Species == "setosa"))
test1
test2 <- sum(iris$Petal.Length[iris$Species == "versicolor"]) / nrow(subset(iris, Species == "versicolor" ))
test2
# attempt at function, ideally should return the mean by factor, when you enter the Species name
calc_mean_factor <- function() {
  # spec_levels <- c(levels(iris$Species)) # levels as a vector, commented to exclude from test calcs
  spec_levels <- levels(iris$Species) # obtains the levels of the factors - should this be vector/factor?
  x <- length(spec_levels) # creates numerical range cap for the loop
  for(i in 1:x){
    tot_spec <- sum(iris$Petal.Length[iris$Species] == spec_levels[i]) # is this correct syntax for loop?
    count_spec <- nrow(subset(iris, Species == spec_levels[i])) # is this correct syntax for loop?
    mean_spec <- tot_spec / count_spec # is this correct syntax for loop?
  }
  # print tests to check if calculating as expected
  print(spec_levels[1:x]) # this shows the correct names
  print(spec_levels) # same as above
  print(spec_levels[2]) # passes subset test
  print(class(spec_levels)) # should this class be 'factor', 'vector', or this is ok?
  print(x) # as expected, it is the length of the species of 3
  print(class(x)) # returns integer, assume this is ok
  print(1:x) # as expected, the range from 1 to 3
  print(sum(iris$Petal.Length[iris$Species])) # this function is calculating, but returning total of 205
  print(sum(iris$Petal.Length[iris$Species] == spec_levels[1])) # why is the function not accepting the subset? is it due to class?
  print(sum(iris$Petal.Length[iris$Species] == "setosa")) # why is this returning 0
  print(sum(iris$Petal.Length[iris$Species] == spec_levels[1:x])) # why is this returning 0
  print(tot_spec) # expected "0" due to above tests returning 0
  print(count_spec[1:x]) # 50 is expected, but why not printing for all three species?
}
calc_mean_factor()
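Solution
In your for loop you assign the length-1 vectors tot_spec and count_spec three times, each iteration overwriting the previous value, so only the result for the last species survives. Allocate an empty vector first, then assign each value by index.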
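tot_spec <- count_spec <- vector() # empty vectors, filled one element per species
for(i in 1:x){
  # the closing bracket goes after the comparison, so rows are selected by species name
  tot_spec[i] <- sum(iris$Petal.Length[iris$Species == spec_levels[i]])
  count_spec[i] <- nrow(subset(iris, Species == spec_levels[i]))
}
mean_spec <- tot_spec / count_spec # element-wise division: one mean per species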
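The two points above (index by a logical comparison, and fill one slot per species) can be put together in a minimal sketch. The helper name `calc_mean_by_species` and its arguments are hypothetical, chosen only for illustration; they are not from the original post. The first few lines also show what probably produced the 205 and the 0 in the question: indexing with a factor uses its integer codes rather than its labels.

```r
# Indexing by a factor uses its integer codes (1, 2, 3), not the species labels,
# so this picks the first three petal lengths, 50 times each.
by_codes <- iris$Petal.Length[iris$Species]
unique(as.integer(iris$Species))
sum(by_codes)                 # the total of 205 reported in the question

# Comparing those numbers with a species name is never TRUE, hence the sums of 0.
sum(by_codes == "setosa")

# Hypothetical helper: name and arguments are assumptions for illustration.
calc_mean_by_species <- function(data = iris, value = "Petal.Length", group = "Species") {
  lvls <- levels(data[[group]])        # character vector of level names
  means <- numeric(length(lvls))       # preallocate one slot per level
  names(means) <- lvls
  for (i in seq_along(lvls)) {
    rows <- data[[group]] == lvls[i]   # logical index: TRUE for rows of this level
    means[i] <- sum(data[[value]][rows]) / sum(rows)
  }
  means
}

means <- calc_mean_by_species()
means["setosa"]                        # look up a single species by name

# Cross-check against the tapply shortcut mentioned in the question.
all.equal(unname(means), as.numeric(tapply(iris$Petal.Length, iris$Species, mean)))
```

Returning a named vector lets the caller pick a species by name, which is close to what the question's "enter the Species name" comment asks for. `seq_along(lvls)` is used instead of `1:x` because it also behaves sensibly when there are no levels.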