R function/loop calculates but returns unwanted results; suspected syntax/class/subsetting issue

Problem description

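I know there are simpler ways to calculate a mean by factor (e.g., tapply/table), but I am keen to learn loops and their syntax, and to understand the class/subsetting/syntax issues I am running into.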


# b) use a loop and conditionals to sum and then divide (long way)
# test calc of total length by species
sum(iris$Petal.Length[iris$Species == "setosa"])
sum(iris$Petal.Length[iris$Species == "versicolor"])

# test calc of count rows of a species
nrow(subset(iris, Species == "setosa"))
nrow(subset(iris, Species == "versicolor"))

# test calc of mean (long way)
test1 <- sum(iris$Petal.Length[iris$Species == "setosa"]) / nrow(subset(iris, Species == "setosa"))
test1

test2 <- sum(iris$Petal.Length[iris$Species == "versicolor"]) / nrow(subset(iris, Species == "versicolor"))
test2


# attempt at function, ideally should return the mean by factor, when you enter the Species name
calc_mean_factor <- function() {
  # spec_levels <- c(levels(iris$Species)) # levels as a vector, commented to exclude from test calcs
  spec_levels <- levels(iris$Species) # obtains the levels of the factors - should this be vector/factor?
  x <- length(spec_levels) # creates numerical range cap for the loop
  for (i in 1:x) {
    tot_spec <- sum(iris$Petal.Length[iris$Species] == spec_levels[i]) # is this correct syntax for loop?
    count_spec <- nrow(subset(iris, Species == spec_levels[i])) # is this correct syntax for loop?
    mean_spec <- tot_spec / count_spec # is this correct syntax for loop?
  }

  # print tests to check if calculating as expected
  print(spec_levels[1:x]) # this shows the correct names
  print(spec_levels) # same as above
  print(spec_levels[2]) # passes subset test
  print(class(spec_levels)) # should this class be 'factor', 'vector', or this is ok?
  print(x) # as expected, it is the length of the species of 3
  print(class(x)) # returns integer, assume this is ok
  print(1:x) # as expected, the range from 1 to 3
  print(sum(iris$Petal.Length[iris$Species])) # this function is calculating, but returning total of 205
  print(sum(iris$Petal.Length[iris$Species] == spec_levels[1])) # why is the function not accepting the subset? is it due to class?
  print(sum(iris$Petal.Length[iris$Species] == "setosa")) # why is this returning 0
  print(sum(iris$Petal.Length[iris$Species] == spec_levels[1:x])) # why is this returning 0
  print(tot_spec) # expected "0" due to above tests returning 0
  print(count_spec[1:x]) # 50 is expected, but why not printing for all three species?
}

calc_mean_factor()

Tags: r, function, loops

Solution


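In your for loop, you assign a length-1 value to tot_spec and count_spec three times, so each pass overwrites the previous one and only the result for the last species is left. Allocate a vector first, then assign into it by index.

Note also that the closing bracket in the sum() line is misplaced. iris$Petal.Length[iris$Species] indexes by the factor's integer codes, which is where the total of 205 comes from, and comparing those numbers with a species name is never TRUE, so the sum is 0. The comparison belongs inside the brackets.

tot_spec <- count_spec <- numeric(x) # one slot per species; x and spec_levels as defined in the question
for (i in seq_len(x)) {
  tot_spec[i] <- sum(iris$Petal.Length[iris$Species == spec_levels[i]]) # comparison inside the brackets
  count_spec[i] <- nrow(subset(iris, Species == spec_levels[i]))
}
mean_spec <- tot_spec / count_spec # vectorised, so it can be computed once after the loop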
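
To make the two points in this answer concrete, here is a minimal sketch rather than a definitive implementation. It first reproduces the 205 and 0 from the question, then wraps the loop in a function that returns one mean per species. The `values` and `groups` arguments and the helper names `by_code`, `in_group` and `loop_means` are illustrative assumptions, not part of the original post.

```r
# iris ships with base R (datasets package), so nothing needs to be installed.

# 1) Why the question's expression gave 205 and 0:
#    indexing with a factor uses its integer codes (1, 2, 3), not its labels.
by_code <- iris$Petal.Length[iris$Species] # elements 1, 2 and 3, each repeated 50 times
print(sum(by_code))                        # the total of 205 reported in the question
print(sum(by_code == "setosa"))            # 0: a number never equals a species name

# 2) Hypothetical rewrite of calc_mean_factor(); both arguments are assumptions
calc_mean_factor <- function(values = iris$Petal.Length, groups = iris$Species) {
  spec_levels <- levels(groups)            # character vector of level names
  means <- numeric(length(spec_levels))    # pre-allocate one slot per level
  names(means) <- spec_levels
  for (i in seq_along(spec_levels)) {
    in_group <- groups == spec_levels[i]   # logical vector: TRUE for rows of this level
    means[i] <- sum(values[in_group]) / sum(in_group)
  }
  means
}

loop_means <- calc_mean_factor()
print(loop_means)

# 3) Cross-check against the tapply approach mentioned in the question
print(all.equal(unname(loop_means),
                as.vector(tapply(iris$Petal.Length, iris$Species, mean))))
```

Counting with sum(in_group) instead of nrow(subset(...)) is a design choice, not a correction: both give the group size, but reusing the logical vector keeps the total and the count tied to exactly the same rows.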
