R: Data set becomes empty when looping?

Question

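I have a set of athlete records from openpowerlifting.org, and I want to retrieve all athletes in a certain division. The entries have the form "Meet ID Name Sex Equipment Age Division ...", and I want to extract everyone who has competed in a given division. Here is my code: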

powerlift <- read.csv("openpowerlifting.csv",header = TRUE,fill = TRUE,stringsAsFactors = FALSE )

n = length(powerlift$TotalKg)

UPA_Open = as.data.frame(matrix(c(rep(0,n*17)),ncol=17))
j=1

for(i in 1:n){
    if(powerlift$Divison[i]=="UPA Open"){
        UPA_Open[j,] = powerlift[i,]
        j = j + 1
    }
 }

I get the following error:

Error in if (powerlift$Divison[i] == "UPA Open") { : 
  argument is of length zero

Inspecting the data set after the run gives:

> i
[1] 1
> powerlift$Division[i]
[1] "Mst 45-49"
> powerlift$Division[i] == "Mst 45-49"
[1] TRUE

So the loop stops on its first iteration, claiming the data is empty, when it clearly is not. What is going on?

Tags: r

Answer


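To avoid the XY problem, and given that you "want to retrieve all athletes from a certain division", here is an alternative approach to your problem: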

# Simulating your data
Division <- c("UPA Open", "DEF", "GHI", "UPA Open", "UPA Open")
someColumn <- c("athlete1", "athlete2", "athlete3", "athlete4" , "athlete5")
otherColumn <- c(11, 22, 33, 44, 55)
powerlift <- data.frame(someColumn, otherColumn, Division)
print(powerlift)

# The actual solution
UPA_Open <- powerlift[powerlift$Division == "UPA Open", ]
print(UPA_Open)

Explanation:

# Explanation line by line
pos <- powerlift$Division == "UPA Open" # pos is a logical vector: TRUE for each row whose Division equals "UPA Open", FALSE otherwise
print(pos) # check the contents of pos
UPA_Open <- powerlift[pos, ] # keep only the rows of powerlift where pos is TRUE; the indexing form is powerlift[<<rows>>, <<columns>>]
print(UPA_Open) # print the result
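As for the error itself, a likely cause is the spelling of the column name: the loop tests `powerlift$Divison`, while the debugging session in the question uses `powerlift$Division`. Assuming the CSV column really is named `Division`, the misspelled name matches nothing and `$` returns NULL, so the `if` condition has length zero. The sketch below reproduces this on a small made-up data frame (the rows are hypothetical, not taken from the real file):

```r
# Hypothetical stand-in for the real data; only the Division column matters here
powerlift <- data.frame(
  Name = c("athlete1", "athlete2"),
  Division = c("Mst 45-49", "UPA Open"),
  stringsAsFactors = FALSE
)

# The misspelled name matches no column, so `$` returns NULL
misspelled <- powerlift$Divison
print(is.null(misspelled))

# Indexing NULL gives NULL, and comparing NULL with a string gives logical(0)
cond <- misspelled[1] == "UPA Open"
print(length(cond))

# if() needs a length-one condition, so a zero-length one raises an error;
# tryCatch captures the message instead of stopping the script
msg <- tryCatch(
  if (cond) "match" else "no match",
  error = function(e) conditionMessage(e)
)
print(msg)

# With the correct spelling the comparison has length one and works
print(powerlift$Division[1] == "UPA Open")
```

If that is indeed the cause, correcting the spelling inside the loop would also make the original code run, although the vectorized subsetting shown above is shorter and avoids preallocating `UPA_Open`.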

Hope this helps! :)

