首页 > 解决方案 > 使用 LOOP 将单元格中的缺失值替换为上述单元格 (n-1) 中的值

问题描述

我有一个包含数千行的数据文件,其中有我希望用一个值填充的空白。我需要用上面的值替换空单元格。让您了解我的数据是什么样子会更容易,这是一个示例

Variable <- c("AGE","","","","SEX","","SEGMENT","","","","")    
Value <- c(1, 2, 3, 4, 1, 2, 1, 2, 3, 4, 5)
Description <- c("18-24","25-34","35-44","45+","Female","Male","A","B","C","D","E")
df <- data.frame(Variable, Value, Description)

> df
   Variable Value Description
1       AGE     1       18-24
2               2       25-34
3               3       35-44
4               4         45+
5       SEX     1      Female
6               2        Male
7   SEGMENT     1           A
8               2           B
9               3           C
10              4           D
11              5           E

正如您在上面看到的,第一列有间隙。我需要将这些空单元格替换为上面的相关值,以便新变量在数据框中看起来像这样

> df
   Variable Value Description Variable_NEW
1       AGE     1       18-24               AGE
2               2       25-34               AGE
3               3       35-44               AGE
4               4         45+               AGE
5       SEX     1      Female               SEX
6               2        Male               SEX
7   SEGMENT     1           A           SEGMENT
8               2           B           SEGMENT
9               3           C           SEGMENT
10              4           D           SEGMENT
11              5           E           SEGMENT

大声思考。我假设要实现这一点,我需要用循环创建一个新变量,然后使用这样的逻辑

    IF Variable[n]="" THEN Variable_New[n] = Variable[n-1], 
               ELSE Variable_New[n] = Variable[n]

我熟悉循环,但不知道如何在具有 lag/n-1 功能的 R 中编写这种东西。可能有很多方法可以实现这一点,但最好使用循环。任何帮助将不胜感激。谢谢

标签: rloopsmissing-data

解决方案


这里有一个循环方法:

#Data
Variable <- c("AGE","","","","SEX","","SEGMENT","","","","")    
Value <- c(1, 2, 3, 4, 1, 2, 1, 2, 3, 4, 5)
Description <- c("18-24","25-34","35-44","45+","Female","Male","A","B","C","D","E")
df <- data.frame(Variable, Value, Description,stringsAsFactors = F)
#Create new column
df$NewVar <- df$Variable
#Loop
for(i in 2:dim(df)[1])
{
  df$NewVar[i] <- ifelse(df$NewVar[i]=="",df$NewVar[i-1],df$NewVar[i])
}

输出:

   Variable Value Description  NewVar
1       AGE     1       18-24     AGE
2               2       25-34     AGE
3               3       35-44     AGE
4               4         45+     AGE
5       SEX     1      Female     SEX
6               2        Male     SEX
7   SEGMENT     1           A SEGMENT
8               2           B SEGMENT
9               3           C SEGMENT
10              4           D SEGMENT
11              5           E SEGMENT

推荐阅读