r - 使用 LOOP 将单元格中的缺失值替换为上述单元格 (n-1) 中的值
问题描述
我有一个包含数千行的数据文件,其中有我希望用一个值填充的空白。我需要用上面的值替换空单元格。让您了解我的数据是什么样子会更容易,这是一个示例
Variable <- c("AGE","","","","SEX","","SEGMENT","","","","")
Value <- c(1, 2, 3, 4, 1, 2, 1, 2, 3, 4, 5)
Description <- c("18-24","25-34","35-44","45+","Female","Male","A","B","C","D","E")
df <- data.frame(Variable, Value, Description)
> df
Variable Value Description
1 AGE 1 18-24
2 2 25-34
3 3 35-44
4 4 45+
5 SEX 1 Female
6 2 Male
7 SEGMENT 1 A
8 2 B
9 3 C
10 4 D
11 5 E
正如您在上面看到的,第一列有间隙。我需要将这些空单元格替换为上面的相关值,以便新变量在数据框中看起来像这样
> df
Variable Value Description Variable_NEW
1 AGE 1 18-24 AGE
2 2 25-34 AGE
3 3 35-44 AGE
4 4 45+ AGE
5 SEX 1 Female SEX
6 2 Male SEX
7 SEGMENT 1 A SEGMENT
8 2 B SEGMENT
9 3 C SEGMENT
10 4 D SEGMENT
11 5 E SEGMENT
大声思考。我假设要实现这一点,我需要用循环创建一个新变量,然后使用这样的逻辑
IF Variable[n]="" THEN Variable_New[n] = Variable[n-1],
ELSE Variable_New[n] = Variable[n]
我熟悉循环,但不知道如何在具有 lag/n-1 功能的 R 中编写这种东西。可能有很多方法可以实现这一点,但最好使用循环。任何帮助将不胜感激。谢谢
解决方案
这里有一个循环方法:
#Data
Variable <- c("AGE","","","","SEX","","SEGMENT","","","","")
Value <- c(1, 2, 3, 4, 1, 2, 1, 2, 3, 4, 5)
Description <- c("18-24","25-34","35-44","45+","Female","Male","A","B","C","D","E")
df <- data.frame(Variable, Value, Description,stringsAsFactors = F)
#Create new column
df$NewVar <- df$Variable
#Loop
for(i in 2:dim(df)[1])
{
df$NewVar[i] <- ifelse(df$NewVar[i]=="",df$NewVar[i-1],df$NewVar[i])
}
输出:
Variable Value Description NewVar
1 AGE 1 18-24 AGE
2 2 25-34 AGE
3 3 35-44 AGE
4 4 45+ AGE
5 SEX 1 Female SEX
6 2 Male SEX
7 SEGMENT 1 A SEGMENT
8 2 B SEGMENT
9 3 C SEGMENT
10 4 D SEGMENT
11 5 E SEGMENT