Basic function to remove 0 values from a df based on a specific column

Problem description

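I am trying to write a very basic function that removes rows with values of 0 (or less) from a df, based on a specific column of that df. When I run these lines outside a function they work, but when I run them inside a function I get this error: "Error in $<-.data.frame(*tmp*, name, value = numeric(0)) : replacement has 0 rows". Does anyone know what the problem is?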

Remove_Missing=function(x,name){
  x$name=as.numeric(x$name)
  x=x[x$name>0,]
}

Edit:
Example code:

#First two lines work but those same two lines won't work if function is called
merged_data$name=as.numeric(merged_data$HETENURE)
merged_data=merged_data[merged_data$HETENURE>0,]
Remove_Missing(merged_data, HETENURE) #Call function

Data

structure(list(HRHHID = c("008906910993941", "008906910993941", 
"648061954059610", "160916068405549", "160916068405549", "168069009100998"
), HRYEAR4 = c("2010", "2010", "2010", "2010", "2010", "2010"
), HETENURE = c(" 1", " 1", " 3", " 1", " 1", " 1"), HEFAMINC = c("11", 
"11", "10", "13", "13", "14"), HRNUMHOU = c(" 2", " 2", " 1", 
" 2", " 2", " 3"), GESTFIPS = c("01", "01", "01", "01", "01", 
"01"), GTMETSTA = c("2", "2", "1", "1", "1", "1"), PEMARITL = c(" 1", 
" 1", " 4", " 1", " 1", " 1"), PESEX = c(" 2", " 1", " 1", " 2", 
" 1", " 2"), PEEDUCA = c("40", "45", "40", "42", "41", "39"), 
    PTDTRACE = c(" 1", " 1", " 1", " 1", " 1", " 1"), PEHSPNON = c(" 2", 
    " 2", " 2", " 2", " 2", " 2"), PEMLR = c(" 5", " 5", " 5", 
    " 1", " 1", " 7"), PRFTLF = c("-1", "-1", "-1", " 1", " 1", 
    "-1"), PRHRUSL = c("-1", "-1", "-1", " 4", " 4", "-1"), HESP1 = c("-1", 
    "-1", "-1", "-1", "-1", "-1"), HESP6 = c("-1", "-1", "-1", 
    "-1", "-1", "-1"), HESP7A = c("-1", "-1", "-1", "-1", "-1", 
    "-1"), HESP8 = c("-1", "-1", "-1", "-1", "-1", "-1"), HRFS12M1 = c(" 1", " 1", " 1", " 1", " 1", " 1")), row.names = c(9L, 10L, 11L, 
12L, 13L, 15L), class = "data.frame")

Tags: r, function

Solution


There are two problems here, plus one weakness that enables them:

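  1. You define your function with function(x, name) and then try to refer to a specific column as x$name, which should fail. If name is meant to identify a column (via standard evaluation), then it should be a string, and $ does not work that way: x$name always looks for a column literally called name. Use x[[name]] instead (see "The difference between bracket [ ] and double bracket [[ ]] for accessing the elements of a list or dataframe").

    However, because of the next two problems, this was not reported as an error (though it should have been).

  2. You call your function as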

    Remove_Missing(merged_data, HETENURE)
    

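    but since you are not attempting non-standard evaluation (NSE), passing HETENURE unquoted is wrong. What should happen is that, when name is referenced inside your function, R looks for an object named HETENURE, does not find one, and fails with Error: object 'HETENURE' not found. I think what you should be doing is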

    Remove_Missing(merged_data, "HETENURE")
    
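  3. Less an error than a weakness that lets the other errors go unnoticed: you assigned merged_data$name <- as.numeric(...), so inside your function, where x$name was meant to refer to x$HETENURE and should have failed, it instead found a column in your data that is literally named name (and so the function argument name was never referenced or used).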
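
To make the first point concrete, here is a minimal sketch using a small made-up data frame (df and col are illustrative names, not taken from the question). It shows that $ takes the name after it literally, while [[ evaluates its argument, and how the zero-length result of the former leads to an assignment error of the kind reported in the question.

```r
# Toy data frame for illustration only (not the question's data)
df <- data.frame(HETENURE = c(" 1", " 3", "-1"), stringsAsFactors = FALSE)
col <- "HETENURE"

# `$` does not evaluate `col`; it looks for a column literally named "col"
via_dollar <- df$col            # NULL, because no such column exists

# `[[` evaluates `col` and extracts the column named "HETENURE"
via_brackets <- df[[col]]

# as.numeric(NULL) is a zero-length vector; assigning it to a column of a
# data frame that has rows is an error, which we capture here
zero_len <- as.numeric(via_dollar)
err <- tryCatch({
  df$col <- zero_len
  NULL
}, error = function(e) conditionMessage(e))
```

After running this, err holds the message raised by the failed assignment instead of stopping the script.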

First, let's remove the tempting hidden bug, the column named name:

merged_data$name <- NULL

Second, fix the function:

Remove_Missing = function(x, name) {
  x[[name]] = as.numeric(x[[name]])
  x[x[[name]] > 0,]
}

Third, fix the call and get the returned data:

Remove_Missing(merged_data, "HETENURE")
#             HRHHID HRYEAR4 HETENURE HEFAMINC HRNUMHOU GESTFIPS GTMETSTA PEMARITL PESEX PEEDUCA PTDTRACE PEHSPNON PEMLR PRFTLF PRHRUSL HESP1 HESP6 HESP7A HESP8 HRFS12M1
# 9  008906910993941    2010        1       11        2       01        2        1     2      40        1        2     5     -1      -1    -1    -1     -1    -1        1
# 10 008906910993941    2010        1       11        2       01        2        1     1      45        1        2     5     -1      -1    -1    -1     -1    -1        1
# 11 648061954059610    2010        3       10        1       01        1        4     1      40        1        2     5     -1      -1    -1    -1     -1    -1        1
# 12 160916068405549    2010        1       13        2       01        1        1     2      42        1        2     1      1       4    -1    -1     -1    -1        1
# 13 160916068405549    2010        1       13        2       01        1        1     1      41        1        2     1      1       4    -1    -1     -1    -1        1
# 15 168069009100998    2010        1       14        3       01        1        1     2      39        1        2     7     -1      -1    -1    -1     -1    -1        1

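Of course, in this case nothing is filtered out (because all of your data pass the condition), so if I temporarily change the condition in the function to > 1, we can see a difference: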

Remove_Missing = function(x, name) {
  x[[name]] = as.numeric(x[[name]])
  x[x[[name]] > 1,]
}
Remove_Missing(merged_data, "HETENURE")
#             HRHHID HRYEAR4 HETENURE HEFAMINC HRNUMHOU GESTFIPS GTMETSTA PEMARITL PESEX PEEDUCA PTDTRACE PEHSPNON PEMLR PRFTLF PRHRUSL HESP1 HESP6 HESP7A HESP8 HRFS12M1
# 11 648061954059610    2010        3       10        1       01        1        4     1      40        1        2     5     -1      -1    -1    -1     -1    -1        1
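
One caveat may be worth a sketch: if the column contains strings that as.numeric() cannot parse, they become NA, and logical indexing with NA produces rows filled with NA rather than dropping them. The variant below is only an illustration under that assumption; the function name remove_nonpositive, the input check, and the toy data are hypothetical and not part of the original answer.

```r
# Hypothetical variant of Remove_Missing; the name and the NA handling
# are assumptions added for illustration
remove_nonpositive <- function(x, name) {
  # Fail early if `name` is not a single existing column name
  stopifnot(is.character(name), length(name) == 1, name %in% names(x))
  # Unparseable strings become NA; suppress the coercion warning
  vals <- suppressWarnings(as.numeric(x[[name]]))
  x[[name]] <- vals
  # Keep only rows known to be strictly positive; drop = FALSE keeps a
  # data frame even if a single column remains
  x[!is.na(vals) & vals > 0, , drop = FALSE]
}

# Toy data (not the question's data): one negative and one unparseable value
toy <- data.frame(id = 1:4,
                  HETENURE = c(" 1", "-1", "n/a", " 3"),
                  stringsAsFactors = FALSE)
result <- remove_nonpositive(toy, "HETENURE")
```

As in the fixed function above, the column name is passed as a string and the filtered data frame is returned, so the caller still needs to assign the result (for example, merged_data <- remove_nonpositive(merged_data, "HETENURE")).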
