Converting character months to integers in R in a data.table with missing rows

Question

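I have a month column that was imported into a data.table as character type, with values such as "January", "March", and so on. The column also contains some missing values (NA).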

I am using the following code to convert it to integer months:

dt <- dt[!is.na(month), month := match(month, month.abb)]

I get this warning in the console:

Warning message:
In `[.data.table`(dt, !is.na(month), `:=`(month,  :
  Coerced integer RHS to character to match the type of the target column (column 9 named 'month'). If the target column's type character is correct, it's best for efficiency to avoid the coercion and create the RHS as type character. To achieve that consider R's type postfix: typeof(0L) vs typeof(0), and typeof(NA) vs typeof(NA_integer_) vs typeof(NA_real_). You can wrap the RHS with as.character() to avoid this warning, but that will still perform the coercion. If the target column's type is not correct, it's best to revisit where the DT was created and fix the column type there; e.g., by using colClasses= in fread(). Otherwise, you can change the column type now by plonking a new column (of the desired type) over the top of it; e.g. DT[, `month`:=as.integer(`month`)]. If the RHS of := has nrow(DT) elements then the assignment is called a column plonk and is the way to change a column's type. Column types can be observed with sapply(DT,typeof).

In addition, every value in the month column becomes NA. Any ideas? Thank you very much.

The table looks like this:

month    |year  |
September| 1987 |
March    | 1999 |

The expected result is:

month    |year  |
  9      | 1987 |
  3      | 1999 |

The final change, which worked:

dt[!is.na(month), month := match(month, month.name)]

Tags: r, data.table

Answer


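Dropping the is.na() filter should work: assigning to the whole column converts it to integer, and match() returns NA when no match is found, so missing months stay NA: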

dt[, month := match(month, month.name)]  # month.name holds full names; month.abb holds "Jan", "Feb", ...
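As a quick check of how match() behaves with R's two built-in month vectors, here is a minimal base R sketch. The sample values are made up to resemble the question's table, and the variable names are only illustrative. month.abb holds three-letter abbreviations, so full names such as "September" find no match there, which would explain a column of NA values; month.name holds the full names.

```r
# Hypothetical sample values, modelled on the table in the question
months <- c("September", "March", NA)

# month.abb holds "Jan", "Feb", ...; full month names find no match
abb_idx <- match(months, month.abb)    # NA NA NA

# month.name holds "January", "February", ...; full names match by position
name_idx <- match(months, month.name)  # 9 3 NA

print(abb_idx)
print(name_idx)
```

Because match() already returns NA for a missing input, filtering the NA rows out beforehand is not needed.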

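I believe the data.table jargon for replacing a whole column like this when updating by reference is a "plonk", the term the warning message itself uses.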
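The difference between assigning to a subset of rows and replacing the whole column can be sketched as follows. This is an illustration under assumptions rather than the asker's actual data: the table contents and the names dt_sub and dt_full are hypothetical, and whether the sub-assignment emits a coercion warning may depend on the installed data.table version, so it is wrapped in suppressWarnings().

```r
library(data.table)

# Hypothetical sample data modelled on the question's table, plus one missing month
dt <- data.table(
  month = c("September", "March", NA),
  year  = c(1987L, 1999L, 2001L)
)

# Sub-assignment: only the rows selected by i are written, so the column keeps
# its character type and the integer result is coerced to fit it.
dt_sub <- copy(dt)
suppressWarnings(
  dt_sub[!is.na(month), month := match(month, month.name)]
)
sub_type <- typeof(dt_sub$month)

# Whole-column assignment: the RHS has nrow(dt) elements, so the column is
# replaced outright and takes the integer type; NA months stay NA.
dt_full <- copy(dt)
dt_full[, month := match(month, month.name)]
full_type <- typeof(dt_full$month)

print(sapply(dt_sub, typeof))
print(sapply(dt_full, typeof))
```

As the warning text suggests, sapply(DT, typeof) is a convenient way to confirm the column types after either form of assignment.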

