r - Interpolating missing data in a data frame with R
Problem description
I have a data frame similar to the one below:
Country Ccode Year Happiness Power
1 France FR 2000 1000 1000
2 France FR 2001 NA NA
3 France FR 2002 NA NA
4 France FR 2003 1600 2200
5 France FR 2004 NA NA
6 UK UK 2000 1000 1000
7 UK UK 2001 NA NA
8 UK UK 2002 1000 1000
9 UK UK 2003 1000 1000
10 UK UK 2004 1000 1000
I previously used the following code to get the differences:
df <- df %>%
  arrange(country, year) %>% # sort data
  group_by(country) %>%
  mutate_if(is.numeric, funs(d = . - lag(.)))
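I would like to extend this code by calculating the difference between the known data points of Happiness and Power, dividing it by the number of years between those data points, and using the result to compute the values that replace the NAs, giving the following output.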
Country Ccode Year Happiness Power
1 France FR 2000 1000 1000
2 France FR 2001 1200 1400
3 France FR 2002 1400 1800
4 France FR 2003 1600 2200
5 France FR 2004 NA NA
6 UK UK 2000 1000 1000
7 UK UK 2001 0 0
8 UK UK 2002 1000 1000
9 UK UK 2003 1000 1000
10 UK UK 2004 1000 1000
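What would be an efficient way to perform this task?
EDIT: Please note that France 2004 is also NA. The extend function does seem to handle this case correctly.
EDIT 2: Adding group_by(country) seems to mess things up for an unknown reason: the code appears to be trying to convert a character into a numeric, although I don't really understand why. When I convert the column to character, the error becomes an evaluation error. Any suggestions?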
> TRcomplete<-TRcomplete%>%
+ group_by(country) %>%
+ mutate_at(70:73,~na.fill(.x,"extend"))
Error in mutate_impl(.data, dots) :
Column `F116.s` can't be converted from character to numeric
> TRcomplete$F116.s <- as.numeric(TRcomplete$F116.s)
> TRcomplete<-TRcomplete%>%
+ group_by(country) %>%
+ mutate_at(70:73,~na.fill(.x,"extend"))
Error in mutate_impl(.data, dots) :
Column `F116.s` can't be converted from character to numeric
> TRcomplete$F116.s <- as.numeric(as.character(TRcomplete$F116.s))
> TRcomplete<-TRcomplete%>%
+ group_by(country) %>%
+ mutate_at(70:73,~na.fill(.x,"extend"))
Error in mutate_impl(.data, dots) :
Column `F116.s` can't be converted from character to numeric
> TRcomplete$F116.s <- as.character(TRcomplete$F116.s))
Error: unexpected ')' in "TRcomplete$F116.s <- as.character(TRcomplete$F116.s))"
> TRcomplete$F116.s <- as.character(TRcomplete$F116.s)
> str(TRcomplete$F116.s)
chr [1:6984] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA ...
> TRcomplete<-TRcomplete%>%
+ group_by(country) %>%
+ mutate_at(70:73,~na.fill(.x,"extend"))
Error in mutate_impl(.data, dots) :
Evaluation error: need at least two non-NA values to interpolate.
Solution
You can use na.fill from the zoo library with fill="extend":
rapply(df, zoo::na.fill,"integer",fill="extend",how="replace")
Country Ccode Year Happiness Power
1 France FR 2000 1000 1000
2 France FR 2001 1200 1400
3 France FR 2003 1400 1800
4 France FR 2004 1600 2200
5 UK UK 2000 1000 1000
6 UK UK 2001 1000 1000
7 UK UK 2003 1000 1000
8 UK UK 2004 1000 1000
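To see what this kind of filling does to a single column, here is a minimal base R sketch. It is an illustration only: the vectors year and happiness are made up from the France rows of the question, and stats::approx() is used as a stand-in for na.fill, on the assumption that "extend" means linear interpolation for interior gaps and repetition of the nearest observed value at the edges. With rule = 1 the trailing NA stays NA, which is what the expected output in the question shows for France 2004.

```r
# Hypothetical vectors mirroring the France rows of the question
year      <- c(2000, 2001, 2002, 2003, 2004)
happiness <- c(1000, NA, NA, 1600, NA)

# rule = 1: interpolate between known points, leave values outside them as NA
inner_only <- approx(year, happiness, xout = year, rule = 1)$y

# rule = 2: additionally repeat the nearest known value at the edges
extended <- approx(year, happiness, xout = year, rule = 2)$y

print(inner_only)  # 1000 1200 1400 1600 NA
print(extended)    # 1000 1200 1400 1600 1600
```

Because approx() interpolates against year, gaps of more than one year are handled by the same call.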
EDIT:
library(tidyverse)
library(zoo)
df%>%
group_by(Country)%>%
mutate_at(4:5,~na.fill(.x,"extend"))
Country Ccode Year Happiness Power
1 France FR 2000 1000 1000
2 France FR 2001 1200 1400
3 France FR 2003 1400 1800
4 France FR 2004 1600 2200
5 UK UK 2000 1000 1000
6 UK UK 2001 1000 1000
7 UK UK 2003 1000 1000
8 UK UK 2004 1000 1000
If all the elements in a group are NA:
df%>%
group_by(Country)%>%
mutate_if(is.numeric,~if(all(is.na(.x))) NA else na.fill(.x,"extend"))
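The same guard can be sketched without dplyr or zoo. This is a rough base R variant, not the method from the answer above: interp_by_year is a hypothetical helper, the toy data frame (including the all-NA group "XX") is invented for illustration, and stats::approx() does the interpolation. The helper returns a group unchanged when it has fewer than two observed values, since approx() would otherwise stop with "need at least two non-NA values to interpolate".

```r
# Hypothetical helper: linear interpolation of y over x, leaving edge NAs alone.
# Groups with fewer than two observed values are returned unchanged, because
# approx() would stop with "need at least two non-NA values to interpolate".
interp_by_year <- function(x, y) {
  if (sum(!is.na(y)) < 2) return(y)
  approx(x, y, xout = x, rule = 1)$y
}

# Toy data: the France rows from the question plus an all-NA group "XX"
df <- data.frame(
  Country   = rep(c("France", "XX"), each = 5),
  Year      = rep(2000:2004, times = 2),
  Happiness = c(1000, NA, NA, 1600, NA, rep(NA_real_, 5))
)

# Apply the helper group by group and put the pieces back in row order
pieces <- lapply(split(df, df$Country), function(g) {
  g$Happiness <- interp_by_year(g$Year, g$Happiness)
  g
})
result <- unsplit(pieces, df$Country)
print(result)
```

Checking the count of non-NA values rather than all(is.na(.x)) also covers groups that contain a single observation, which cannot be interpolated either.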