R: Modify dataframes to the same length

Problem description

I've got a list containing multiple dataframes, each with two columns (Year and Area).
The problem is that some dataframes only contain information from 2002-2015 or 2003-2017, others from 2001-2018, and so on. So they differ in length.

list:

list(
  data.frame(Year = c(2001, 2002, 2004, 2005), Area = c(1, 2, 3, 4)),
  data.frame(Year = c(2001, 2004, 2018), Area = c(1, 2, 4)),
  data.frame(Year = c(2008, 2009, 2014, 2015, 2016), Area = c(1, 2, 3, 4, 5))
)

How can I modify them all to the same length (2001-2018) by adding NA, or better 0, for Area when there is no area information for that year?

Tags: r, list, dataframe

Solution


Let

A = data.frame(Year= c(2001,2002,2004,2005), Area=c(1,2,3,4)) 
B = data.frame(Year= c(2001,2004,2018), Area=c(1,2,4)) 
C = list(A, B) 

Then we have

Ref = data.frame(Year = 2001:2018)
# by = "Year" names the join key explicitly and avoids the "Joining with by" message
New.List = lapply(C, function(x) dplyr::left_join(Ref, x, by = "Year"))

with the desired result

[[1]]
   Year Area
1  2001    1
2  2002    2
3  2003   NA
4  2004    3
5  2005    4
6  2006   NA
7  2007   NA
8  2008   NA
9  2009   NA
10 2010   NA
11 2011   NA
12 2012   NA
13 2013   NA
14 2014   NA
15 2015   NA
16 2016   NA
17 2017   NA
18 2018   NA

[[2]]
   Year Area
1  2001    1
2  2002   NA
3  2003   NA
4  2004    2
5  2005   NA
6  2006   NA
7  2007   NA
8  2008   NA
9  2009   NA
10 2010   NA
11 2011   NA
12 2012   NA
13 2013   NA
14 2014   NA
15 2015   NA
16 2016   NA
17 2017   NA
18 2018    4
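
The question also asked for 0 rather than NA in the years without data. The following is a minimal sketch of one way to do that, assuming base R's merge() with all.x = TRUE as a stand-in for dplyr::left_join so that no package is needed. The names pad_with_zero and Zero.List are hypothetical and only used for illustration.

```r
# Example data from the answer above
A <- data.frame(Year = c(2001, 2002, 2004, 2005), Area = c(1, 2, 3, 4))
B <- data.frame(Year = c(2001, 2004, 2018), Area = c(1, 2, 4))
C <- list(A, B)

# Reference frame holding every year that should appear in the result
Ref <- data.frame(Year = 2001:2018)

# pad_with_zero is a hypothetical helper, not part of any package
pad_with_zero <- function(x) {
  # all.x = TRUE keeps every row of Ref, like a left join
  out <- merge(Ref, x, by = "Year", all.x = TRUE)
  # years without a match have Area = NA; replace those with 0
  out$Area[is.na(out$Area)] <- 0
  out
}

Zero.List <- lapply(C, pad_with_zero)
```

Note that this also turns any NA already present in the original Area column into 0, which may or may not be what you want.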

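To make sure that all data.frames in the list share the same spelling of Year before joining, do

C = lapply(C, function(x) {colnames(x)[1] = "Year"; x})

This assumes the first column is always the Year column. The result has to be assigned back to C, because lapply returns a new list and does not modify C in place.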
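
As a small illustration of the renaming step, here is a sketch on made-up input in which the year column is spelled differently in each data frame. The list D and the check has_year are hypothetical and not part of the original answer.

```r
# Hypothetical input: the year column is spelled differently in each data frame
D <- list(
  data.frame(year = c(2001, 2002), Area = c(1, 2)),
  data.frame(YEAR = c(2008, 2009, 2014), Area = c(1, 2, 3))
)

# Rename the first column by position, assuming it always holds the year
D <- lapply(D, function(x) { colnames(x)[1] <- "Year"; x })

# Check that every data frame now has a column named exactly "Year"
has_year <- vapply(D, function(x) "Year" %in% colnames(x), logical(1))
stopifnot(all(has_year))
```

Renaming by position is simple but silently mislabels the data if some data frame has Area first, so a check on the column contents may be worth adding in real use.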

