How to restructure my dataset in R by separating the year into its own column

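Problem description

I have this dataset

[image: GDP growth rate with the year embedded in the GDP column names]

But I would like to have a dataset like this, using R

[image: GDP growth rate with the year in a separate column]

Tags: r, dplyr, base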

Solution

We can use pivot_longer for this

library(tidyr)
library(dplyr)
pivot_longer(df1, cols = starts_with("GDP"), names_to = c(".value", "Year"),
   names_pattern = "([^\\d]+)(\\d+)") %>%
   rename(`Growth rate` = GDP_GR)

-output

# A tibble: 4 × 4
  `Country Name` `Country Code` Year  `Growth rate`
  <chr>          <chr>          <chr>         <dbl>
1 Afghanistan    AFG            2011             NA
2 Afghanistan    AFG            2012     1234143668
3 Albania        ALB            2011     2703864872
4 Albania        ALB            2012    -4429023858
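
To see why this call produces a column named GDP_GR that then has to be renamed, it can help to look at what the two capture groups in names_pattern match. The following is a rough base R sketch of that matching, not what tidyr runs internally; `nms`, `parts`, `value_col` and `year` are names made up for illustration.

```r
# Column names from df1 that are being pivoted
nms <- c("GDP_GR2011", "GDP_GR2012")

# Same pattern as in the pivot_longer() call above;
# perl = TRUE so that \d inside the bracket expression means "digit"
parts <- regmatches(nms, regexec("([^\\d]+)(\\d+)", nms, perl = TRUE))

# In each match, element 1 is the full match,
# element 2 the first group, element 3 the second group
value_col <- vapply(parts, `[`, character(1), 2)
year      <- vapply(parts, `[`, character(1), 3)

print(value_col)
print(year)
```

The first group is paired with ".value" in names_to, so its text (GDP_GR) becomes the name of the value column, while the second group fills Year.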

Or without rename

pivot_longer(df1, cols = starts_with("GDP"), names_to = "Year", 
     values_to = "Growth rate", names_pattern = "\\D+(\\d+)")

-output

# A tibble: 4 × 4
  `Country Name` `Country Code` Year  `Growth rate`
  <chr>          <chr>          <chr>         <dbl>
1 Afghanistan    AFG            2011             NA
2 Afghanistan    AFG            2012     1234143668
3 Albania        ALB            2011     2703864872
4 Albania        ALB            2012    -4429023858
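
In both outputs Year is a character column. If a numeric year is preferred, one possible variant is sketched below. It assumes tidyr 1.1.0 or later (for names_transform) and strips the fixed prefix with names_prefix instead of a capture group. The data frame is rebuilt inline only so the snippet runs on its own, and `long` is a name chosen for illustration.

```r
library(tidyr)

# Same example data as df1, rebuilt so this snippet is self-contained
df1 <- data.frame(
  `Country Name` = c("Afghanistan", "Albania"),
  `Country Code` = c("AFG", "ALB"),
  GDP_GR2011 = c(NA, 2703864872),
  GDP_GR2012 = c(1234143668, -4429023858),
  check.names = FALSE  # keep the spaces in the column names
)

long <- pivot_longer(
  df1,
  cols = starts_with("GDP_GR"),
  names_to = "Year",
  names_prefix = "GDP_GR",                    # drop the leading "GDP_GR"
  names_transform = list(Year = as.integer),  # "2011" -> 2011L
  values_to = "Growth rate"
)

str(long)
```

This only works as written when every pivoted column shares the same prefix; with mixed prefixes the names_pattern approach above is the safer choice.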

data

df1 <- structure(list(`Country Name` = c("Afghanistan", "Albania"),
    `Country Code` = c("AFG", "ALB"),
    GDP_GR2011 = c(NA, 2703864872),
    GDP_GR2012 = c(1234143668, -4429023858)),
    class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, -2L))
