How to mutate columns in RStudio that share the same pattern in their column names?

Problem description

I have a question that is probably simple, but I have not managed to solve it yet.

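I have a data.frame with many columns whose names follow this structure: NDVI_20180506, NDVI_20180526, NDVI_20180917, NDVI_20180929, NDVI_20181008, NDVI_20181126 ... Whenever several column names share a pattern (for example "NDVI_201805"), I want to create a new column.

For example, a column named NDVI_May that contains the mean of the columns NDVI_20180506 and NDVI_20180526.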

Tags: r, dplyr

Solution


library(dplyr)
library(tidyr)
library(lubridate)

Generate some data:

dat <- tibble(NDVI_20180506 = rnorm(10),
              NDVI_20180526 = rnorm(10),
              NDVI_20180917 = rnorm(10),
              NDVI_20180929 = rnorm(10),
              NDVI_20181008 = rnorm(10),
              NDVI_20181126 = rnorm(10))

dat %>% 
  pivot_longer(everything()) %>% # Turn to long format to manipulate variable names
  separate(name, c("name", "date"), "_") %>% # Separate date from variable name
  mutate(date = ymd(date), # Set to date
         month = format(date, "%B")) %>% # Extract the month's name (depends on the session locale)
  unite(name, name, month, sep = "_") %>% # Merge variable name with month's name
  group_by(name) %>% 
  summarize(value = mean(value)) %>% # Average over all rows for each variable and month
  pivot_wider(names_from = name, values_from = value) # Bring back to wide format

  NDVI_May NDVI_November NDVI_October NDVI_September
     <dbl>         <dbl>        <dbl>          <dbl>
1    0.137         0.258       -0.454         0.0115
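
Note that the pipeline above collapses the data to a single row, because it averages over every row of each month. If the goal is instead a new column per month that holds the mean for each row (which is how the question can also be read), a minimal sketch could look like the following. It assumes all dates fall in the same year, since it groups by month name only; the helper column `row` and the names `monthly` and `result` are introduced here for illustration. `month.name` is used instead of `format(date, "%B")` so the column names do not depend on the locale.

```r
library(dplyr)
library(tidyr)

# Example data with the same column structure as in the question
dat <- tibble(NDVI_20180506 = rnorm(10),
              NDVI_20180526 = rnorm(10),
              NDVI_20180917 = rnorm(10),
              NDVI_20180929 = rnorm(10),
              NDVI_20181008 = rnorm(10),
              NDVI_20181126 = rnorm(10))

monthly <- dat %>%
  mutate(row = row_number()) %>%                     # Remember which row each value came from
  pivot_longer(-row,
               names_to = c("var", "date"),
               names_sep = "_") %>%                  # Split "NDVI_20180506" into "NDVI" and "20180506"
  mutate(month = month.name[as.integer(substr(date, 5, 6))]) %>% # "05" -> "May"
  group_by(row, var, month) %>%
  summarize(value = mean(value), .groups = "drop") %>% # Mean per row, variable and month
  pivot_wider(names_from = c(var, month),
              values_from = value)                   # One column per variable and month

# Append the monthly means to the original columns
result <- bind_cols(dat, select(monthly, -row))
```

Here `result` keeps the ten original rows and gains the columns NDVI_May, NDVI_November, NDVI_October and NDVI_September, so NDVI_May in each row is the mean of NDVI_20180506 and NDVI_20180526 in that row. If the data spans several years, the year would have to be kept in the grouping key as well.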
