Rename columns by pasting the first row if it is not NA

Question

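I want to standardize how my organization cleans exports from Survey Monkey. Wherever the first row is not NA, I want to rename the column to (column name + first-row value).

Edit: Ideally this would be implemented in a function/loop so that it can handle data frames of different sizes without editing any parameters.

Reprex: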

library(tibble)  # provides tribble()

df <- tribble(
  ~`Which of these choices do you like`, ~`...1`, ~`...2`, ~`...3`, ~`Respondent ID`, ~`Different Text`, ~`...4`,
  'Fruit', 'Drink', 'Dessert', 'Snack', NA, 'Pizza Topping', 'Pizza Style',
  'Apple', 'Water', 'Pie', 'Oreos', 1234, 'Mushroom', 'Deep Dish',
  'Apple', 'Coffee', 'Cake', 'Granola', 1235, 'Onion', 'NY Style',
  'Banana', 'Coffee', 'Pie', 'Oreos', 1236, 'Mushroom', 'NY Style',
  'Pear', 'Vodka', 'Pie', 'Granola', 1237, 'Onion', 'Deep Dish'
)

Once the columns are renamed, I will drop the first row and get on with my life.

Ideally, my df would look like this:

(image of the desired data frame)

Thanks for any guidance!

Tags: r, dplyr, rename, surveymonkey

Solution


In base R, we can use paste0 and then remove the first row

names(df)[1:4] <- paste0(names(df)[1], unlist(df[1, 1:4]))
df <- df[-1, ]

Or use sprintf, which also wraps the first-row value in parentheses

names(df)[1:4] <- sprintf("%s (%s)", names(df)[1], unlist(df[1, 1:4]))
df <- df[-1,]
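
The two snippets above hard-code columns 1:4. As a rough sketch of how the same idea might be wrapped up for data frames of any size, the base R helper below renames every column whose first-row value is not NA. The function name `rename_from_first_row` and its `placeholder` argument are hypothetical, not part of the original answer. It also assumes that unnamed export columns carry auto-generated names of the form `...N`, as in the reprex.

```r
# Hypothetical helper (not from the original answer): rename columns from the
# first row wherever that row is not NA, then drop the first row.
rename_from_first_row <- function(df, placeholder = "^\\.\\.\\.[0-9]+$") {
  old <- names(df)

  # First-row values as character; NA stays NA
  first_row <- vapply(df[1, , drop = FALSE],
                      function(x) as.character(x)[1],
                      character(1), USE.NAMES = FALSE)

  # Assumption: names like "...1" are placeholders belonging to the
  # nearest real column name on their left
  is_placeholder <- grepl(placeholder, old)
  grp <- cumsum(!is_placeholder)

  # Stem = the real name that opens each group; a leading placeholder
  # with no real name to its left keeps its own name
  stem <- old
  has_stem <- grp > 0
  stem[has_stem] <- old[!is_placeholder][grp[has_stem]]

  # Only columns with a non-NA first row are renamed
  to_rename <- !is.na(first_row)
  new <- old
  new[to_rename] <- sprintf("%s (%s)", stem[to_rename], first_row[to_rename])

  names(df) <- new
  df[-1, , drop = FALSE]
}

# Small stand-in for a Survey Monkey export (a subset of the reprex)
raw <- data.frame(
  a = c("Fruit", "Apple", "Banana"),
  b = c("Drink", "Water", "Coffee"),
  c = c(NA, 1234, 1236),
  d = c("Pizza Topping", "Mushroom", "Mushroom"),
  e = c("Pizza Style", "Deep Dish", "NY Style"),
  stringsAsFactors = FALSE
)
names(raw) <- c("Which of these choices do you like", "...1",
                "Respondent ID", "Different Text", "...2")

cleaned <- rename_from_first_row(raw)
names(cleaned)
```

Because the helper looks at the names and the first row instead of fixed positions, it should not need editing when the number of questions changes, provided the `...N` naming assumption holds for the export.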

If we want to do this by checking for NA elements

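library(dplyr)
library(tidyr)
library(purrr)
library(stringr)
# Build a lookup table of old name -> new name from the first row
keydat <- df %>%
          slice(1) %>% 
          # keep only the columns whose first-row value is not NA
          select_if(negate(is.na)) %>%
          pivot_longer(everything()) %>%
          # a name that does not start with "..." opens a new group
          group_by(grp = cumsum(!startsWith(name, "..."))) %>% 
          mutate(value = sprintf("%s (%s)", first(name), value)) %>% 
          ungroup %>% 
          select(-grp)


# Apply the new names, then drop the first row
df <- df %>%
        rename_at(vars(keydat$name), ~ keydat$value) %>%
        slice(-1)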

df
# A tibble: 4 x 7
#  `Which of these… `Which of these… `Which of these… `Which of these… `Respondent ID`
#  <chr>            <chr>            <chr>            <chr>                      <dbl>
#1 Apple            Water            Pie              Oreos                       1234
#2 Apple            Coffee           Cake             Granola                     1235
#3 Banana           Coffee           Pie              Oreos                       1236
#4 Pear             Vodka            Pie              Granola                     1237
# … with 2 more variables: `Different Text (Pizza Topping)` <chr>, `Different Text (Pizza
#   Style)` <chr>

names(df)
#[1] "Which of these choices do you like (Fruit)"   "Which of these choices do you like (Drink)"  
#[3] "Which of these choices do you like (Dessert)" "Which of these choices do you like (Snack)"  
#[5] "Respondent ID"                                "Different Text (Pizza Topping)"              
#[7] "Different Text (Pizza Style)"      
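
The grouping step in the pipeline above relies on `cumsum(!startsWith(name, "..."))`. As a small illustration, the sketch below applies it to a plain character vector. The vector `nms` is a hand-written stand-in for the `name` column of `keydat`, that is, the columns left after the all-NA `Respondent ID` column has been dropped.

```r
# Stand-in for the `name` column produced by pivot_longer() in the pipeline
nms <- c("Which of these choices do you like", "...1", "...2", "...3",
         "Different Text", "...4")

# TRUE wherever a real question name appears, FALSE for "...N" placeholders
is_new_question <- !startsWith(nms, "...")

# The running count of TRUEs gives every column the id of the question
# it belongs to
grp <- cumsum(is_new_question)
grp
# 1 1 1 1 2 2

# The first name in each group is the stem used for the new column names
stems <- nms[is_new_question][grp]
stems
```

Each placeholder column inherits the id of the nearest real name to its left, which is why `first(name)` inside `group_by()` yields the question text for every column in the group.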
