Create a new column based on the sum of specific strings in R

Problem description

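I have a data frame in which the variable "id_number" identifies a specific person, and the other five variables represent tasks that each person has to complete.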

id_number personal_value_statement career_inventory resume_cover linkedin    personal_budget
      <int> <chr>                    <chr>            <chr>        <chr>       <chr>          
1      1438 in progress              not started      completed    completed   in progress    
2      7362 in progress              not started      not started  completed   completed      
3      3239 in progress              not started      completed    in progress not started    
4      1285 in progress              in progress      in progress  not started not started    
5      8945 not started              not started      not started  not started not started    
6      9246 in progress              not started      not started  completed   not started

structure(list(id_number = c(1438L, 7362L, 3239L, 1285L, 8945L, 
9246L), personal_value_statement = c("in progress", "in progress", 
"in progress", "in progress", "not started", "in progress"), 
    career_inventory = c("not started", "not started", "not started", 
    "in progress", "not started", "not started"), resume_cover = c("completed", 
    "not started", "completed", "in progress", "not started", 
    "not started"), linkedin = c("completed", "completed", "in progress", 
    "not started", "not started", "completed"), personal_budget = c("in progress", 
    "completed", "not started", "not started", "not started", 
    "not started")), class = c("rowwise_df", "tbl_df", "tbl", 
"data.frame"), row.names = c(NA, -6L), groups = structure(list(
    .rows = structure(list(1L, 2L, 3L, 4L, 5L, 6L), ptype = integer(0), class = c("vctrs_list_of", 
    "vctrs_vctr", "list"))), row.names = c(NA, -6L), class = c("tbl_df", 
"tbl", "data.frame")))

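I want to create a new column that counts the number of tasks each person has completed. I assume I need to use mutate(), but I am not sure how to sum strings. Basically, I am looking for a column where the value for "id_number 1438" is 2, because that person completed two tasks ("resume_cover" and "linkedin"), and so on for the remaining id_numbers.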

Any and all help is greatly appreciated.

Tags: r, dplyr

Solution


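Use rowSums after ungroup. The data frame is a rowwise_df, so ungrouping first lets the comparison run over the whole data frame in one vectorized step: comparing the task columns to "completed" gives a logical matrix, and rowSums counts the TRUE values in each row.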

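library(dplyr)
df1 <-  df1 %>% 
     # drop the rowwise grouping so the whole data frame is processed at once
     ungroup() %>% 
     # compare every task column to "completed" and count the matches per row
     mutate(Count = rowSums(across(-id_number) == "completed"))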

Output:

df1
# A tibble: 6 × 7
  id_number personal_value_statement career_inventory resume_cover linkedin    personal_budget Count
      <int> <chr>                    <chr>            <chr>        <chr>       <chr>           <dbl>
1      1438 in progress              not started      completed    completed   in progress         2
2      7362 in progress              not started      not started  completed   completed           2
3      3239 in progress              not started      completed    in progress not started         1
4      1285 in progress              in progress      in progress  not started not started         0
5      8945 not started              not started      not started  not started not started         0
6      9246 in progress              not started      not started  completed   not started         1
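
To see why comparing strings and then summing works, here is a minimal base R sketch on a small subset of the question's data (the first three rows and two of the task columns). The names `tasks` and `is_done` are made up for this illustration and do not appear in the original post.

```r
# Subset of the question's data: first three people, two task columns
tasks <- data.frame(
  id_number    = c(1438L, 7362L, 3239L),
  resume_cover = c("completed", "not started", "completed"),
  linkedin     = c("completed", "completed", "in progress")
)

# Comparing the task columns to "completed" yields a logical matrix
# with one TRUE/FALSE per cell (id_number is dropped with -1)
is_done <- tasks[, -1] == "completed"

# rowSums() treats TRUE as 1 and FALSE as 0, so each row sum is the
# number of completed tasks for that person
tasks$Count <- rowSums(is_done)
print(tasks)
```

The dplyr answer above does the same thing, with across(-id_number) selecting the task columns inside mutate(). If the data frame should stay rowwise, something like sum(c_across(-id_number) == "completed") inside mutate() is probably an alternative, though it is likely to be slower on large data.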
