Restricting `pivot_wider` to rows that match a pattern

Question

I want to pivot a column wider based not on all of the values in the column, but only on those that match a pattern.

Some toy data:

df <- data.frame(utterance = c("A and stuff", 
                               "X and something", 
                               "A and some more", 
                               "B etc.", 
                               "B", 
                               "x yz and so on", 
                               "BBB"),
                     timestamp = c("00:05:31.736 - 00:05:35.263", "00:05:31.829 - 00:05:36.449", 
                                   "00:05:31.829 - 00:05:36.449", "00:05:31.829 - 00:05:36.449", 
                                   "00:05:31.842 - 00:05:35.302", "00:05:35.088 - 00:05:36.134", 
                                   "00:05:35.263 - 00:05:53.052"))

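I want to pivot wider only those rows where `utterance` starts with `A` or `B`; all other rows should stay in the `utterance` column. So far I have only managed to pivot wider on all rows: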

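library(dplyr)  # provides %>% and group_by()
library(tidyr)
df %>%
  group_by(timestamp) %>%
  pivot_wider(id_cols = timestamp,  # same rows as the positional -utterance, but named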
              names_from = utterance, 
              values_from = utterance)

# A tibble: 5 x 8
# Groups:   timestamp [5]
  timestamp                   `A and stuff` `X and something` `A and some more` `B etc.` B     `x yz and so on` BBB  
  <chr>                       <chr>         <chr>             <chr>             <chr>    <chr> <chr>            <chr>
1 00:05:31.736 - 00:05:35.263 A and stuff   NA                NA                NA       NA    NA               NA   
2 00:05:31.829 - 00:05:36.449 NA            X and something   A and some more   B etc.   NA    NA               NA   
3 00:05:31.842 - 00:05:35.302 NA            NA                NA                NA       B     NA               NA   
4 00:05:35.088 - 00:05:36.134 NA            NA                NA                NA       NA    x yz and so on   NA   
5 00:05:35.263 - 00:05:53.052 NA            NA                NA                NA       NA    NA               BBB

I tried to subset `utterance` by the pattern, but that throws an error:

df %>%
  group_by(timestamp) %>%
  pivot_wider(names_from = utterance[grepl("^(A|B)", utterance)], 
              values_from = utterance[grepl("^(A|B)", utterance)])
Error: object 'utterance' not found

How can I pivot wider only on the matching rows?

Expected:

# timestamp                      `A`              utterance         `B`   
# <chr>                          <chr>            <chr>             <chr> 
#  00:05:31.736 - 00:05:35.263   A and stuff      NA                NA    
#  00:05:31.829 - 00:05:36.449   A and some more  X and something   B etc.
#  00:05:31.842 - 00:05:35.302   NA               NA                B     
#  00:05:35.088 - 00:05:36.134   NA               x yz and so on    NA    
#  00:05:35.263 - 00:05:53.052   NA               NA                BBB

Tags: r, pattern-matching, tidyr

Solution

You can create a new column that holds the target column names, and pass it to `names_from`:

library(stringr)
library(dplyr)
library(tidyr)

df %>% 
  mutate(pvt = case_when(str_detect(utterance, "^A") ~ "A",
                         str_detect(utterance, "^B") ~ "B",
                         TRUE ~ "utterance")) %>% 
  pivot_wider(names_from = pvt,
              values_from = utterance)

This returns

# A tibble: 5 x 4
  timestamp                   A               utterance       B     
  <chr>                       <chr>           <chr>           <chr> 
1 00:05:31.736 - 00:05:35.263 A and stuff     NA              NA    
2 00:05:31.829 - 00:05:36.449 A and some more X and something B etc.
3 00:05:31.842 - 00:05:35.302 NA              NA              B     
4 00:05:35.088 - 00:05:36.134 NA              x yz and so on  NA    
5 00:05:35.263 - 00:05:53.052 NA              NA              BBB  
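
If more prefixes than `A` and `B` need their own column, writing one `case_when()` branch per prefix gets tedious. A minimal sketch of a variation, not part of the original answer: derive the name directly from the match with `str_extract()` and fall back to `"utterance"` with `coalesce()`. The pattern `"^[AB]"` and the object name `wide` are choices made for this illustration.

```r
library(dplyr)
library(tidyr)
library(stringr)

# Toy data copied from the question.
df <- data.frame(
  utterance = c("A and stuff", "X and something", "A and some more",
                "B etc.", "B", "x yz and so on", "BBB"),
  timestamp = c("00:05:31.736 - 00:05:35.263", "00:05:31.829 - 00:05:36.449",
                "00:05:31.829 - 00:05:36.449", "00:05:31.829 - 00:05:36.449",
                "00:05:31.842 - 00:05:35.302", "00:05:35.088 - 00:05:36.134",
                "00:05:35.263 - 00:05:53.052")
)

wide <- df %>%
  # str_extract() returns the matched prefix ("A" or "B"), or NA if nothing matches;
  # coalesce() replaces those NAs with the fallback column name "utterance".
  mutate(pvt = coalesce(str_extract(utterance, "^[AB]"), "utterance")) %>%
  pivot_wider(names_from = pvt, values_from = utterance)

wide
```

To add another prefix, only the character class in the pattern has to change, for example `"^[ABC]"`.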

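One caveat worth keeping in mind: the toy data never has two utterances with the same prefix under the same timestamp. If that happens, `pivot_wider()` cannot place both values in a single cell and falls back to list-columns with a warning. The sketch below uses hypothetical data (`dup`, with made-up timestamps) and assumes a tidyr version in which `values_fn` accepts a single function; it collapses the clashing values into one string.

```r
library(dplyr)
library(tidyr)
library(stringr)

# Hypothetical data: two "A" utterances share the timestamp "t1".
dup <- data.frame(
  utterance = c("A one", "A two", "B etc."),
  timestamp = c("t1", "t1", "t1")
)

collapsed <- dup %>%
  mutate(pvt = case_when(str_detect(utterance, "^A") ~ "A",
                         str_detect(utterance, "^B") ~ "B",
                         TRUE ~ "utterance")) %>%
  pivot_wider(names_from = pvt,
              values_from = utterance,
              # Join all values that land in the same cell into one string.
              values_fn = function(x) paste(x, collapse = "; "))

collapsed
```

Whether collapsing is the right choice depends on the data; keeping the list-column or adding a row index to the id columns are other options.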