How do I split strings in R?

Problem description

I have a data frame like this:

  Screen.name party                                   users
1 A_Gloeckner   SPD                          @MartinSchulz.
2 A_Gloeckner   SPD                           @MartinSchulz
3 A_Gloeckner   SPD @ManuelaSchwesig @sigmargabriel @nahles
4  a_grotheer   SPD                           @SouthendRNLI
5  a_grotheer   SPD                           @ribasdiego10
6  a_grotheer   SPD                        @HBBuergerschaft
7  a_grotheer   SPD                             @UniBremen…

I want to split the third column so that each mentioned user gets its own row, like this:

  Screen.name party  mentioned_users
1 A_Gloeckner   SPD   @MartinSchulz.
2 A_Gloeckner   SPD    @MartinSchulz
3 A_Gloeckner   SPD @ManuelaSchwesig
4 A_Gloeckner   SPD   @sigmargabriel
5 A_Gloeckner   SPD          @nahles
6  a_grotheer   SPD    @SouthendRNLI
7  a_grotheer   SPD    @ribasdiego10
8  a_grotheer   SPD @HBBuergerschaft
9  a_grotheer   SPD      @UniBremen…

So far I have tried this:

mention_polits_2017 = mention_polits_2017[, list(mention_polits_2017 = unlist(strsplit(mention_polits_2017, ","))), by = mention_polits_2017$Screen.name]

But it gives me an error:

Error in `[.data.frame`(mention_polits_2017, , list(mention_polits_2017 = unlist(strsplit(mention_polits_2017, : unused argument (by = mention_polits_2017$Screen.name)

Thank you.

Tags: r, string, strsplit

Solution


You can try separate_rows() from tidyr (loaded here via tidyverse):

library(tidyverse)
df %>% 
 separate_rows(users, sep=" ")
  Screen.name party            users
1 A_Gloeckner   SPD   @MartinSchulz.
2 A_Gloeckner   SPD    @MartinSchulz
3 A_Gloeckner   SPD @ManuelaSchwesig
4 A_Gloeckner   SPD   @sigmargabriel
5 A_Gloeckner   SPD          @nahles
6  a_grotheer   SPD    @SouthendRNLI
7  a_grotheer   SPD    @ribasdiego10
8  a_grotheer   SPD @HBBuergerschaft
9  a_grotheer   SPD       @UniBremen
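
As for the error in the question: by is an argument of data.table's [ method, not of [.data.frame, so on a plain data frame R reports it as unused. Two further details look off in the attempt, judging from the sample shown: the mentions appear to be separated by spaces rather than commas, and strsplit() is handed the whole table instead of the users column. Below is a minimal sketch of the same idea in data.table, assuming the package is installed. The small df is a hypothetical stand-in for mention_polits_2017, and mentioned_users is simply the column name from the desired output.

```r
library(data.table)

# Hypothetical sample mirroring the structure shown in the question
df <- data.frame(
  Screen.name = c("A_Gloeckner", "A_Gloeckner", "a_grotheer"),
  party       = c("SPD", "SPD", "SPD"),
  users       = c("@MartinSchulz",
                  "@ManuelaSchwesig @sigmargabriel @nahles",
                  "@SouthendRNLI"),
  stringsAsFactors = FALSE
)

# Convert to a data.table so that `[` understands the `by` argument
dt <- as.data.table(df)

# Within each (Screen.name, party) group, split the users column on single
# spaces and flatten the pieces into one mention per row
result <- dt[, .(mentioned_users = unlist(strsplit(users, " ", fixed = TRUE))),
             by = .(Screen.name, party)]

print(result)
```

Grouping by both Screen.name and party keeps party in the result. Rows that share the same pair are pooled into one group, which does not change the outcome here.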

Data

df <- structure(list(Screen.name = structure(c(1L, 1L, 1L, 2L, 2L, 
                                               2L, 2L), .Label = c("A_Gloeckner", "a_grotheer"), class = "factor"), 
                     party = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = "SPD", class = "factor"), 
                     users = c("@MartinSchulz.", "@MartinSchulz", "@ManuelaSchwesig @sigmargabriel @nahles", 
                               "@SouthendRNLI", "@ribasdiego10", "@HBBuergerschaft", "@UniBremen"
                     )), class = "data.frame", .Names = c("Screen.name", "party", 
                                                          "users"), row.names = c(NA, -7L))

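If you would rather avoid extra packages, the same reshaping can be sketched in base R with strsplit(), which is closer to what the question attempted. This is only an illustration under assumptions: the small df is a hypothetical sample, mentions are assumed to be separated by whitespace, and parts, idx and result are names chosen for this example.

```r
# Hypothetical sample mirroring the structure shown in the question
df <- data.frame(
  Screen.name = c("A_Gloeckner", "A_Gloeckner", "a_grotheer"),
  party       = c("SPD", "SPD", "SPD"),
  users       = c("@MartinSchulz",
                  "@ManuelaSchwesig @sigmargabriel @nahles",
                  "@SouthendRNLI "),
  stringsAsFactors = FALSE
)

# Trim stray leading/trailing spaces, then split each entry on runs of
# whitespace; the result is a list with one character vector per row
parts <- strsplit(trimws(df$users), "\\s+")

# Repeat each original row index once per mention found in that row
idx <- rep(seq_len(nrow(df)), lengths(parts))

# Duplicate the identifying columns and attach the flattened mentions
result <- df[idx, c("Screen.name", "party")]
result$mentioned_users <- unlist(parts)
rownames(result) <- NULL

print(result)
```

A row whose users entry is an empty string produces no pieces and is therefore dropped, so check for that case if such rows should be kept.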