Breaking a string into individual fields in R

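Problem description

I have the string

She was the youngest of the two daughters of a most affectionate

I would like to turn it into a vector like the following

she was the youngest, and so on

If possible, I would like to use stringr.

Thank you.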

Tags: r, nlp, stringr

Solution

Any of the following will work:

scan(text=charv, what = character())
 [1] "She"          "was"          "the"          "youngest"     "of"           "the"         
 [7] "two"          "daughters"    "of"           "a"            "most"         "affectionate"

or

unlist(strsplit(charv,' '))

 [1] "She"          "was"          "the"          "youngest"     "of"           "the"         
 [7] "two"          "daughters"    "of"           "a"            "most"         "affectionate"
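Splitting on a single literal space works for the example string because its words are separated by exactly one space. As a minimal sketch, assuming an input with repeated spaces or tabs (the `messy` string below is hypothetical and not part of the question), a whitespace regular expression may be more robust:

```r
# Hypothetical input with a double space and a tab (not from the question)
messy <- "She  was the\tyoungest"

# Splitting on one literal space leaves an empty string and keeps the tab
single_space <- unlist(strsplit(messy, " "))

# Splitting on one or more whitespace characters gives clean words
any_whitespace <- unlist(strsplit(messy, "\\s+"))

print(single_space)
print(any_whitespace)
```

Note that a string starting with whitespace would still produce a leading empty element with this pattern, so trimming it first with `trimws()` may be worthwhile.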

or

read.table(text=gsub(' ','\n',charv))
             V1
1           She
2           was
3           the
4      youngest
5            of
6           the
7           two
8     daughters
9            of
10            a
11         most
12 affectionate

or

unlist(regmatches(charv,gregexpr('\\w+',charv)))
 [1] "She"          "was"          "the"          "youngest"     "of"           "the"         
 [7] "two"          "daughters"    "of"           "a"            "most"         "affectionate"

where:

charv <- 'She was the youngest of the two daughters of a most affectionate'

Edit: using stringr, either of the following works:

library(stringr)
str_extract_all(charv, '\\w+')   # extract every run of word characters
str_split(charv, " ")            # split on single spaces
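Both stringr calls return a list with one element per input string, not a bare character vector. The sketch below shows one way to get the plain vector the question asks for. It assumes stringr is installed, and the lowercasing step is only a guess at what the question's lowercase "she" was meant to convey.

```r
library(stringr)

charv <- 'She was the youngest of the two daughters of a most affectionate'

# Take the first (and only) list element to obtain a character vector
words_extract <- str_extract_all(charv, "\\w+")[[1]]
words_split <- str_split(charv, " ")[[1]]

# Optional: lowercase every word (assumption about the desired output)
words_lower <- str_to_lower(words_extract)

print(words_extract)
print(words_lower)
```

For a vector of several input strings, `unlist()` would flatten all results into one vector, while indexing with `[[i]]` keeps the words of each string separate.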
