In R, split a text column on periods and map each row to the same ID

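Problem description

I have a data frame with two columns, ID and Text. The text must be split on periods, with each resulting sentence mapped to the same ID.

Example: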

| ID  | Text |
| 112 | india is highly populated. Delhi is capital of india |
| 113 | Tiger is wild animal.It lives in forest |
| 114 | sky is high |

The expected result is:

| ID  | Text |
| 112 | india is highly populated |
| 112 | Delhi is capital of india |
| 113 | Tiger is wild animal |
| 113 | It lives in forest |
| 114 | sky is high |

Can you tell me how to do this in R? Thanks in advance.

Tags: r, text, split, aggregate

Solution


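We can use separate_rows from tidyr. The separator pattern "\\.\\s*" matches a literal period followed by any whitespace, so each sentence becomes its own row and keeps the ID of the row it came from.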

library(tidyr)
separate_rows(df1, 'Text', sep="\\.\\s*")

Output

# A tibble: 4 x 2
#     ID Text                     
#  <dbl> <chr>                    
#1   112 india is highly populated
#2   112 Delhi is capital of india
#3   113 Tiger is wild animal     
#4   113 lt livess in forest      

Data

df1 <- structure(list(ID = c(112, 113), Text = c("india is highly populated. Delhi is capital of india", 
"Tiger is wild animal.lt livess in forest")), class = "data.frame", row.names = c(NA, 
-2L))
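
For readers who would rather not depend on tidyr, the same split can be sketched in base R with strsplit. This is only an illustrative variant, not part of the original answer: the names `parts` and `result` are made up here, the ID 114 row is taken from the question's table rather than from the answer's data, and "It lives" uses the spelling from the question's expected result.

```r
# Sample data based on the question, including the ID 114 row (no period).
df1 <- data.frame(
  ID = c(112, 113, 114),
  Text = c("india is highly populated. Delhi is capital of india",
           "Tiger is wild animal.It lives in forest",
           " sky is high"),
  stringsAsFactors = FALSE
)

# Split each text on a literal period plus any whitespace that follows it.
parts <- strsplit(df1$Text, "\\.\\s*")

# Repeat each ID once per piece so every sentence keeps its original ID,
# and trim stray leading/trailing whitespace from the pieces.
result <- data.frame(
  ID = rep(df1$ID, lengths(parts)),
  Text = trimws(unlist(parts)),
  stringsAsFactors = FALSE
)

# Drop any empty pieces (for example, from a text that starts with a period).
result <- result[nzchar(result$Text), ]
rownames(result) <- NULL

print(result)
```

With this input, `result` has five rows: two for ID 112, two for ID 113, and one for ID 114, whose text contains no period and is therefore kept whole.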
