How do I remove text patterns that end with a colon in R?

Problem description

I have the following sentence:

review <- c("1a. How long did it take for you to receive a personalized response to an internet or email inquiry made to THIS dealership?: Approx. It was very prompt however. 2f. Consideration of your time and responsiveness to your requests.: Were a little bit pushy but excellent otherwise 2g. Your satisfaction with the process of coming to an agreement on pricing.: Were willing to try to bring the price to a level that was acceptable to me. Please provide any additional comments regarding your recent sales experience.: Abel is awesome! Took care of everything from welcoming me into the dealership to making sure I got the car I wanted (even the color)! ")

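I want to remove everything that comes before each colon (:), that is, the question text, and keep only the answers.

I tried the following code: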

gsub("^[^:]+:","",review)

However, it removes only the first sentence that ends with a colon.

Expected output:

Approx. It was very prompt however. Were a little bit pushy but excellent otherwise Were willing to try to bring the price to a level that was acceptable to me. Abel is awesome! Took care of everything from welcoming me into the dealership to making sure I got the car I wanted (even the color)!

Any help or suggestions would be greatly appreciated. Thank you.

Tags: r, regex, gsub

Solution


If the sentences are not complex and contain no abbreviations, you can use

gsub("(?:\\d+[a-zA-Z]\\.)?[^.?!:]*[?!.]:\\s*", "", review)

See the regex demo.

Note that you can generalize it further by changing \\d+[a-zA-Z] to [0-9a-zA-Z]+ or [[:alnum:]]+ so that it matches 1 or more digits or letters.
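As a rough sketch of that generalization, the snippet below swaps `\\d+[a-zA-Z]` for `[[:alnum:]]+`. The sample string `x` and its labels (`Q12.`, `10ab.`) are made up for illustration and do not come from the original data.

```r
# Hypothetical sample: item labels that are not "digits + one letter"
x <- "Q12. Was the staff helpful?: Yes, very. 10ab. Overall rating.: Good"

# Pattern from the answer: the label must be digits followed by one letter
strict <- "(?:\\d+[a-zA-Z]\\.)?[^.?!:]*[?!.]:\\s*"
# Generalized pattern: the label is any run of letters and digits
loose <- "(?:[[:alnum:]]+\\.)?[^.?!:]*[?!.]:\\s*"

strict_result <- gsub(strict, "", x)  # labels are left behind
loose_result <- gsub(loose, "", x)    # labels are removed with the questions

print(strict_result)
print(loose_result)
```

With the stricter pattern the labels `Q12.` and `10ab.` are not recognized and stay in the result, while the generalized pattern removes them together with the question text.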

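Details

  • (?:\d+[a-zA-Z]\.)? - an optional sequence of
    • \d+ - 1 or more digits
    • [a-zA-Z] - an ASCII letter
    • \. - a dot
  • [^.?!:]* - 0 or more characters other than ., ?, ! and :
  • [?!.] - a ?, ! or .
  • : - a colon
  • \s* - 0 or more whitespace characters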
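To see the breakdown above in action, the following sketch lists the substrings the pattern matches, which are exactly the pieces gsub() deletes. The short string `x` is a hypothetical stand-in for `review`. It also shows why the attempt from the question stops after one removal: its `^` anchor only allows a match at the start of the string.

```r
# Hypothetical short sample in the same "question: answer" shape
x <- "1a. Was it quick?: Yes. 2b. Was it fair.: Mostly"
pattern <- "(?:\\d+[a-zA-Z]\\.)?[^.?!:]*[?!.]:\\s*"

# Every substring the pattern matches, in order of appearance
matched <- regmatches(x, gregexpr(pattern, x))[[1]]
print(matched)

# Removing all matches keeps only the answers
cleaned <- gsub(pattern, "", x)
print(cleaned)

# The anchored attempt from the question can match only once, at the start
anchored <- gsub("^[^:]+:", "", x)
print(anchored)
```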

R test:

> gsub("(?:\\d+[a-zA-Z]\\.)?[^.?!:]*[?!.]:\\s*", "", review)
[1] "Approx. It was very prompt however. Were a little bit pushy but excellent otherwise Were willing to try to bring the price to a level that was acceptable to me.Abel is awesome! Took care of everything from welcoming me into the dealership to making sure I got the car I wanted (even the color)! "
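In the output above, the space between "me." and "Abel" is lost, because the unlabeled question ("Please provide ...") is matched together with the whitespace in front of it. One possible adjustment, offered as a sketch and not as part of the original answer, is to require the question text to start with a non-whitespace character and to let the optional label absorb the space that follows it. This variant assumes `perl = TRUE` so that `\\s` can be used inside the bracket expression; the string `x` is a made-up sample.

```r
# Hypothetical sample: a labeled question followed by an unlabeled one
x <- "1a. Was it quick?: Yes. Any other comments.: None"

# Pattern from the answer: the match may begin at the space before "Any"
original <- "(?:\\d+[a-zA-Z]\\.)?[^.?!:]*[?!.]:\\s*"
# Variant (an assumption, not from the source): the question text must
# begin with a non-whitespace character, so the preceding space is kept
adjusted <- "(?:\\d+[a-zA-Z]\\.\\s*)?[^.?!:\\s][^.?!:]*[?!.]:\\s*"

original_result <- gsub(original, "", x)
# trimws() additionally drops leading or trailing whitespace, if any
adjusted_result <- trimws(gsub(adjusted, "", x, perl = TRUE))

print(original_result)
print(adjusted_result)
```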

Extending it to handle abbreviations

You can enumerate the exceptions by adding an alternation:

gsub("(?:\\d+[a-zA-Z]\\.)?(?:i\\.?e\\.|[^.?!:])*[?!.]:\\s*", "", review)     
                          ^^^^^^^^^^^^^^^^^^^^^^ 

Here, (?:i\.?e\.|[^.?!:])* matches 0 or more occurrences of either an ie. or i.e. substring, or any character other than ., ?, ! and :.

See this demo.
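The effect of the added alternation can be checked with a small comparison. The string `x` below is a hypothetical example containing "i.e." inside a question. Passing `perl = TRUE` is a choice made here to pin the regex engine to PCRE; the calls above omit it.

```r
# Hypothetical sample: the first question contains the abbreviation "i.e."
x <- "3c. Extras, i.e. mats and covers, offered?: No. 4d. Fair price.: Yes"

plain <- "(?:\\d+[a-zA-Z]\\.)?[^.?!:]*[?!.]:\\s*"
with_abbrev <- "(?:\\d+[a-zA-Z]\\.)?(?:i\\.?e\\.|[^.?!:])*[?!.]:\\s*"

# Without the exception, the dots in "i.e." break the question apart,
# so part of the question is left in the result
plain_result <- gsub(plain, "", x, perl = TRUE)
# With the exception, "i.e." counts as ordinary question text
abbrev_result <- gsub(with_abbrev, "", x, perl = TRUE)

print(plain_result)
print(abbrev_result)
```

Other abbreviations would need their own alternatives in the group, for example an `e\\.?g\\.` branch next to `i\\.?e\\.`.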

