How to remove an entire sentence if it contains a string

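Problem description

I need to remove an entire sentence from a string if it contains a pattern. Here the pattern is "Link" or "link": if it occurs in the string, the whole sentence containing it must be removed.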

#include <regex>
#include <string>

std::string subject = "This is previous sentence. This can be any sentences. Link 2.1.19.3 [Example]. This is can be any other sentence. This is next sentence.";

std::string removeRedundantString(std::string subject)
{
    std::string removeSee = subject;
    std::smatch match;

    // .*$ is greedy: it runs from "Link" to the end of the string
    std::regex redundantSee("(Link.*$)");

    if (std::regex_search(subject, match, redundantSee))
    {
        removeSee = std::regex_replace(subject, redundantSee, "");
    }
    return removeSee;
}

Expected output:

This is previous sentence. This can be any sentences.This is can be any other sentence. This is next sentence.

Actual output:

This is previous sentence. This can be any sentences.

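The actual output above is produced because the regular expression "(Link.*$)" removes everything from "Link" to the end of the string, not just one sentence. I cannot work out which regular expression gives the expected output. These are the test cases I need to cover:

Test case 1: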

std::string subject = "Note this is second pattern, Ops that next the scheduler; link the amount for the full list of docs. The number of value varies from 0 to 4.";

Output: Note this is second pattern, Ops that next the scheduler;The number of value varies from 0 to 4.

Test case 2:

std::string subject = "This is another pattern. (Link Doc::78::hello::Core::mount). Since this patern includes non-numeric value.";

Output: This is another pattern.Since this patern includes non-numeric value.

Any help would be greatly appreciated.

Tags: c++, regex, c++11

Solution


I would recommend

std::regex redundantSee(R"(\W*\b[Ll]ink\b(?:\d+(?:\.\d+)*|[^.])*[.?!])");

Note the raw string literal syntax R"(...)": the pattern can be placed inside it, in place of the ..., without any extra escaping of backslashes.
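To make the escaping point concrete, here is a small sketch that writes one short pattern both ways. The helper names escapedPattern and rawPattern are made up for this illustration, and the pattern is only a fragment of the full expression.

```cpp
#include <string>

// Ordinary string literal: each backslash intended for the regex engine
// has to be doubled so that the compiler leaves a single one in the string.
inline std::string escapedPattern()
{
    return "\\b[Ll]ink\\b";
}

// Raw string literal: the characters between R"( and )" are taken verbatim,
// so the pattern reads exactly as the regex engine will see it.
inline std::string rawPattern()
{
    return R"(\b[Ll]ink\b)";
}
```

Both helpers return the same 11-character string, so either form can be passed to std::regex; the raw form is simply easier to read.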

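Regex details

  • \W* - zero or more non-word characters
  • \b - a word boundary
  • [Ll]ink - the word Link or link
  • \b - a word boundary
  • (?:\d+(?:\.\d+)*|[^.])* - zero or more occurrences of either
    • \d+(?:\.\d+)* - one or more digits followed by zero or more sequences of a . and one or more digits
    • | - or
    • [^.] - any character other than a .
  • [.?!] - a ., ? or ! character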
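The pieces above can be put together roughly as follows. This is a minimal sketch, not a definitive implementation: the pattern is the one recommended in the answer, while the function name removeLinkSentences is an assumption chosen for illustration.

```cpp
#include <regex>
#include <string>

// Removes every sentence that contains the whole word "Link" or "link".
// The function name is hypothetical; the pattern comes from the answer.
std::string removeLinkSentences(const std::string& subject)
{
    // Compiled once and reused, since building a std::regex is comparatively costly.
    static const std::regex redundantSee(
        R"(\W*\b[Ll]ink\b(?:\d+(?:\.\d+)*|[^.])*[.?!])");

    // regex_replace returns the input unchanged when nothing matches,
    // so a separate regex_search check is not required.
    return std::regex_replace(subject, redundantSee, "");
}
```

Called on the first subject from the question, this removes the "Link 2.1.19.3 [Example]." sentence and keeps the sentences after it. Because the pattern starts with \W*, the punctuation and whitespace directly before the word (for example the period that ends the previous sentence) are consumed as well, so the result can differ slightly from the expected output shown in the question.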
