java - Java regex: remove two different types of comments from a string
Question
My text has two types of comments: those delimited by % and those that start with /* and end with */. For example:
INPUT1:
Sarah was going out. % Remember she usually doesn't go out % It was very cold.
DESIRED_OUTPUT1:
Sarah was going out. It was very cold.
INPUT2:
Sarah was going out. /* Remember she usually doesn't go out */ It was very cold.
DESIRED_OUTPUT2:
Sarah was going out. It was very cold.
INPUT3:
Charles knocked on the door and a woman opened it. % Hmm, is this good... /* Not sure */ Perhaps this should happen in chapter 10 instead? % She looked at him. - Yes?, she said.
DESIRED_OUTPUT3:
Charles knocked on the door and a woman opened it. She looked at him. - Yes?, she said.
INPUT4:
Charles knocked on the door and a woman opened it. % Hmm, is this good... /* Not sure to 100% */ Perhaps this should happen in chapter 10 instead? % She looked at him. - Yes?, she said.
DESIRED_OUTPUT4:
Charles knocked on the door and a woman opened it. */ Perhaps this should happen in chapter 10 instead?
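Basically, when an opening comment marker is encountered, I want everything to be removed up to its respective closing marker (even if that means removing markers of the other comment type).
If a comment is opened with % or /* but never closed, the comment is assumed to run to the end of the text. However, if only a closing marker */ appears (because its opener was inside another comment and was therefore removed), it should stay in the text.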
Answer
You can use
.replaceAll("%[^%]*%?|/\\*[^*]*(?:\\*(?!/)[^*]*)*(?:\\*/)?","")
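The call above can be tried against the four inputs from the question with a small program along these lines. This is only a sketch: the class name CommentStripper and the method strip are hypothetical names chosen for illustration, and the pattern is compiled once rather than passed to String.replaceAll on every call. Note that the regex removes only the comment itself, so the spaces on either side of a removed comment both remain in the result.

```java
import java.util.regex.Pattern;

// Hypothetical helper class; the name is an assumption made for this sketch.
class CommentStripper {
    // The pattern from the answer, compiled once so it can be reused.
    private static final Pattern COMMENT = Pattern.compile(
            "%[^%]*%?|/\\*[^*]*(?:\\*(?!/)[^*]*)*(?:\\*/)?");

    // Removes both comment types; whitespace around a comment is left untouched.
    static String strip(String text) {
        return COMMENT.matcher(text).replaceAll("");
    }

    public static void main(String[] args) {
        String[] inputs = {
            "Sarah was going out. % Remember she usually doesn't go out % It was very cold.",
            "Sarah was going out. /* Remember she usually doesn't go out */ It was very cold.",
            "Charles knocked on the door and a woman opened it. % Hmm, is this good... /* Not sure */ Perhaps this should happen in chapter 10 instead? % She looked at him. - Yes?, she said.",
            "Charles knocked on the door and a woman opened it. % Hmm, is this good... /* Not sure to 100% */ Perhaps this should happen in chapter 10 instead? % She looked at him. - Yes?, she said."
        };
        for (String input : inputs) {
            // Brackets make leading and trailing whitespace visible.
            System.out.println("[" + strip(input) + "]");
        }
    }
}
```

If single spaces are wanted, as in the desired outputs, a follow-up step such as collapsing runs of whitespace would be needed; that step is not part of the answer's regex.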
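Details
- %[^%]*%? - a %...% like comment with an optional trailing delimiter:
  - % - a % character
  - [^%]* - 0 or more characters other than %
  - %? - an optional % character
- | - or
- /\*[^*]*(?:\*(?!/)[^*]*)*(?:\*/)? - a /*...*/ like comment with an optional trailing delimiter:
  - /\* - a /* string
  - [^*]* - 0 or more characters other than *
  - (?:\*(?!/)[^*]*)* - 0 or more occurrences of:
    - \*(?!/) - a * not followed by /
    - [^*]* - 0 or more characters other than *
  - (?:\*/)? - an optional */ substring.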
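To make the breakdown above easier to follow, the two alternatives can be kept as separate constants and joined with |. The sketch below does that; the class name CommentPatternParts and the constants PERCENT, BLOCK and EITHER are hypothetical names introduced here, and the combined pattern is the same string as in the answer.

```java
import java.util.regex.Pattern;

// Hypothetical class; all names here are assumptions made for illustration.
class CommentPatternParts {
    // % ... % with the closing % optional, so an unclosed comment runs to the end.
    static final String PERCENT = "%[^%]*%?";
    // /* ... */ with the closing */ optional; a lone * inside the body is allowed.
    static final String BLOCK = "/\\*[^*]*(?:\\*(?!/)[^*]*)*(?:\\*/)?";
    // Alternation of the two parts, identical to the regex in the answer.
    static final Pattern EITHER = Pattern.compile(PERCENT + "|" + BLOCK);

    static String strip(String text) {
        return EITHER.matcher(text).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println("[" + strip("a % open forever") + "]");    // unclosed % comment
        System.out.println("[" + strip("a /* x * y */ b") + "]");     // lone * inside /* */
        System.out.println("[" + strip("a /* 100% sure */ b") + "]"); // % inside /* */
        System.out.println("[" + strip("a */ b") + "]");              // stray closer is kept
    }
}
```

Because the regex engine scans left to right, whichever opener appears first decides which alternative consumes the text, which is why a marker of the other type inside a comment is swallowed along with it.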