regex - How do I selectively capture a value with a regular expression?
Question
I know this has been asked before, but I can't seem to find a solution.
Here is the test string:
value: value1, Do not include this
value: value2
Here is my regex: value: (.*)(?:, Do not include this)?
The result should capture:
value1
value2
But instead, it captures this:
value1, Do not include this
value2
[Edit] Based on the comments and answers, let me clarify.
If this is the test string:
value: value1, Do not include this
value: value1, test,
value: man, this is bad!!, Do not include this
then the captured values should be:
value1
value1, test, test,
man, this is bad!!
Answer
value: (.*)(?:, Do not include this)?
       ----~~~~~~~~~~~~~~~~~~~~~~~~~~
        A              B
The problem with your expression is that part A is allowed to match the whole line and part B is optional. The regex engine, upon encountering the greedy part A, consumes every character up to the end of the line it is currently matching against. Having matched A, it advances to part B, sees that B cannot be matched (because the whole line has already been consumed) and that B is optional, and, since this is the end of the expression, declares the match successful without ever backtracking.
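The behavior described above can be reproduced with a minimal sketch. Python's `re` module is an assumption here, since the question does not name a regex flavor, and the variable names are illustrative only:

```python
import re

# The pattern from the question: greedy part A, optional part B.
pattern = re.compile(r"value: (.*)(?:, Do not include this)?")

lines = [
    "value: value1, Do not include this",
    "value: value2",
]

# Part A consumes the rest of each line, so the optional part B
# matches nothing and the unwanted suffix ends up in group 1.
greedy_captures = [pattern.match(line).group(1) for line in lines]
print(greedy_captures)
# ['value1, Do not include this', 'value2']
```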
One way to prevent this from happening would be to make part A lazy while forcing the expression to match the whole line by using an end-of-line anchor. For example:
value: (.*?)(?:, Do not include this)?$
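As a rough check of the lazy, anchored pattern against the clarified test strings from the question, here is a sketch that again assumes Python's `re` module. In that flavor `$` matches only at the end of the string by default, so `re.MULTILINE` is added to make it match at the end of every line:

```python
import re

text = (
    "value: value1, Do not include this\n"
    "value: value1, test,\n"
    "value: man, this is bad!!, Do not include this"
)

# Lazy part A grows one character at a time until the optional
# part B followed by the end-of-line anchor can match.
lazy = re.compile(r"value: (.*?)(?:, Do not include this)?$", re.MULTILINE)

lazy_captures = lazy.findall(text)
print(lazy_captures)
# ['value1', 'value1, test,', 'man, this is bad!!']
```

Without the anchor, the lazy quantifier would be satisfied with an empty match, which is why the two changes have to go together.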
You could also make parts A and B so distinct from each other that you don't have to worry about one matching in place of the other. If applicable, this would allow you to keep the greedy quantifier for part A. For example:
value: ([^,]*)(?:, Do not include this)?
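The "if applicable" caveat matters: a negated character class stops at the first comma, so it only works when the wanted value never contains one. A small sketch (Python's `re` module assumed, names illustrative) run against the strings from the question shows both the working and the truncated cases:

```python
import re

negated = re.compile(r"value: ([^,]*)(?:, Do not include this)?")

samples = [
    "value: value1, Do not include this",
    "value: value2",
    "value: value1, test,",
    "value: man, this is bad!!, Do not include this",
]

# [^,]* is greedy but can never run past a comma, so part A cannot
# swallow part B. It also cannot capture values that contain commas.
negated_captures = [negated.match(s).group(1) for s in samples]
print(negated_captures)
# ['value1', 'value2', 'value1', 'man']
```

The last two results are cut short at the first comma, so for the clarified test strings in the question the lazy, anchored variant is the better fit.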
Which way is more suitable to your needs depends on the composition of the strings you match against.