c++ - Boost regex to match an IF statement
Problem description
I need to write a Boost regex that matches the following string and splits it into three tokens, one per argument of the IF block:
=IF(ISNUMBER(SEARCH("Windows",GETWORKSPACE(1))),ON.TIME(NOW()+"00:00:02","abcdef"),CLOSE(TRUE))
Ideally, these should come out as
token1 = "ISNUMBER(SEARCH("Windows",GETWORKSPACE(1)))"
token2 = "ON.TIME(NOW()+"00:00:02","abcdef")"
token3 = "CLOSE(TRUE)"
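I initially wrote a simple regex, "(?<=\=IF\()(.*),(.*),(.*)(?=\))", but it gives incorrect tokens because the greedy quantifier takes too much for the first token. I am currently getting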
token1 = "ISNUMBER(SEARCH("Windows",GETWORKSPACE(1))),ON.TIME(NOW()+"00:00:02""
token2 = ""abcdef")"
token3 = "CLOSE(TRUE)"
I also tried "(?<=\\=IF\\()([A-Za-z(),:\"]*?),([A-Za-z(),.:\"]*?),([A-Z(),:\"]*?)(?=\\))"
with no luck. Can someone suggest a regex?
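The greedy behaviour described in the question can be reproduced with a minimal sketch. It rests on two assumptions that are not in the original post: it uses std::regex in place of Boost.Regex, and it drops the lookbehind and lookahead (the ECMAScript grammar of std::regex has no lookbehind) and matches the literal =IF( and ) instead. greedy_split is a hypothetical helper name.

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: applies the greedy pattern from the question,
// rewritten without lookaround so that std::regex accepts it.
std::vector<std::string> greedy_split(std::string const& input) {
    static std::regex const re(R"(=IF\((.*),(.*),(.*)\))");
    std::smatch m;
    if (!std::regex_match(input, m, re))
        return {};
    assert(m.size() == 4); // whole match plus three capture groups
    // Each .* first grabs as much as it can and only gives characters back
    // until the rest of the pattern matches, so the first group runs up to
    // the third-to-last comma regardless of bracket nesting.
    return { m[1].str(), m[2].str(), m[3].str() };
}
```

Applied to the formula from the question, this returns the same three incorrect tokens reported there: a regex of this shape has no notion of which commas sit inside nested parentheses or string literals.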
Solution
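You need a simple parser.
Here's my favourite Boost swiss-army knife for quick parsers: Boost Spirit (X3 in the demo below).
I've created a very flexible "token" grammar that honours (nested) brackets and double-quoted string literals (potentially with embedded escaped quotes and brackets):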
token = raw [ *(
        '(' >> -token_list >> ')'
      | '[' >> -token_list >> ']'
      | '{' >> -token_list >> '}'
      | string_literal
      | lexeme[ + ~char_(")]}([{\"',") ]
    ) ];
where token_list and string_literal are defined as
string_literal = lexeme [
        '"' >> *('\\' >> char_ | ~char_('"')) >> '"'
    ];
token_list = token % ',';
Now the parser expression for an =IF(condition, true_part, false_part) is simple:
if_expr
    = '=' >> no_case["if"]
    >> '(' >> token >> ',' >> token >> ',' >> token >> ')';
For fun, I made the IF keyword case insensitive.
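For readers without Boost at hand, the idea behind the grammar (treat a comma as a separator only when it sits outside all brackets and outside string literals) can be sketched in plain C++. This is an illustration under stated assumptions, not the Spirit parser itself: split_top_level and if_arguments are hypothetical names, the prefix check is case sensitive, whitespace is not skipped, and bracket pairs are counted but not checked for matching kinds.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: splits `args` on commas that are outside brackets
// and outside double-quoted string literals.
std::vector<std::string> split_top_level(std::string const& args) {
    std::vector<std::string> parts;
    std::string current;
    int depth = 0;          // current bracket nesting level
    bool in_string = false; // true while inside a "..." literal

    for (std::size_t i = 0; i < args.size(); ++i) {
        char const c = args[i];
        if (in_string) {
            current += c;
            if (c == '\\' && i + 1 < args.size())
                current += args[++i]; // keep the escaped character verbatim
            else if (c == '"')
                in_string = false;
        } else if (c == '"') {
            in_string = true;
            current += c;
        } else if (c == '(' || c == '[' || c == '{') {
            ++depth;
            current += c;
        } else if (c == ')' || c == ']' || c == '}') {
            --depth;
            current += c;
        } else if (c == ',' && depth == 0) {
            parts.push_back(current); // a top-level comma ends an argument
            current.clear();
        } else {
            current += c;
        }
    }
    parts.push_back(current);
    assert(!parts.empty()); // there is always at least one (possibly empty) part
    return parts;
}

// Hypothetical wrapper: strips the "=IF(" prefix and the final ")",
// then splits what is left. Returns an empty vector if the shape is wrong.
std::vector<std::string> if_arguments(std::string const& formula) {
    std::string const prefix = "=IF(";
    if (formula.size() < prefix.size() + 1 ||
        formula.compare(0, prefix.size(), prefix) != 0 ||
        formula.back() != ')')
        return {};
    return split_top_level(
        formula.substr(prefix.size(), formula.size() - prefix.size() - 1));
}
```

Calling if_arguments on the formula from the question gives the three tokens the question asks for. The Spirit version remains the more robust choice, since it also skips blanks, accepts any letter case for IF, and fails cleanly on malformed input.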
Demo
//#define BOOST_SPIRIT_X3_DEBUG
#include <boost/spirit/home/x3.hpp>
#include <boost/fusion/adapted/std_tuple.hpp>
#include <iostream>
#include <iomanip>

namespace x3 = boost::spirit::x3;

namespace parser {
    using namespace x3;

    static rule<struct token_, std::string> const token = "token";

    static auto const string_literal = lexeme [
        '"' >> *('\\' >> char_ | ~char_('"')) >> '"'
    ];

    static auto const token_list = token % ',';

    static auto const token_def = raw [ *(
            '(' >> -token_list >> ')'
          | '[' >> -token_list >> ']'
          | '{' >> -token_list >> '}'
          | string_literal
          | +~char_(")]}([{\"',") // glue together everything else
        ) ];

    BOOST_SPIRIT_DEFINE(token)

    static auto const if_expr
        = '=' >> no_case["if"]
        >> '(' >> token >> ',' >> token >> ',' >> token >> ')';
}

int main() {
    for (std::string const& input : {
            R"(=IF(ISNUMBER,ON.TIME,CLOSE))",
            R"(=IF(ISNUMBER(SEARCH("Windows")),ON.TIME(NOW()+"00:00:02","abcdef"),CLOSE(TRUE)))",
            R"(=IF(ISNUMBER(SEARCH("Windows",GETWORKSPACE(1))),ON.TIME(NOW()+"00:00:02","abcdef"),CLOSE(TRUE)))",
            " = if( isnumber, on .time, close ) ",
            R"( = if( "foo, bar", if( isnumber, on .time, close ), IF("[ISN(UM}B\"ER")) )",
        })
    {
        auto f = input.begin(), l = input.end();
        std::cout << "=== " << std::quoted(input) << ":\n";

        std::string condition, true_part, false_part;
        auto attr = std::tie(condition, true_part, false_part);

        if (phrase_parse(f, l, parser::if_expr, x3::blank, attr)) {
            std::cout << "Parsed: \n"
                << " - condition: " << std::quoted(condition) << "\n"
                << " - true_part: " << std::quoted(true_part) << "\n"
                << " - false_part: " << std::quoted(false_part) << "\n";
        } else {
            std::cout << "Parse failed\n";
        }

        if (f!=l) {
            std::cout << "Remaining unparsed: " << std::quoted(std::string(f,l)) << "\n";
        }
    }
}
Prints
=== "=IF(ISNUMBER,ON.TIME,CLOSE)":
Parsed:
- condition: "ISNUMBER"
- true_part: "ON.TIME"
- false_part: "CLOSE"
=== "=IF(ISNUMBER(SEARCH(\"Windows\")),ON.TIME(NOW()+\"00:00:02\",\"abcdef\"),CLOSE(TRUE))":
Parsed:
- condition: "ISNUMBER(SEARCH(\"Windows\"))"
- true_part: "ON.TIME(NOW()+\"00:00:02\",\"abcdef\")"
- false_part: "CLOSE(TRUE)"
=== "=IF(ISNUMBER(SEARCH(\"Windows\",GETWORKSPACE(1))),ON.TIME(NOW()+\"00:00:02\",\"abcdef\"),CLOSE(TRUE))":
Parsed:
- condition: "ISNUMBER(SEARCH(\"Windows\",GETWORKSPACE(1)))"
- true_part: "ON.TIME(NOW()+\"00:00:02\",\"abcdef\")"
- false_part: "CLOSE(TRUE)"
=== " = if( isnumber, on .time, close ) ":
Parsed:
- condition: "isnumber"
- true_part: "on .time"
- false_part: "close "
=== " = if( \"foo, bar\", if( isnumber, on .time, close ), IF(\"[ISN(UM}B\\\"ER\")) ":
Parsed:
- condition: "\"foo, bar\""
- true_part: "if( isnumber, on .time, close )"
- false_part: "IF(\"[ISN(UM}B\\\"ER\")"