Regex does not compile because of "unclosed character class" in a named capture group

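Problem description

I am getting "error: unclosed character class" in a Rust regular expression. The pattern works fine in an online regex tester that uses PCRE-compatible syntax, but it fails on the Rust Playground with the regex crate.

The character class has to include a minus sign. I have tried putting the minus sign in the first position, in the last position, and leaving it out altogether, but I always get the error.

For most expected inputs, the source string will be "op(number)" for some operation and some non-negative integer. For a few, I expect "op(number/number/number)".

If there is a better way to extract the named captures, I am all ears.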

use lazy_static::lazy_static;
use regex::Regex;

fn main() {
    lazy_static! {
        static ref FANCY_OPCODE_RE: Regex = Regex::new(r"(?x)
            ^                              # Match start of string
            (?P<opname>[-a-zA-Z#+]+)       # Match abbreviated name of OpCode as 'opname'
            \(                             # Open parentheses
            (?P<arg1>[0-9]+)               # Match first number as 'arg1'
            (/                             # Delimiter
            (?P<arg2>[0-9]+)               # Optionally match second number as 'arg2'
            /                              # Delimiter
            (?P<arg3>[0-9]+))?             # Optionally match third number as 'arg3'
            \)                             # Closing parenthesis
            $                              # Match end of string
        ").unwrap();
    }
    let s = "+loop(3)";
    let opname: String; 
    let arg1: String;
    let arg2: String;
    let arg3: String;
    match FANCY_OPCODE_RE.captures(s) {
        Some(cap) => { 
            opname = format!("{:?}", cap.name("opname")); 
            arg1 = format!("{:?}", cap.name("arg1"));
            arg2 = format!("{:?}", cap.name("arg2"));
            arg3 = format!("{:?}", cap.name("arg3"));
        },
        None => { 
            opname = "No match".to_string(); 
            arg1 = String::new();
            arg2 = String::new();
            arg3 = String::new();
        }
    }

    println!("opname = {}, arg1 = {}, arg2 = {}, arg3 = {}", opname, arg1, arg2, arg3);
}

Here is the error message:

thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Syntax(
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
regex parse error:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 1: (?x)
 2:             ^                              # Match start of string
 3:             (?P<opname>[-a-zA-Z#+]+)       # Match abbreviated name of OpCode as 'opname'
                           ^^
 4:             \(                             # Open parentheses
 5:             (?P<arg1>[0-9]+)               # Match first number as 'arg1'
 6:             (/                             # Delimiter
 7:             (?P<arg2>[0-9]+)               # Optionally match second number as 'arg2'
 8:             /                              # Delimiter
 9:             (?P<arg3>[0-9]+))?             # Optionally match third number as 'arg3'
10:             \)                             # Closing parenthesis
11:             $                              # Match end of string
12:         
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
error: unclosed character class
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
)', src/main.rs:17:12

Tags: regex, rust, named-captures

Solution


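When debugging a problem, it is useful to create a minimal, reproducible example. By removing the parts of the regex that do not cause the problem, you can quickly reduce it to: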

Regex::new(r"(?x)(?P<opname>[-a-zA-Z#+]+)").unwrap();

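The problem is that your character class contains #, which is the comment character. With the (?x) flag, the regex crate treats an unescaped # as the start of a comment even inside a character class, so the closing ] is swallowed by the comment and the class is never closed. PCRE treats # inside a character class as a literal, which is why the online tester accepted the pattern. Escape it: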

[-a-zA-Z\#+]
