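I'm new to Spirit.

I'm trying to build an AST from a simple "Excel" formula using Spirit X3. The grammar supports the usual operators (+, -, *, /), functions (myfunc(myparam1, myparam2)) and cell references (e.g. A1, AA234).

So an example expression to parse might be A1 + sin(A2+3).

The problem is that the xlreference rule below never matches, because the xlfunction rule takes precedence and that rule does not backtrack. I have experimented with expect, but I lack good examples to get it working.

I guess this leads to another question: what is the best way to debug X3? I have seen the BOOST_SPIRIT_X3_DEBUG define, but I can't find any example demonstrating its use. I also wrote an on_error method on expression_class, but that doesn't provide a good trace. I have tried position tagging and the with directive, but that doesn't give enough information either.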


x3::rule<class xlreference, ast::xlreference> const   xlreference{"xlreference"};
auto const xlreference_def = +alpha  > x3::uint_ > !x3::expect[char('(')];

struct identifier_class;
typedef x3::rule<identifier_class, std::string> identifier_type;
identifier_type const identifier = "identifier";
auto const identifier_def = x3::char_("a-zA-Z") > *(x3::char_("a-zA-Z") | x3::char_('_')) > !x3::expect[char('0-9')];

auto const expression_def = // constadditive_expr_def 
    term [print_action()]
    >> *(   (char_('+') > term)
        |   (char_('-') > term)
        )
    ;

x3::rule<xlfunction_class, ast::xlfunction> const xlfunction("xlfunction");
auto const xlfunction_def = identifier > '(' > *(expression > *(',' > expression)) > ')';


auto const term_def = //constmultiplicative_expr_def 
    factor
    >> *(   (char_('*') > factor) 
        |   (char_('/') > factor) 
        )
    ;

auto const factor_def = // constunary_expr_def 
    xlfunction [print_action()]
    |   '(' > expression > ')' 
    |   (char_('-') > factor) 
    |   (char_('+') > factor) 
    | x3::double_  [print_action()] | xlreference [print_action()]
    ;


    struct expression_class //: x3::annotate_on_success
    {
        //  Our error handler
        template <typename Iterator, typename Exception, typename Context>
        x3::error_handler_result
        on_error(Iterator& q, Iterator const& last, Exception const& x, Context const& context)
        {
            std::cout
                << "Error! Expecting: "
                << x.which()
                << " here: \""
                << std::string(x.where(), last)
                << "\""
                << std::endl
                ;
            return x3::error_handler_result::fail;
        }
    };


client::ast::program ast;
bool r = phrase_parse(iter, (iterator_type const ) str.end(), parser, x3::space, ast);
if (!r) {
    std::cout << "failed:" << str << "\n";
}

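Okay, reviewing this step by step, and making up AST types along the way (because you didn't care to show them).

  1. This is invalid code: char('0-9'). Is that supposed to be a multi-character literal? Enable your compiler warnings! You probably meant x3::char_("0-9") (two important differences!).

  2. !x3::expect[] is contradictory. You can never pass that condition, because ! asserts that the lookahead does NOT match, whereas expect[] REQUIRES a match. So, best case, ! fails because the expect[]-ation matched. Worst case, expect[] throws an exception, because you asked it to.

  3. operator > is already an expectation point. For the same reason as before, > !p is a contradiction. Make it >> !p.

  4. Replace char_("0-9") with x3::digit.

  5. Replace char_("a-zA-Z") with x3::alpha.

  6. Some (many) rules need to be lexemes. That is because you invoke the grammar in a skipper context (phrase_parse with x3::space). Your identifiers will silently eat whitespace, because you didn't make them lexemes. See "Boost spirit skipper issues".

  7. Negative lookahead assertions don't expose attributes, so ! char_('(') could (should?) be ! lit('(').

  8. The presence of semantic actions (by default) suppresses attribute propagation, so print_action() will cause attribute propagation to stop.

  9. Expectation points (operator>) by definition cannot backtrack. That is what makes them expectation points.

  10. Use the list operator instead of composing it from a Kleene star: p >> *(',' >> p) becomes p % ','.

  11. That extra Kleene star is bogus. Did you mean to make the argument list optional? That is -(expression % ',').

  12. The chained-operator rules make it a bit cumbersome to get the AST right.

  13. Simplify

    factor >> *((x3::char_('*') > factor) //
               | (x3::char_('/') > factor));

    to just factor >> *(x3::char_("*/") >> factor);

  14. factor_def logically matches what is an expression.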


auto const xlreference_def =
    x3::lexeme[+x3::alpha >> x3::uint_] >> !x3::char_('(');

auto const identifier_def =
    x3::raw[x3::lexeme[x3::alpha >> *(x3::alpha | '_') >> !x3::digit]];

auto const xlfunction_def = identifier >> '(' >> -(expression % ',') >> ')';

auto const term_def = factor >> *(x3::char_("*/") >> factor);

auto const factor_def = xlfunction //
    | '(' >> expression >> ')'     //
    | x3::double_                  //
    | xlreference;

auto const expression_def = term >> *(x3::char_("-+") >> term);


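A few more observations:

  1. (iterator_type const)str.end()?? Never use C-style casts. In fact, just use str.cend(), or indeed str.end() if str is suitably const anyway.

  2. phrase_parse: consider NOT making the skipper the caller's decision, since it is logically part of your grammar.

  3. Many kinds of Excel expressions are not parsed: A:A, $A4, B$4, all cell ranges; I think R1C1 is often supported as well.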


So, with ample experience, I'm going to Crystal Ball™ some AST®:

namespace client::ast {
    using identifier = std::string;
    //using identifier = boost::iterator_range<std::string::const_iterator>;
    struct string_literal  : std::string {
        using std::string::string;
        using std::string::operator=;

        friend std::ostream& operator<<(std::ostream& os, string_literal const& sl) {
            return os << std::quoted(sl) ;
        }
    };

    struct xlreference {
        std::string colname;
        size_t      rownum;
    };

    struct xlfunction; // fwd
    struct binary_op;  // fwd

    using expression = boost::variant<        //
        double,                               //
        string_literal,                       //
        identifier,                           //
        xlreference,                          //
        boost::recursive_wrapper<xlfunction>, //
        boost::recursive_wrapper<binary_op>   //
        >;

    struct xlfunction{
        identifier              name;
        std::vector<expression> args;

        friend std::ostream& operator<<(std::ostream& os, xlfunction const& xlf)
        {
            os << xlf.name << "(";
            char const* sep = "";
            for (auto& arg : xlf.args)
                os << std::exchange(sep, ", ") << arg;
            return os;
        }
    };

    struct binary_op {
        struct chained_t {
            char       op;
            expression e;
        };
        expression             lhs;
        std::vector<chained_t> chained;

        friend std::ostream& operator<<(std::ostream& os, binary_op const& bop)
        {
            os << "(" << bop.lhs;
            for (auto& rhs : bop.chained)
                os << rhs.op << rhs.e;
            return os << ")";
        }
    };

    using program = expression;
    using boost::fusion::operator<<;
} // namespace client::ast


BOOST_FUSION_ADAPT_STRUCT(client::ast::xlfunction, name, args)
BOOST_FUSION_ADAPT_STRUCT(client::ast::xlreference, colname, rownum)
BOOST_FUSION_ADAPT_STRUCT(client::ast::binary_op, lhs, chained)
BOOST_FUSION_ADAPT_STRUCT(client::ast::binary_op::chained_t, op, e)


x3::rule<struct identifier_class, ast::identifier>  const identifier{"identifier"};
x3::rule<struct xlreference,      ast::xlreference> const xlreference{"xlreference"};
x3::rule<struct xlfunction_class, ast::xlfunction>  const xlfunction{"xlfunction"};
x3::rule<struct factor_class,     ast::expression>  const factor{"factor"};
x3::rule<struct expression_class, ast::binary_op>   const expression{"expression"};
x3::rule<struct term_class,       ast::binary_op>   const term{"term"};


auto const xlreference_def =
    x3::lexeme[+x3::alpha >> x3::uint_] /*>> !x3::char_('(')*/;

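Note that the lookahead assertion (!) doesn't actually change the parse result: no cell reference is a valid identifier anyway, so the () would simply remain unparsed.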

auto const identifier_def =
    x3::raw[x3::lexeme[x3::alpha >> *(x3::alpha | '_') /*>> !x3::digit*/]];


I threw in a string literal, because any Excel clone would have one:

auto const string_literal =
    x3::rule<struct _, ast::string_literal>{"string_literal"} //
= x3::lexeme['"' > *('\\' >> x3::char_ | ~x3::char_('"')) > '"'];



auto const factor_def =        //
    xlfunction                 //
    | '(' >> expression >> ')' //
    | x3::double_              //
    | string_literal           //
    | xlreference              //
    | identifier               //
    ;


auto const term_def       = factor >> *(x3::char_("*/") >>  factor);
auto const expression_def = term >> *(x3::char_("-+") >> term);
auto const xlfunction_def = identifier >> '(' >> -(expression % ',') >> ')';

These map directly onto the AST.


'Nuff said. Now comes some cargo cult: remnants of code that was not shown and is unused, which I'll mostly just accept and ignore here:

int main() {
    std::vector<int> positions; // TODO

    auto parser = x3::with<struct position_cache_tag /*TODO*/>        //
        (std::ref(positions))                                         //
            [                                                         //
                x3::skip(x3::space)[                                  //
                    client::calculator_grammar::expression >> x3::eoi //
    ]                                                                 //
            ];



struct {
    std::string              category;
    std::vector<std::string> cases;
} test_table[] = {
    {"xlreference", {"A1", "A1111", "AbCdZ9876543", "i9", "i0"}},
    {"identifier", {"i", "id", "id_entifier"}},
    {"number", {"123", "inf", "-inf", "NaN", ".99e34", "1e-8", "1.e-8", "+9"}},
    {"binaries",
     {                                                       //
      "3+4", "3*4",                                          //
      "3+4+5", "3*4*5", "3+4*5", "3*4+5", "3*4+5",           //
      "3+(4+5)", "3*(4*5)", "3+(4*5)", "3*(4+5)", "3*(4+5)", //
      "(3+4)+5", "(3*4)*5", "(3+4)*5", "(3*4)+5", "(3*4)+5"}},
    {"xlfunction",
     {
         "pi()",
         "sin(4)",
         R"--(IIF(A1, "Red", "Green"))--",
     }},
    {"invalid",
     {
         "A9()", // an xlreference may not be followed by ()
         "",     // you didn't specify
     }},
    {"other",
     {
         "A-9",    // 1-letter identifier and binary operation
         "1 + +9", // unary plus accepted in number rule
     }},
    {"question",
     {
         "myfunc(myparam1, myparam2)",
         "A1",
         "AA234",
         "A1 + sin(A2+3)",
     }},
};


for (auto& [cat, cases] : test_table) {
    for (std::string const& str : cases) {
        auto iter = begin(str), last(end(str));
        std::cout << std::setw(12) << cat << ": ";

        client::ast::program ast;
        if (parse(iter, last, parser, ast)) {
            std::cout << "parsed: " << ast;
        } else {
            std::cout << "failed: " << std::quoted(str);
        }

        if (iter == last) {
            std::cout << "\n";
        } else {
            std::cout << " unparsed: "
                      << std::quoted(std::string_view(iter, last)) << "\n";
        }
    }
}



Complete listing:

//#define BOOST_SPIRIT_X3_DEBUG
#include <boost/fusion/adapted.hpp>
#include <boost/fusion/include/io.hpp>
#include <boost/spirit/home/x3.hpp>
#include <boost/spirit/home/x3/support/ast/position_tagged.hpp>
#include <boost/spirit/home/x3/support/utility/annotate_on_success.hpp>
#include <functional>
#include <iostream>
#include <iomanip>
#include <map>
#include <string>
#include <string_view>
#include <utility>
#include <vector>
namespace x3 = boost::spirit::x3;

namespace client::ast {
    using identifier = std::string;
    //using identifier = boost::iterator_range<std::string::const_iterator>;
    struct string_literal  : std::string {
        using std::string::string;
        using std::string::operator=;

        friend std::ostream& operator<<(std::ostream& os, string_literal const& sl) {
            return os << std::quoted(sl) ;
        }
    };

    struct xlreference {
        std::string colname;
        size_t      rownum;
    };

    struct xlfunction; // fwd
    struct binary_op;  // fwd

    using expression = boost::variant<        //
        double,                               //
        string_literal,                       //
        identifier,                           //
        xlreference,                          //
        boost::recursive_wrapper<xlfunction>, //
        boost::recursive_wrapper<binary_op>   //
        >;

    struct xlfunction{
        identifier              name;
        std::vector<expression> args;

        friend std::ostream& operator<<(std::ostream& os, xlfunction const& xlf)
        {
            os << xlf.name << "(";
            char const* sep = "";
            for (auto& arg : xlf.args)
                os << std::exchange(sep, ", ") << arg;
            return os;
        }
    };

    struct binary_op {
        struct chained_t {
            char       op;
            expression e;
        };
        expression             lhs;
        std::vector<chained_t> chained;

        friend std::ostream& operator<<(std::ostream& os, binary_op const& bop)
        {
            os << "(" << bop.lhs;
            for (auto& rhs : bop.chained)
                os << rhs.op << rhs.e;
            return os << ")";
        }
    };

    using program = expression;
    using boost::fusion::operator<<;
} // namespace client::ast

BOOST_FUSION_ADAPT_STRUCT(client::ast::xlfunction, name, args)
BOOST_FUSION_ADAPT_STRUCT(client::ast::xlreference, colname, rownum)
BOOST_FUSION_ADAPT_STRUCT(client::ast::binary_op, lhs, chained)
BOOST_FUSION_ADAPT_STRUCT(client::ast::binary_op::chained_t, op, e)

namespace client::calculator_grammar {
    struct expression_class //: x3::annotate_on_success
    {
        //  Our error handler
        template <typename Iterator, typename Exception, typename Context>
        x3::error_handler_result on_error(Iterator& q, Iterator const& last,
                                          Exception const& x,
                                          Context const&   context)
        {
            std::cout                                          //
                << "Error! Expecting: " << x.which()           //
                << " here: \"" << std::string(x.where(), last) //
                << "\"" << std::endl;

            return x3::error_handler_result::fail;
        }
    };

    x3::rule<struct identifier_class, ast::identifier>  const identifier{"identifier"};
    x3::rule<struct xlreference,      ast::xlreference> const xlreference{"xlreference"};
    x3::rule<struct xlfunction_class, ast::xlfunction>  const xlfunction{"xlfunction"};
    x3::rule<struct factor_class,     ast::expression>  const factor{"factor"};
    x3::rule<struct expression_class, ast::binary_op>   const expression{"expression"};
    x3::rule<struct term_class,       ast::binary_op>   const term{"term"};

    auto const xlreference_def =
        x3::lexeme[+x3::alpha >> x3::uint_] /*>> !x3::char_('(')*/;

    auto const identifier_def =
        x3::raw[x3::lexeme[x3::alpha >> *(x3::alpha | '_') /*>> !x3::digit*/]];

    auto const string_literal =
        x3::rule<struct _, ast::string_literal>{"string_literal"} //
    = x3::lexeme['"' > *('\\' >> x3::char_ | ~x3::char_('"')) > '"'];

    auto const factor_def =        //
        xlfunction                 //
        | '(' >> expression >> ')' //
        | x3::double_              //
        | string_literal           //
        | xlreference              //
        | identifier               //
        ;

    auto const term_def       = factor >> *(x3::char_("*/") >>  factor);
    auto const expression_def = term >> *(x3::char_("-+") >> term);
    auto const xlfunction_def = identifier >> '(' >> -(expression % ',') >> ')';

    // Tie each rule to its *_def definition.
    BOOST_SPIRIT_DEFINE(xlreference, identifier, xlfunction, term, factor, expression)
} // namespace client::calculator_grammar

int main() {
    std::vector<int> positions; // TODO

    auto parser = x3::with<struct position_cache_tag /*TODO*/>        //
        (std::ref(positions))                                         //
            [                                                         //
                x3::skip(x3::space)[                                  //
                    client::calculator_grammar::expression >> x3::eoi //
    ]                                                                 //
            ];

    struct {
        std::string              category;
        std::vector<std::string> cases;
    } test_table[] = {
        {"xlreference", {"A1", "A1111", "AbCdZ9876543", "i9", "i0"}},
        {"identifier", {"i", "id", "id_entifier"}},
        {"number", {"123", "inf", "-inf", "NaN", ".99e34", "1e-8", "1.e-8", "+9"}},
        {"binaries",
         {                                                       //
          "3+4", "3*4",                                          //
          "3+4+5", "3*4*5", "3+4*5", "3*4+5", "3*4+5",           //
          "3+(4+5)", "3*(4*5)", "3+(4*5)", "3*(4+5)", "3*(4+5)", //
          "(3+4)+5", "(3*4)*5", "(3+4)*5", "(3*4)+5", "(3*4)+5"}},
        {"xlfunction",
         {
             "pi()",
             "sin(4)",
             R"--(IIF(A1, "Red", "Green"))--",
         }},
        {"invalid",
         {
             "A9()", // an xlreference may not be followed by ()
             "",     // you didn't specify
         }},
        {"other",
         {
             "A-9",    // 1-letter identifier and binary operation
             "1 + +9", // unary plus accepted in number rule
         }},
        {"question",
         {
             "myfunc(myparam1, myparam2)",
             "A1",
             "AA234",
             "A1 + sin(A2+3)",
         }},
    };

    for (auto& [cat, cases] : test_table) {
        for (std::string const& str : cases) {
            auto iter = begin(str), last(end(str));
            std::cout << std::setw(12) << cat << ": ";

            client::ast::program ast;
            if (parse(iter, last, parser, ast)) {
                std::cout << "parsed: " << ast;
            } else {
                std::cout << "failed: " << std::quoted(str);
            }

            if (iter == last) {
                std::cout << "\n";
            } else {
                std::cout << " unparsed: "
                          << std::quoted(std::string_view(iter, last)) << "\n";
            }
        }
    }
}


Output:

 xlreference: parsed: (((A 1)))
 xlreference: parsed: (((A 1111)))
 xlreference: parsed: (((AbCdZ 9876543)))
 xlreference: parsed: (((i 9)))
 xlreference: parsed: (((i 0)))
  identifier: parsed: ((i))
  identifier: parsed: ((id))
  identifier: parsed: ((id_entifier))
      number: parsed: ((123))
      number: parsed: ((inf))
      number: parsed: ((-inf))
      number: parsed: ((nan))
      number: parsed: ((9.9e+33))
      number: parsed: ((1e-08))
      number: parsed: ((1e-08))
      number: parsed: ((9))
    binaries: parsed: ((3)+(4))
    binaries: parsed: ((3*4))
    binaries: parsed: ((3)+(4)+(5))
    binaries: parsed: ((3*4*5))
    binaries: parsed: ((3)+(4*5))
    binaries: parsed: ((3*4)+(5))
    binaries: parsed: ((3*4)+(5))
    binaries: parsed: ((3)+(((4)+(5))))
    binaries: parsed: ((3*((4*5))))
    binaries: parsed: ((3)+(((4*5))))
    binaries: parsed: ((3*((4)+(5))))
    binaries: parsed: ((3*((4)+(5))))
    binaries: parsed: ((((3)+(4)))+(5))
    binaries: parsed: ((((3*4))*5))
    binaries: parsed: ((((3)+(4))*5))
    binaries: parsed: ((((3*4)))+(5))
    binaries: parsed: ((((3*4)))+(5))
  xlfunction: parsed: ((pi())
  xlfunction: parsed: ((sin(((4))))
  xlfunction: parsed: ((IIF((((A 1))), (("Red")), (("Green"))))
     invalid: failed: "A9()" unparsed: "A9()"
     invalid: failed: ""
       other: parsed: ((A)-(9))
       other: parsed: ((1)+(9))
    question: parsed: ((myfunc((((myparam 1))), (((myparam 2)))))
    question: parsed: (((A 1)))
    question: parsed: (((AA 234)))
    question: parsed: (((A 1))+(sin((((A 2))+(3))))






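With BOOST_SPIRIT_X3_DEBUG defined before the X3 header is included, the last test case produces a trace like this (abridged):

    question: <expression>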
  <try>A1 + sin(A2+3)</try>
    <try>A1 + sin(A2+3)</try>
      <try>A1 + sin(A2+3)</try>
        <try>A1 + sin(A2+3)</try>
          <try>A1 + sin(A2+3)</try>
          <success>1 + sin(A2+3)</success>
        <try>A1 + sin(A2+3)</try>
        <try>A1 + sin(A2+3)</try>
        <success> + sin(A2+3)</success>
        <attributes>[[A], 1]</attributes>
      <success> + sin(A2+3)</success>
      <attributes>[[A], 1]</attributes>
    <success> + sin(A2+3)</success>
    <attributes>[[[A], 1], []]</attributes>
    <try> sin(A2+3)</try>
      <try> sin(A2+3)</try>
        <try> sin(A2+3)</try>
          <try> sin(A2+3)</try>
          <attributes>[s, i, n]</attributes>
                <attributes>[[A], 2]</attributes>
              <attributes>[[A], 2]</attributes>
            <attributes>[[[A], 2], []]</attributes>
            <attributes>[3, []]</attributes>
          <attributes>[[[[A], 2], []], [[+, [3, []]]]]</attributes>
        <attributes>[[s, i, n], [[[[[A], 2], []], [[+, [3, []]]]]]]</attributes>
      <attributes>[[s, i, n], [[[[[A], 2], []], [[+, [3, []]]]]]]</attributes>
    <attributes>[[[s, i, n], [[[[[A], 2], []], [[+, [3, []]]]]]], []]</attributes>
  <attributes>[[[[A], 1], []], [[+, [[[s, i, n], [[[[[A], 2], []], [[+, [3, []]]]]]], []]]]]</attributes>
parsed: (((A 1))+(sin((((A 2))+(3))))
