首页 > 解决方案 > Antlr 深度规则集性能问题

问题描述

我尝试制作一种语法来理解类 C# 语言的表达式优先级:

var a = expression0.expression1(expression2 + expression3() * expression4)

当正确的优先级变为:

var a = (expression0.expression1)(expression2 + ((expression3()) * expression4))

为此,我将表达式按优先级排序为规则。这是我语法的相关摘录:

expression: assignmentExpression;
assignmentExpression
    : equalityExpression ASSIGNMENT assignmentExpression
    | equalityExpression
    ;
equalityExpression
    : logicalExpression equals equalityExpression
    | logicalExpression notEquals equalityExpression
    | logicalExpression
    ;
logicalExpression
    : relationalExpression and logicalExpression
    | relationalExpression or logicalExpression
    | relationalExpression
    ;
relationalExpression
    : addExpression greaterThan relationalExpression
    | addExpression lessThan relationalExpression
    | addExpression greaterOrEquals relationalExpression
    | addExpression lessOrEquals relationalExpression
    | addExpression
    ;
addExpression
    : multiplyExpression add addExpression
    | multiplyExpression subtract addExpression
    | multiplyExpression
    ;
multiplyExpression
    : conversionExpression multiply multiplyExpression
    | conversionExpression divide multiplyExpression
    | conversionExpression
    ;
conversionExpression
    : unaryExpression AS conversionExpression
    | unaryExpression
    ;
unaryExpression
    : subtract unaryExpression
    | add unaryExpression
    | ifExpression
    ;
ifExpression
    : IF PAREN expression ENDPAREN expression (ELSE expression)?
    | instantiationExpression
    ;
instantiationExpression
    : NEW invocationExpression
    | invocationExpression
    ;
invocationExpression
    : reachExpression PAREN arguments? ENDPAREN
    | reachExpression
    ;
reachExpression
    : primeExpression DOT primeExpression
    | primeExpression
    ;

在以下代码中使用它:

{ int a = 7 }

产生以下输出:

expression
 assignmentExpression
  equalityExpression
   logicalExpression
    relationalExpression
     addExpression
      multiplyExpression
       conversionExpression
        unaryExpression
         ifExpression
          instantiationExpression
           invocationExpression
            reachExpression
             primeExpression
              blockExpression
               "{"
               statement
                variableDeclaration
                 variableType
                  type
                   identifier
                    "int"
                 identifier
                  "a"
                 variableInitialization
                  "="
                  expression
                   assignmentExpression
                    equalityExpression
                     logicalExpression
                      relationalExpression
                       addExpression
                        multiplyExpression
                         conversionExpression
                          unaryExpression
                           ifExpression
                            instantiationExpression
                             invocationExpression
                              reachExpression
                               primeExpression
                                literal
                                 integer
                                  "7"
               "}"

它可能有问题,但我无法真正测试它,因为这种语法会导致疯狂的性能损失。我之前从左到右解析表达式的语法需要几毫秒来解析,而这个需要整整 2 分钟。

我已经把它追查到了ParserATNSimulator.AdaptivePredict,而且在内心深处,这个问题是一个疯狂的深层堆栈

ParserATNSimulator.Closure
ParserATNSimulator.ClosureCheckingStopState
ParserATNSimulator.Closure_

而不是为了得到正确的表达式(数字字面量:7)而去一次兔子洞,对于每个规则级别,它似乎一直向下一次。因此,这不是 O(n),而是需要 O(n^n) 才能到达那个该死的“7”。

这是 Antlr 的问题,是我的语法问题还是问题本身?


编辑:好的,我已经设法通过以这种风格重写我的语法来消除性能问题:

assignmentExpression: equalityExpression (ASSIGNMENT assignmentExpression)?;

但我仍然不明白为什么会发生这种情况以及这种变化是如何解决问题的。

标签: antlr4context-free-grammar

解决方案


推荐阅读