antlr4 python: listener does not show everything when parsing json

Question

I am using the g4 json grammar given here:

grammar JSON;

json
   : value
   ;

obj
   : '{' pair (',' pair)* '}'
   | '{' '}'
   ;

pair
   : STRING ':' value
   ;

array
   : '[' value (',' value)* ']'
   | '[' ']'
   ;

value
   : STRING
   | NUMBER
   | obj
   | array
   | 'true'
   | 'false'
   | 'null'
   ;


STRING
   : '"' (ESC | SAFECODEPOINT)* '"'
   ;


fragment ESC
   : '\\' (["\\/bfnrt] | UNICODE)
   ;
fragment UNICODE
   : 'u' HEX HEX HEX HEX
   ;
fragment HEX
   : [0-9a-fA-F]
   ;
fragment SAFECODEPOINT
   : ~ ["\\\u0000-\u001F]
   ;


NUMBER
   : '-'? INT ('.' [0-9] +)? EXP?
   ;


fragment INT
   : '0' | [1-9] [0-9]*
   ;

// no leading zeros

fragment EXP
   : [Ee] [+\-]? INT
   ;

// \- since - means "range" inside [...]

WS
   : [ \t\n\r] + -> skip
   ;

Below is a JSON sample from Wikipedia that I would like to parse using the grammar above:

to_parse = r'''
{
  "firstName": "John",
  "lastName": "Smith",
  "isAlive": true,
  "age": 27,
  "address": {
    "streetAddress": "21 2nd Street",
    "city": "New York",
    "state": "NY",
    "postalCode": "10021-3100"
  },
  "phoneNumbers": [
    {
      "type": "home",
      "number": "212 555-1234"
    },
    {
      "type": "office",
      "number": "646 555-4567"
    },
    {
      "type": "mobile",
      "number": "123 456-7890"
    }
  ],
  "children": [],
  "spouse": null
}
'''

I am using the antlr4 Python runtime, after having generated the lexer (lexer_class), the parser (parser_class) and the listener (listener_class):

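import inspect

from antlr4 import CommonTokenStream, InputStream, ParseTreeWalker


# lexer_class, parser_class and listener_class are the generated classes.
class MyListener(listener_class):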

    def enterJson(self, ctx):
        print(inspect.stack()[0][3])

    def exitJson(self, ctx):
        print(inspect.stack()[0][3])

    def enterObj(self, ctx):
        print(inspect.stack()[0][3])

    def exitObj(self, ctx):
        print(inspect.stack()[0][3])

    def enterPair(self, ctx):
        print(inspect.stack()[0][3])

    def exitPair(self, ctx):
        print(inspect.stack()[0][3])

    def enterArray(self, ctx):
        print(inspect.stack()[0][3])

    def exitArray(self, ctx):
        print(inspect.stack()[0][3])

    def enterValue(self, ctx):
        print(inspect.stack()[0][3])

    def exitValue(self, ctx):
        print(inspect.stack()[0][3])


input_stream = InputStream(to_parse)
lexer = lexer_class(input_stream)
token_stream = CommonTokenStream(lexer)
parser = parser_class(token_stream)
# Entry point in the json g4 grammar: json
tree = parser.json()
my_listener = MyListener()
walker = ParseTreeWalker()
walker.walk(my_listener, tree)

This only outputs:

enterJson
enterValue
exitValue
exitJson

Is it normal that my code does not show any array or obj events?
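
As a side note on the listener above: `inspect.stack()[0][3]` does evaluate to the name of the method that is currently running, so the printing idiom itself is unlikely to be what hides the missing events. The sketch below checks this with the standard library only. The `Demo` class is hypothetical and stands in for the listener; it does not use ANTLR at all.

```python
import inspect


class Demo:
    """Hypothetical stand-in for the listener; no ANTLR involved."""

    def enterJson(self, ctx=None):
        # Same idiom as in the question: entry 0 of the stack is the
        # current call, and index 3 of that entry is the function name.
        return inspect.stack()[0][3]

    def exitJson(self, ctx=None):
        # Equivalent lookup that avoids building the whole stack list.
        # Assumes CPython, where currentframe() does not return None.
        return inspect.currentframe().f_code.co_name


demo = Demo()
print(demo.enterJson())
print(demo.exitJson())
```

Both calls return the name of the enclosing method, which is why replacing the `inspect` calls with literal strings should not change what the listener prints.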

[EDIT]

I am using the following command to generate the *.py files (lexer, parser and listener):

java -cp antlr-4.7.2-complete.jar org.antlr.v4.Tool -o ./generation -Dlanguage=Python3 JSON.g4

Tags: python, json, antlr, antlr4

Answer


I cannot reproduce this. When I generate the classes from the grammar you posted and then run the following script:

from antlr4 import *
from JSONLexer import JSONLexer as lexer_class
from JSONParser import JSONParser as parser_class
from JSONListener import JSONListener as listener_class


class MyListener(listener_class):

    def enterJson(self, ctx):
        print("enterJson")

    def exitJson(self, ctx):
        print("exitJson")

    def enterObj(self, ctx):
        print("enterObj")

    def exitObj(self, ctx):
        print("exitObj")

    def enterPair(self, ctx):
        print("enterPair")

    def exitPair(self, ctx):
        print("exitPair")

    def enterArray(self, ctx):
        print("enterArray")

    def exitArray(self, ctx):
        print("exitArray")

    def enterValue(self, ctx):
        print("enterValue")

    def exitValue(self, ctx):
        print("exitValue")


to_parse = r'''
{
  "firstName": "John",
  "lastName": "Smith",
  "isAlive": true,
  "age": 27,
  "address": {
    "streetAddress": "21 2nd Street",
    "city": "New York",
    "state": "NY",
    "postalCode": "10021-3100"
  },
  "phoneNumbers": [
    {
      "type": "home",
      "number": "212 555-1234"
    },
    {
      "type": "office",
      "number": "646 555-4567"
    },
    {
      "type": "mobile",
      "number": "123 456-7890"
    }
  ],
  "children": [],
  "spouse": null
}
'''

input_stream = InputStream(to_parse)
lexer = lexer_class(input_stream)
token_stream = CommonTokenStream(lexer)
parser = parser_class(token_stream)
tree = parser.json()
my_listener = MyListener()
walker = ParseTreeWalker()
walker.walk(my_listener, tree)

the following is printed to my console:

enterJson
enterValue
enterObj
enterPair
enterValue
exitValue
exitPair
enterPair
enterValue
exitValue
exitPair
enterPair
enterValue
exitValue
exitPair
enterPair
enterValue
exitValue
exitPair
enterPair
enterValue
enterObj
enterPair
enterValue
exitValue
exitPair
enterPair
enterValue
exitValue
exitPair
enterPair
enterValue
exitValue
exitPair
enterPair
enterValue
exitValue
exitPair
exitObj
exitValue
exitPair
enterPair
enterValue
enterArray
enterValue
enterObj
enterPair
enterValue
exitValue
exitPair
enterPair
enterValue
exitValue
exitPair
exitObj
exitValue
enterValue
enterObj
enterPair
enterValue
exitValue
exitPair
enterPair
enterValue
exitValue
exitPair
exitObj
exitValue
enterValue
enterObj
enterPair
enterValue
exitValue
exitPair
enterPair
enterValue
exitValue
exitPair
exitObj
exitValue
exitArray
exitValue
exitPair
enterPair
enterValue
enterArray
exitArray
exitValue
exitPair
enterPair
enterValue
exitValue
exitPair
exitObj
exitValue
exitJson
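
To make it easier to check a trace like the one above, here is a minimal sketch that mimics the enter/exit order of a depth-first listener walk for this grammar. It is not ANTLR: it uses Python's built-in `json` module to build the structure, and the `walk` and `trace` helpers are hypothetical names introduced for illustration. Strings, numbers and literals are terminals in the grammar, so they only produce the surrounding `value` events.

```python
import json


def walk(value, events):
    """Record enter/exit events for one `value` node, depth first."""
    events.append("enterValue")
    if isinstance(value, dict):
        # obj : '{' pair (',' pair)* '}' | '{' '}'
        events.append("enterObj")
        for child in value.values():
            # pair : STRING ':' value  (the key is a terminal, no rule event)
            events.append("enterPair")
            walk(child, events)
            events.append("exitPair")
        events.append("exitObj")
    elif isinstance(value, list):
        # array : '[' value (',' value)* ']' | '[' ']'
        events.append("enterArray")
        for child in value:
            walk(child, events)
        events.append("exitArray")
    events.append("exitValue")


def trace(text):
    """Return the event names for a JSON document, wrapped in the `json` rule."""
    events = ["enterJson"]
    walk(json.loads(text), events)
    events.append("exitJson")
    return events


sample = '{"name": "John", "phones": [{"type": "home"}], "children": []}'
for name in trace(sample):
    print(name)
```

If a real listener prints only the `json` and `value` events for an input that this sketch expands into `obj`, `pair` and `array` events, the parse tree that was walked probably does not contain those nodes, which points at the generated classes or the input rather than at the listener code.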
