首页 > 解决方案 > Python AST - 检测表达式中的非法标记

问题描述

我应该如何优化compile()函数(解析器)以:

  1. 只接受标记:'numeric string (integer)', ADD, SUB, MUL, 和DIV

  2. 检测带有错误显示的非法令牌/表达式(立即停止):

    ERROR: Unexpected token at # 4

  3. ADD, SUB, MUL, 并且DIV必须恰好有两个标记:

['numeric string (integer)'ADDSUBMULDIV]

['numeric string (integer)'ADDSUBMULDIV]

  1. 该函数应该从最内向的表达式开始计算表达式。

代码:

def is_operator(token):
    return (token in ['ADD', 'SUB', 'MUL', 'DIV'] or token.isnumeric())


# Compiles a list of tokens
# into an abstract syntax tree (AST). 
# 
# A token represents either a number (e.g. "42") 
# or an operator ("ADD", "SUB", "MUL" or "DIV"). 
# 
# A node of the returned AST is either a string 
# that represents a number (e.g. "42") 
# or a tuple in the form of (operator, child1, child2) 
# where child1 and child2 are the subtrees of the AST. 
# 
# This function returns (ast, unprocessed_tokens) 
# where unprocessed_tokens are the tokens yet to compile. 
# 
def compile(tokens):
    if len(tokens) == 0:
        return ('NOOP',)
    token = tokens[0]
    if is_operator(token):
        child1, unprocessed = compile(tokens[1:])
        child2, unprocessed = compile(unprocessed)
        return (token, child1, child2), unprocessed
    else:
        return token, tokens[1:]


line = ['ADD', 'SUB', '2', '2', 'MUL', '6', 'DIV', '10', '2']
print(compile(line)[0])
# => ('ADD', ('SUB', '2', '2'), ('MUL', '6', ('DIV', '10', '2')))

样品#1:

输入:

line = ['ADD', 'SUB', '3', '2', 'MUL', '6', 'DIV', '10', '2']

预期输出:

AST: ('ADD', ('SUB', '3', '2'), ('MUL', '6', ('DIV', '10', '2')))

ANSWER: 31

样品#2:

输入:

line = ['ADD', '3', '2', '1']

预期输出:

ERROR: Unexpected token at # 4

PS:

我知道解析库,我不会朝那个方向走。我需要通过使用本机 python 代码来解决它。

我已经编写了lexer()函数。

标签: pythonalgorithmrecursionabstract-syntax-tree

解决方案


推荐阅读