Parsing Infix Mathematical Expressions in Swift Using Regular Expressions

Question

I would like to convert a string formatted as an infix mathematical expression into an array of tokens, using regular expressions. I'm very new to regular expressions, so forgive me if the answer to this question turns out to be trivial.

For example:

"31+2--3*43.8/1%(1*2)" -> ["31", "+", "2", "-", "-3", "*", "43.8", "/", "1", "%", "(", "1", "*", "2", ")"]

I've already implemented a method that does this, but it takes many lines of code and a few nested loops. I figured that when I define more operators or functions that may consist of multiple characters, such as log or cos, it would be easier to edit a regex string than to add many more lines of code to my working function. Are regular expressions the right tool for this job, and if so, where am I going wrong? Or am I better off extending my working parser?

I've already referred to the following SO posts:

How to split a string, but also keep the delimiters?

This one was very helpful, but I don't believe I'm using 'lookahead' correctly.

Validate mathematical expressions using regular expression?

The solution to the question above doesn't convert the string into an array of tokens. Rather, it checks to see if the given string is a valid mathematical expression.

My code is as follows:

func convertToInfixTokens(expression: String) -> [String]?
{
    do
    {
        let pattern = "^(((?=[+-/*]))(-)?\\d+(\\.\\d+)?)*"

        let regex = try NSRegularExpression(pattern: pattern)

        let results = regex.matches(in: expression, range: NSRange(expression.startIndex..., in: expression))

        return results.map
        {
            String(expression[Range($0.range, in: expression)!])
        }
    }
    catch
    {
        return nil
    }
}

When I do pass a valid infix expression to this function, it returns nil. Where am I going wrong with my regex string?

NOTE: I haven't even gotten to the point of trying to parse parentheses as individual tokens. I'm still figuring out why it won't work on this expression:

"-99+44+2+-3/3.2-6"

Any feedback is appreciated, thanks!

Tags: swift, regex

Solution


Your pattern does not work for two reasons. First, the ^ anchor means it can only match at the start of the string. Second, the (?=[+-/*]) positive lookahead requires the next character to be one of the characters in that class, yet the only operator the pattern can then consume is an optional -. So when the * quantifier tries to match the group a second time in -99+44+2+-3/3.2-6, it is positioned at +44: -? matches nothing and \d+ then fails on the +, so matching stops after -99. Note also that +-/ inside a character class is a range (it covers + , - . /), not three literal characters; put the - first or last, as in [-+*/], to match it literally.
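
To see this behaviour directly, the original pattern can be run on the failing input. This is only a small sketch: the helper name allMatches(of:in:) is made up for illustration and is not part of the question's code. On this input it should report a single match, the leading -99, and nothing after it.

```swift
import Foundation

// Hypothetical helper (not from the question): returns every substring
// of `text` that `pattern` matches, in order.
func allMatches(of pattern: String, in text: String) -> [String] {
    guard let regex = try? NSRegularExpression(pattern: pattern) else { return [] }
    let fullRange = NSRange(text.startIndex..., in: text)
    return regex.matches(in: text, range: fullRange).compactMap { match in
        Range(match.range, in: text).map { String(text[$0]) }
    }
}

// The pattern exactly as written in the question.
let originalPattern = "^(((?=[+-/*]))(-)?\\d+(\\.\\d+)?)*"
let anchoredMatches = allMatches(of: originalPattern, in: "-99+44+2+-3/3.2-6")

// The ^ anchor pins the match to the start, and the group cannot
// consume the + in front of 44, so only the first number is found.
print(anchoredMatches)
```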

You may tokenize the expression using

let pattern = "(?<!\\d)-?\\d+(?:\\.\\d+)?|[-+*/%()]"

Details

  • (?<!\d) - there should be no digit immediately to the left of the current position, so a - that directly follows a digit is left to be matched as a binary operator
  • -? - an optional -
  • \d+ - 1 or more digits
  • (?:\.\d+)? - an optional sequence of . and 1+ digits
  • | - or
  • [-+*/%()] - any single operator or parenthesis character: -, +, *, /, %, ( or )
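
Putting the pieces together, the function from the question with this pattern substituted might look like the sketch below. The body is lightly restructured (try? and compactMap instead of do/catch and a force unwrap), which is a stylistic choice here rather than something the fix requires.

```swift
import Foundation

/// Splits an infix expression into number, operator, and parenthesis tokens.
/// Returns nil only if the pattern fails to compile.
func convertToInfixTokens(expression: String) -> [String]? {
    // Either a number (with a leading - when no digit precedes it)
    // or a single operator / parenthesis character.
    let pattern = "(?<!\\d)-?\\d+(?:\\.\\d+)?|[-+*/%()]"

    guard let regex = try? NSRegularExpression(pattern: pattern) else {
        return nil
    }

    let fullRange = NSRange(expression.startIndex..., in: expression)

    // Each match is one token; convert its NSRange back to a Swift substring.
    return regex.matches(in: expression, range: fullRange).compactMap { match in
        Range(match.range, in: expression).map { String(expression[$0]) }
    }
}

// The two expressions mentioned in the question.
print(convertToInfixTokens(expression: "31+2--3*43.8/1%(1*2)") as Any)
print(convertToInfixTokens(expression: "-99+44+2+-3/3.2-6") as Any)
```

Because the pattern is no longer anchored, matches(in:range:) walks the whole string and returns one match per token, so no extra splitting step is needed.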

Output using your function:

Optional(["31", "+", "2", "-", "-3", "*", "43.8", "/", "1", "%", "(", "1", "*", "2", ")"])
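
The question also asks about multi-character names such as log or cos. One way to handle them, sketched here as an assumption rather than as part of the pattern above, is to add another alternative to the regex. The [a-z]+ branch and the tokenize(_:) name are both hypothetical.

```swift
import Foundation

// Hypothetical variant of the tokenizer: the "[a-z]+" alternative is an
// assumption added so that names like log or cos come out as single tokens.
func tokenize(_ expression: String) -> [String] {
    let pattern = "(?<!\\d)-?\\d+(?:\\.\\d+)?|[a-z]+|[-+*/%()]"

    guard let regex = try? NSRegularExpression(pattern: pattern) else {
        return []
    }

    let fullRange = NSRange(expression.startIndex..., in: expression)
    return regex.matches(in: expression, range: fullRange).compactMap { match in
        Range(match.range, in: expression).map { String(expression[$0]) }
    }
}

let functionTokens = tokenize("cos(3.5)*-2+log(10)")
print(functionTokens)
// ["cos", "(", "3.5", ")", "*", "-2", "+", "log", "(", "10", ")"]
```

Adding a new kind of token then means editing one alternative in the pattern string, which is the convenience the question was hoping for. One caveat: the lookbehind only checks for a digit, so a minus that directly follows a closing parenthesis, as in cos(0)-1, would be tokenized as part of -1 rather than as a binary operator.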
