Regex to select newlines between quotes

Question

I have a string in Ruby that looks something like this:

{
  "a boolean": true,
  "multiline": "
my
multiline
value
",
  "a normal key": "a normal value"
}

I want to match only the newlines inside this substring:

"
my
multiline
value
",

That way I can replace them with escaped newlines. In the long run, the goal is to make the JSON easier to work with.

Tags: regex, ruby

Answer


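Update - These regexes work as intended.
From @faissaloo: "it seemed to fail however on my large JSON"
I ran that large string through both regexes:
PCRE https://regex101.com/r/3jtqea/1
Ruby https://regex101.com/r/1HVCCC/1
They behave identically and show no defects.
If you have any other issues, let me know.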


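I think Ruby supports Perl-like constructs.
If so, it can be done in a single global find and replace.
Like this:

Edit - Ruby does not support the backtracking control verbs (*SKIP)(*FAIL), so to do this in Ruby code the regex needs to be more explicit.
So, with a slight modification of the PCRE/Perl regex, the Ruby equivalent is:

Ruby
Find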

(?-m)((?!\A)\G|(?:(?>[^"]*"[^"\r\n]*"[^"]*))*")([^"\r\n]*)\K\r?\n(?=[^"]*")((?:[^"\r\n]*"(?:(?>[^"]*"[^"\r\n]*"))*[^"]*)?)

Replace

\\n\3

https://regex101.com/r/BaqjEE/1
https://rextester.com/NVFD38349

Explained (but it's complicated)

 (?-m)                                    # Non-multiline mode safety check
 (                                        # (1 start), Prefix. Capture for debug
      (?! \A )                                 # Not BOS
      \G                                       # Test where last match left off

   |                                         # or, 
      (?:                                      # Optionally align to next " ( only used once )
           (?> [^"]* " [^"\r\n]* " [^"]* )
      )*

      "                                        # A new quote to test
 )                                        # (1 end)

 ( [^"\r\n]* )                            # (2), Line break Preamble. Capture for debug
 \K                                       # Exclude from the match (group 0) up to this point

 \r? \n                                   # Line break to escape

 (?= [^"]* " )                            # Validate we have " closure

 (                                        # (3 start), Optional end quote and alignment.
                                               # To be written back.
      (?:
           [^"\r\n]* "                   
           (?:                                      # Optionally align to next "
                (?> [^"]* " [^"\r\n]* " )
           )*
           [^"]* 
      )?
 )                                        # (3 end)
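The pattern above leans on two constructs, \G and \K. As a minimal sketch of what each one does on its own (the sample strings and variable names here are made up for illustration and are not from the answer):

```ruby
# \K discards everything matched so far from the reported match,
# so only the text after \K gets replaced.
kept = "key=old".sub(/key=\Kold/, "new")
puts kept       # key=new

# \G anchors each gsub iteration to where the previous match ended,
# so only the leading run of spaces is replaced, not the later one.
anchored = "  a b".gsub(/\G /, "_")
puts anchored   # __a b
```

In the full regex, \G lets the next match continue inside the same quoted string, and \K keeps the prefix out of the replaced text so that only the line break itself is rewritten.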


# Ruby code (ruby 2.3.1)

re = /(?-m)((?!\A)\G|(?:(?>[^"]*"[^"\r\n]*"[^"]*))*")([^"\r\n]*)\K\r?\n(?=[^"]*")((?:[^"\r\n]*"(?:(?>[^"]*"[^"\r\n]*"))*[^"]*)?)/
str = '{
  "a boolean": true,
  "a boolean": true,
  "a boolean": true,
  "a boolean": true,
  "multiline": "
my
multiline
value
asdf"
,

"a multiline boo
lean": true,
"a normal key": "a multiline

value"
}'
subst = '\\n\3'

result = str.gsub(re, subst)

# Print the result of the substitution
puts result
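For comparison, here is a simpler sketch that is not the single-regex method from this answer: it matches each quoted span first and then escapes the line breaks inside it with a nested gsub. The helper name escape_newlines_in_quotes is hypothetical, and the sketch assumes the string values contain no escaped quotes (\"), which would throw off the quote pairing.

```ruby
require 'json'

# Hypothetical helper, not part of the original answer.
# Assumption: no escaped quotes (\") appear inside string values.
def escape_newlines_in_quotes(text)
  # Match each double-quoted span, line breaks included.
  text.gsub(/"[^"]*"/) do |quoted|
    # Replace every CRLF or LF inside the span with a literal backslash + n.
    quoted.gsub(/\r?\n/) { '\n' }
  end
end

raw = <<~TEXT
  {
    "a boolean": true,
    "multiline": "
  my
  multiline
  value
  ",
    "a normal key": "a normal value"
  }
TEXT

fixed = escape_newlines_in_quotes(raw)
data = JSON.parse(fixed)
puts data["multiline"].inspect   # "\nmy\nmultiline\nvalue\n"
```

The block form of gsub is used for the inner replacement so the returned string is inserted as-is, without any backslash interpretation of the replacement text.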

For PCRE/Perl
Find

(?:((?:(?>[^"]*"[^"\n]*"[^"]*))+(*SKIP)(*FAIL)|"|(?!^)\G)([^"\n]*)\K\n(?=[^"]*")((?:[^"\n]*")?))

Replace

\\n$3

https://regex101.com/r/06naae/1

Explained (but it's complicated)
Note: if you are on a Windows box where the editor requires CRLF line breaks, add a \r in front of the LF, as in \r\n.

 (?:
      (                             # (1 start), Prefix capture, for debug
           (?:
                (?> [^"]* " [^"\n]* " [^"]* )
           )+
           (*SKIP) (*FAIL)               # Consume false positives, but ignore them
                                         # (need this to align next ")
        |                              # or,
           "                             # A new quote to test
        |                              # or, 
           (?! ^ )                       # Not BOS
           \G                            # Test where last match left off
      )                             # (1 end)

      ( [^"\n]* )                   # (2), Preamble capture, for debug
      \K                            # Exclude from the match (group 0) up to this point
      \n                            # Line break to escape
      (?= [^"]* " )                 # Validate we have " closure
      (                             # (3 start), End quote, to be written back
           (?: [^"\n]* " )?
      )                             # (3 end)
 )
