Regex to match certain XML tags

Problem description

Consider the following string:

<Document>
<name>title</name>
<visibility>1</visibility>
<Style id="KMLStyler">
<IconStyle>

<Placemark id="kml_2">
<name>kml_2</name>
<snippet> </snippet>
<description>
.....
<Placemark id="kml_4">
<name>kml_4</name>
<snippet> </snippet>
<description><![CDATA[<center><table><tr><th colspan='2' align='center'><em>Attributes</em></th></tr>

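I want to match everything between <name> and </name>, but only when the closing tag is followed by a <snippet> tag.

I have this regex: <name>[\s\S]*?<\/name>[\r\n]<snippet> <\/snippet>

It almost works, but the match also picks up the first <name> tag, <name>title</name>, which is not followed by <snippet>.

How can I write a regex that matches only the other name tags?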
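
The over-match described above can be reproduced with a short Python sketch. The `SAMPLE` string is a trimmed, assumed stand-in for the document in the question (LF line endings, elided parts left out), and the variable names are illustrative only. Because `[\s\S]*?` may run across any number of tags, the first match starts at `<name>title</name>` and extends to the first `</name>` that is followed by a `<snippet>` line.

```python
import re

# Trimmed stand-in for the KML text in the question (assumed, LF line endings).
SAMPLE = (
    "<Document>\n"
    "<name>title</name>\n"
    "<visibility>1</visibility>\n"
    '<Placemark id="kml_2">\n'
    "<name>kml_2</name>\n"
    "<snippet> </snippet>\n"
    '<Placemark id="kml_4">\n'
    "<name>kml_4</name>\n"
    "<snippet> </snippet>\n"
)

# The regex from the question, unchanged.
original = re.compile(r"<name>[\s\S]*?<\/name>[\r\n]<snippet> <\/snippet>")

original_matches = original.findall(SAMPLE)

# The first match begins at <name>title</name> and runs through kml_2.
for m in original_matches:
    print(repr(m))
```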

Tags: regex

Solution


Use

<name>(?:(?!<name>)[\s\S])*?<\/name>[\r\n]<snippet> <\/snippet>

Explanation

--------------------------------------------------------------------------------
  <name>                   '<name>'
--------------------------------------------------------------------------------
  (?:                      group, but do not capture (0 or more times
                           (matching the least amount possible)):
--------------------------------------------------------------------------------
    (?!                      look ahead to see if there is not:
--------------------------------------------------------------------------------
      <name>                   '<name>'
--------------------------------------------------------------------------------
    )                        end of look-ahead
--------------------------------------------------------------------------------
    [\s\S]                   any character of: whitespace (\n, \r,
                             \t, \f, and " "), non-whitespace (all
                             but \n, \r, \t, \f, and " ")
--------------------------------------------------------------------------------
  )*?                      end of grouping
--------------------------------------------------------------------------------
  <                        '<'
--------------------------------------------------------------------------------
  \/                       '/'
--------------------------------------------------------------------------------
  name>                    'name>'
--------------------------------------------------------------------------------
  [\r\n]                   any character of: '\r' (carriage return),
                           '\n' (newline)
--------------------------------------------------------------------------------
  <snippet> <              '<snippet> <'
--------------------------------------------------------------------------------
  \/                       '/'
--------------------------------------------------------------------------------
  snippet>                 'snippet>'
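
As a rough check of the pattern explained above, the following Python sketch runs it against a trimmed, assumed stand-in for the sample document (LF line endings; the `SAMPLE` string and variable names are illustrative, not part of the original answer). The negative lookahead `(?!<name>)` stops the lazy group from crossing into another `<name>` tag, so a match that starts at `<name>title</name>` can no longer reach the `<snippet>` line further down.

```python
import re

# Trimmed stand-in for the KML text in the question (assumed, LF line endings).
SAMPLE = (
    "<Document>\n"
    "<name>title</name>\n"
    "<visibility>1</visibility>\n"
    '<Placemark id="kml_2">\n'
    "<name>kml_2</name>\n"
    "<snippet> </snippet>\n"
    '<Placemark id="kml_4">\n'
    "<name>kml_4</name>\n"
    "<snippet> </snippet>\n"
)

# The regex from the answer: each consumed character must not start "<name>".
fixed = re.compile(
    r"<name>(?:(?!<name>)[\s\S])*?<\/name>[\r\n]<snippet> <\/snippet>"
)

fixed_matches = fixed.findall(SAMPLE)

# Only the <name> elements directly followed by a <snippet> line remain.
for m in fixed_matches:
    print(repr(m))
```

Note that `[\r\n]` matches exactly one character, so input with CRLF line endings would likely need `\r?\n` in its place. For anything beyond a quick extraction, an XML parser is usually a more robust choice than a regex.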
