MySQL regular expression on a column containing HTML

Problem description

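I have a table with a column of HTML content. I need to select every row that contains a span element with the CSS classes "glyphicon glyphicon-film" and some text content, while excluding rows where that span is empty or contains only whitespace.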

These rows should not match:

<span class="glyphicon glyphicon-film">&nbsp;</span>
<span class="glyphicon glyphicon-film"></span>
<span class="glyphicon glyphicon-film"> </span>
<span class="hypo-link-target" data-link-target-id="1379">Hammartå</span>

These rows should match:

<span class="glyphicon glyphicon-film">nån text</span>
<span class="glyphicon glyphicon-film">Nån Text</span>
<span class="glyphicon glyphicon-film">NÅN TEXT</span>
<span class="glyphicon glyphicon-film">&nbsp;nån text</span>
<span class="glyphicon glyphicon-film">&nbsp;Nån Text</span>
<span class="glyphicon glyphicon-film">&nbsp; nån text</span>
<span class="glyphicon glyphicon-film">&nbsp; Nån Text</span>
<span class="glyphicon glyphicon-film"> nån text</span>
<span class="glyphicon glyphicon-film"> Nån Text</span>

My current query:

select * from Section 
where (Text regexp '(span class="glyphicon glyphicon-film">&nbsp;.+</span>)' 
or Text regexp 'span class="glyphicon glyphicon-film">[a-zA-Z][|å|ä|ö|Å|Ä|Ö]+</span>');

Which regular expression should I use to get the desired result?

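Update: Following the suggestions in the answer, I ended up with this query, which I believe works for all cases. It also matches rows that contain two or more span elements where some are valid and others are not, for example: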

<span class="glyphicon glyphicon-film">&nbsp;Nån Text</span>more content here... and a correct span <span class="glyphicon glyphicon-film">&nbsp;</span>

select * from Section 
where Text regexp 'span class="glyphicon glyphicon-film">&nbsp;.+</span>' 
        or (Text regexp 'span class="glyphicon glyphicon-film">[a-zA-ZåäöÅÄÖ ]+</span>'
and Text not regexp 'span class="glyphicon glyphicon-film"> </span>');

Tags: mysql, sql, regex

Solution


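The pattern [a-zA-Z][|å|ä|ö|Å|Ä|Ö]+ matches exactly one ASCII letter followed by one or more characters from the set |åäöÅÄÖ. What you want is a string made of any combination of those letters, plus the space character, which you left out. That would be [a-zA-ZåäöÅÄÖ ]+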
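To see the difference between the two character classes, here is a minimal sketch that uses Python's re module as a stand-in for MySQL's REGEXP. That substitution is an assumption made for illustration: MySQL's engine can differ, for instance in default case sensitivity and, in older versions, in how it handles multi-byte characters. The variable names are hypothetical.

```python
import re

PREFIX = 'span class="glyphicon glyphicon-film">'

# Pattern from the question: one ASCII letter, then one or more of | å ä ö Å Ä Ö.
original = re.compile(PREFIX + r'[a-zA-Z][|å|ä|ö|Å|Ä|Ö]+</span>')
# Corrected pattern: one or more letters or spaces, in any order.
corrected = re.compile(PREFIX + r'[a-zA-ZåäöÅÄÖ ]+</span>')

row = '<span class="glyphicon glyphicon-film">nån text</span>'
odd_row = '<span class="glyphicon glyphicon-film">a|å</span>'

original_hit = original.search(row) is not None    # fails after "nå": "n" is not in the second set
corrected_hit = corrected.search(row) is not None  # every character of "nån text" is in the set
original_odd_hit = original.search(odd_row) is not None  # "a" then "|å" fits the original pattern

print(original_hit, corrected_hit, original_odd_hit)  # False True True
```

The original pattern only accepts contents such as "a|å", which is why none of the desired rows were returned by its second condition.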

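Inside [], | is not an alternation operator. A bracket expression already matches any one of the characters it lists, so a | placed there is just a literal pipe character.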
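The literal-pipe behaviour can be checked with a short sketch. It again assumes Python's re module as a stand-in for MySQL's REGEXP; bracket expressions treat | the same way in both, but the test strings and names are made up for illustration.

```python
import re

# Inside a bracket expression, | is just another member of the set.
with_pipes = re.compile(r'^[|å|ä|ö]+$')
without_pipes = re.compile(r'^[åäö]+$')

pipe_accepted = bool(with_pipes.match('å|ö'))     # True: the literal pipe is in the set
pipe_rejected = bool(without_pipes.match('å|ö'))  # False: the pipe is not in the set
letters_only = bool(without_pipes.match('åö'))    # True: only listed letters

print(pipe_accepted, pipe_rejected, letters_only)  # True False True
```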

select * from Section
where (Text regexp 'span class="glyphicon glyphicon-film">&nbsp;.+</span>'
       or Text regexp 'span class="glyphicon glyphicon-film">[a-zA-ZåäöÅÄÖ ]+</span>')
  and Text not regexp 'span class="glyphicon glyphicon-film"> +</span>';

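The last condition is needed to keep <span class="glyphicon glyphicon-film"> </span> from matching: because the character class now includes the space, the second pattern on its own would accept a span that contains only spaces.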
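As a rough check of the whole WHERE clause, the sketch below runs the three patterns against the sample rows from the question. It assumes Python's re module behaves like MySQL's REGEXP for these patterns, which may not hold in every MySQL version or collation; row_matches and span are hypothetical helper names.

```python
import re

PREFIX = 'span class="glyphicon glyphicon-film">'
NBSP_THEN_TEXT = re.compile(PREFIX + r'&nbsp;.+</span>')
LETTERS_OR_SPACES = re.compile(PREFIX + r'[a-zA-ZåäöÅÄÖ ]+</span>')
ONLY_SPACES = re.compile(PREFIX + r' +</span>')


def row_matches(text: str) -> bool:
    """Mirror the WHERE clause: (A OR B) AND NOT C."""
    has_text = bool(NBSP_THEN_TEXT.search(text) or LETTERS_OR_SPACES.search(text))
    return has_text and not ONLY_SPACES.search(text)


def span(inner: str) -> str:
    """Build a sample row with the given span content."""
    return f'<span class="glyphicon glyphicon-film">{inner}</span>'


should_not_match = [
    span('&nbsp;'),
    span(''),
    span(' '),
    '<span class="hypo-link-target" data-link-target-id="1379">Hammartå</span>',
]
should_match = [span(s) for s in (
    'nån text', 'Nån Text', 'NÅN TEXT',
    '&nbsp;nån text', '&nbsp;Nån Text',
    '&nbsp; nån text', '&nbsp; Nån Text',
    ' nån text', ' Nån Text',
)]

for row in should_match + should_not_match:
    print(row_matches(row), row)
```

Note that the NOT condition applies to the whole row, so a row holding one span with text and another span with only spaces is rejected by this query.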


