How to remove special characters and multiple spaces from a string

Question

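I want to remove all special characters (including spaces) from the beginning and end of a string, and replace each run of consecutive spaces with a single space. For example,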

"      !:;:§"   this string is normal.   "§$"§"$"§$    $"$§"     "

should become:

"this string is normal"

I also want to allow ! and ? at the end of the string.

"      !:;:§"   this  string is normal?   "§$"§"$"§$    $"$§"      "
"      !:;:§"   this string    is very normal!   "§$"§"$"§$    $"$§"      "
"      !:;:§"   this string is     very normal!?   "§$"§"$"§$    $"$§"      "

should become:

"this string is normal?"
"this string is very normal!"
"this string is very normal!?"

All of this is to get nice-looking titles in an application.

Can anyone help me? Or does anyone know a good regular expression for this?

Tags: ruby

Solution


R = /
    (?:           # begin a non-capture group
      \p{Alnum}+  # match one or more alphanumeric characters
      [ ]+        # match one or more spaces
    )*            # end non-capture group and execute zero or more times
    \p{Alnum}+    # match one or more alphanumeric characters
    [!?]*         # match zero or more characters '!' and '?'
    /x            # free-spacing regex definition mode

def extract(str)
  str[R].squeeze(' ')
end

arr = [
  '      !:;:§"   this  string is normal?   "§$"§"$"§$    $"$§"      ',
  '      !:;:§"   this string    is very normal!   "§$"§"$"§$    $"$§"      ',
  '      !:;:§"   this string is     very normal!?   "§$"§"$"§$    $"$§"      ',
  '      !:;:§"   cette  chaîne  est normale?   "§$"§"$"§$    $"$§"    '
]
arr.each { |s| puts extract(s) }

prints

this string is normal?
this string is very normal!
this string is very normal!?
cette chaîne est normale?
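
One edge case worth noting: `str[R]` returns nil when the string contains no alphanumeric characters at all, so calling `squeeze` on it would raise. The following is a minimal sketch of a nil-tolerant variant, assuming Ruby 2.3 or later for the `&.` operator. The names `TITLE_RE` and `extract_or_nil` are hypothetical and only used for illustration.

```ruby
# Same pattern as R above, written on one line (hypothetical constant name).
TITLE_RE = /(?:\p{Alnum}+ +)*\p{Alnum}+[!?]*/

# Hypothetical helper: returns the cleaned title, or nil if nothing matches.
def extract_or_nil(str)
  # &. skips the call to squeeze when the regex finds no match (nil).
  str[TITLE_RE]&.squeeze(' ')
end

clean   = extract_or_nil('  !:;  this  string is normal?  "§$ ')
missing = extract_or_nil('  !:;:§  ')
```

Here `clean` holds the cleaned title and `missing` is nil, which the caller can replace with a default title if needed.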

See the documentation for \p{Alnum} in Regexp (search for "\p{} construct").
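
To illustrate why `\p{Alnum}` is used rather than an ASCII-only character class, here is a small sketch comparing the two on the French word from the example above. It assumes the source file and string are UTF-8 encoded, which is Ruby's default.

```ruby
word = 'chaîne'

# \p{Alnum} is Unicode-aware, so the accented letter is part of the match.
unicode_match = word[/\p{Alnum}+/]

# An ASCII-only class stops at the first non-ASCII letter.
ascii_match = word[/[a-zA-Z0-9]+/]
```

With this input, `unicode_match` is the whole word, while `ascii_match` is only "cha".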

I wrote the regular expression in free-spacing mode in order to document each step. It would conventionally be written as follows.

/(?:\p{Alnum}+ +)*\p{Alnum}+[!?]*/

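Note that in free-spacing mode I placed the space inside a character class. Had I not done so, the space would have been removed before the regular expression was evaluated.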
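
The effect of free-spacing mode on a bare space can be checked directly. This is a small sketch, assuming Ruby 2.4 or later for `match?`; the constant and variable names are made up for the example.

```ruby
# In /x mode the bare space is ignored, so this behaves like /ab/.
BARE_SPACE  = /a b/x
# A space inside a character class survives /x mode and matches literally.
CLASS_SPACE = /a[ ]b/x

bare_on_spaced  = 'a b'.match?(BARE_SPACE)   # bare space was dropped
bare_on_joined  = 'ab'.match?(BARE_SPACE)    # pattern is effectively /ab/
class_on_spaced = 'a b'.match?(CLASS_SPACE)  # literal space is required
class_on_joined = 'ab'.match?(CLASS_SPACE)
```

Only `bare_on_joined` and `class_on_spaced` are true, which is why the space in R is written as `[ ]`.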

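If non-alphanumeric characters other than spaces are permitted in the interior of the string, change the method and the regular expression to the following.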

def extract(str)
  str.gsub(R,'')
end

R = /
    \A              # match the beginning of the string
    [^\p{Alnum}]+   # match one or more non-alphanumeric characters
    |               # or
    [^\p{Alnum}!?]  # match a character other than an alphanumeric, '!' or '?'
    [^\p{Alnum}]+   # match one or more non-alphanumeric characters
    \z              # match the end of the string
    |               # or
    [ ]             # match a space...
    (?=[ ])         # ...followed by a space
    /x              # free-spacing regex definition mode

extract '  !:;:§"   this  string $$ is abnormal?   "§$"  $"$§"  '

returns

"this string $$ is abnormal?"

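To see why the second approach is needed when the interior may contain special characters, the sketch below runs both patterns on the same input. The constant names `STRICT_RE` and `TRIM_RE` are hypothetical. They are the two regular expressions from above written without free-spacing mode, so the spaces in `TRIM_RE` are literal.

```ruby
# First pattern: words separated by spaces, optional trailing ! or ?.
STRICT_RE = /(?:\p{Alnum}+ +)*\p{Alnum}+[!?]*/
# Second pattern: leading junk, trailing junk, or a space followed by a space.
TRIM_RE   = /\A[^\p{Alnum}]+|[^\p{Alnum}!?][^\p{Alnum}]+\z| (?= )/

input = '  !:;  this  string $$ is abnormal?   "§$"  '

# The match stops at the first interior special character.
strict  = input[STRICT_RE].squeeze(' ')
# Deleting only the unwanted parts keeps the interior "$$".
trimmed = input.gsub(TRIM_RE, '')
```

With this input, `strict` is cut short to "this string", whereas `trimmed` is "this string $$ is abnormal?".
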