Print lines in which any word starts and ends with the same letter in Linux

Problem description

I have the following input:

sie%Qu7s Kuux"oh9 ohc9ahG% hoe8Toh: Eix*ohd1 doh:bo2U Cu0doo|t zo`L9xaW
fie5Du[h Phe8aid# Opu&fai5 ieZ<aek6 hu4ga&Di Oose}p1p aiD@oos2 nu-a1Fub
ahqu5To/ ahtie[H3 ioK&u5Ai nei1Za#d poo_Th9r gu|aGh7h uZ%io2ah IeNah&v7
eif\e8AE Ieb,ing4 reph1oW* eeSh'ee8 Ah+ei4ai Oi0Ca,vu Esh1xe?e Wei&k4ic
ue5OhQu. aaf-i8uP eedae%T5 sei?M9Pu ieH[oh2l ieh~ah8A aev"oo9A Ohf"i8de
Foh:x2zi aLoo'qu2 Ia6aig-e La{vie1E IeFoh{cI Au_h7Hee Se)f4ebi Cah$yu7m

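Each word in a column is a password. I am trying to print every line in which any word starts and ends with the same letter, without distinguishing between uppercase and lowercase letters.

I know I can get part of the way there with grep: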

cat passwords.txt | grep -e ' \([A-Z]\)......\1 ' -e ' \([a-z]\)......\1 '

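But with this command a word only matches if it starts and ends with the same letter in the same case (both uppercase or both lowercase), for example: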

Foh:x2zi aLoo'qu2 Ia6aig-e La{vie1E IeFoh{cI Au_h7Hee Se)f4ebi Cah$yu7m

Expected output

    eif\e8AE Ieb,ing4 reph1oW* eeSh'ee8 Ah+ei4ai Oi0Ca,vu Esh1xe?e Wei&k4ic
    sie%Qu7s Kuux"oh9 ohc9ahG% hoe8Toh: Eix*ohd1 doh:bo2U Cu0doo|t zo`L9xaW
    ue5OhQu. aaf-i8uP eedae%T5 sei?M9Pu ieH[oh2l ieh~ah8A aev"oo9A Ohf"i8de
    Foh:x2zi aLoo'qu2 Ia6aig-e La{vie1E IeFoh{cI Au_h7Hee Se)f4ebi Cah$yu7m
    ahqu5To/ ahtie[H3 ioK&u5Ai nei1Za#d poo_Th9r gu|aGh7h uZ%io2ah IeNah&v7

Tags: regex, grep

Solution


With GNU grep:

grep -i -P '(?<!\S)(\S)(?:\S*\1)?(?!\S)' passwords.txt

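The -i option makes matching case-insensitive (this also applies to the \1 backreference), and -P enables PCRE syntax, which is needed for the lookbehind and lookahead. Note that \S matches any non-whitespace character, so a word that starts and ends with the same digit or symbol also matches.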
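To see which words trigger a match, the same pattern can be run with -o, which prints only the matched text. The following is a minimal sketch, assuming a GNU grep built with PCRE support; the temporary file and the variable names (`sample`, `pattern`, `matched_lines`, `matched_words`) are only for illustration, and the three sample lines are taken from the input in the question.

```shell
# Temporary sample file with three lines from the question's input.
# The quoted 'EOF' delimiter keeps $, ` and \ literal.
sample=$(mktemp)
cat > "$sample" <<'EOF'
sie%Qu7s Kuux"oh9 ohc9ahG% hoe8Toh: Eix*ohd1 doh:bo2U Cu0doo|t zo`L9xaW
fie5Du[h Phe8aid# Opu&fai5 ieZ<aek6 hu4ga&Di Oose}p1p aiD@oos2 nu-a1Fub
eif\e8AE Ieb,ing4 reph1oW* eeSh'ee8 Ah+ei4ai Oi0Ca,vu Esh1xe?e Wei&k4ic
EOF

pattern='(?<!\S)(\S)(?:\S*\1)?(?!\S)'

# Whole matching lines, as in the answer above.
matched_lines=$(grep -i -P "$pattern" "$sample")

# -o prints only the matched words, one per line.
matched_words=$(grep -o -i -P "$pattern" "$sample")

printf '%s\n' "$matched_words"
rm -f "$sample"
```

On this sample the second line has no qualifying word, so it is filtered out, while the words reported for the other two lines should be sie%Qu7s, eif\e8AE and Esh1xe?e.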


Explanation

--------------------------------------------------------------------------------
  (?<!                     look behind to see if there is not:
--------------------------------------------------------------------------------
    \S                       non-whitespace (all but \n, \r, \t, \f,
                             and " ")
--------------------------------------------------------------------------------
  )                        end of look-behind
--------------------------------------------------------------------------------
  (                        group and capture to \1:
--------------------------------------------------------------------------------
    \S                       non-whitespace (all but \n, \r, \t, \f,
                             and " ")
--------------------------------------------------------------------------------
  )                        end of \1
--------------------------------------------------------------------------------
  (?:                      group, but do not capture (optional
                           (matching the most amount possible)):
--------------------------------------------------------------------------------
    \S*                      non-whitespace (all but \n, \r, \t, \f,
                             and " ") (0 or more times (matching the
                             most amount possible))
--------------------------------------------------------------------------------
    \1                       what was matched by capture \1
--------------------------------------------------------------------------------
  )?                       end of grouping
--------------------------------------------------------------------------------
  (?!                      look ahead to see if there is not:
--------------------------------------------------------------------------------
    \S                       non-whitespace (all but \n, \r, \t, \f,
                             and " ")
--------------------------------------------------------------------------------
  )                        end of look-ahead
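
Where grep -P is not available, the same per-word check can be approximated with awk instead of a regular expression. This is a sketch of a different technique, not the answer's method: it splits each line on whitespace and compares the first and last character of every word after lowercasing them. The function name `same_ends` is hypothetical, and the code assumes single-byte (ASCII) words like those in the question.

```shell
# Print each line that contains at least one whitespace-separated word
# whose first and last characters are equal, ignoring case.
# Like \S in the pattern above, this compares any characters, not only letters.
same_ends() {
  awk '{
    for (i = 1; i <= NF; i++) {
      first = tolower(substr($i, 1, 1))
      last  = tolower(substr($i, length($i), 1))
      if (first == last) { print; next }   # one matching word is enough
    }
  }' "$@"
}
```

It can be called as `same_ends passwords.txt`, or fed from a pipe with no arguments. A one-character word counts as a match here, which mirrors the optional group in the regular expression.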
