Sed only recognizes part of the search pattern

Question

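I am looking for a way to replace every occurrence of a website in a file with a constant. I am using gsed with a regular expression on my Mac to do this (don't get sidetracked by the word Mac: I get the same output when I run this on a Windows machine). I validated the regular expression successfully on regex101.com, but for some reason the sed substitution fails.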

gsed --version : gsed (GNU sed) 4.8

(g)sed command:

find . -type f -path "./file1.txt"  -exec gsed -i -E -f /tmp/scripts/regex {} \;

Contents of /tmp/scripts/regex:

s/(ftp|http[s]?):\/\/([\w\.-]+)/\1{Your_Site}/gI

Sample file1.txt contents:

* "{\n \"firstName\": \"\",\n \"lastName\": \"\",\n \"street1\": \"\",\n \"street2\": \"\",\n \"city\": \"\",\n \"state\": \"\",\n \"postalCode\": \"\",\n \"country\": \"\",\n \"domain\": \"http://example.org\",\n \"action\": \"addUser\",\n \"token\": \"\",\n \"transId\": \"1413290890.usr.209883490\",\n \"customerId\": \"145qjk345kl_908jkl.345\",\n  \"src_name\": \"Your_Application\",\n \"channel\": \"webpage\",\n \"accountId\": \"0097892hjke6987hiuw.ACNT.hsapou8972rjk\",\n \"system\": \"Your_System\",\n \"originatingSystem_code\": \"Your_System_Id\",\n \"purchase_currency\": \"USD\",\n \"url\": \"https://another-link-to-my-example.org/add-user/new\",\n \"createFlag\": \"on\",\n \"web_version\": \"7\",\n

Expected output:

* "{\n \"firstName\": \"\",\n \"lastName\": \"\",\n \"street1\": \"\",\n \"street2\": \"\",\n \"city\": \"\",\n \"state\": \"\",\n \"postalCode\": \"\",\n \"country\": \"\",\n \"domain\": \"http://{Your_Site}\",\n \"action\": \"addUser\",\n \"token\": \"\",\n \"transId\": \"1413290890.usr.209883490\",\n \"customerId\": \"145qjk345kl_908jkl.345\",\n  \"src_name\": \"Your_Application\",\n \"channel\": \"webpage\",\n \"accountId\": \"0097892hjke6987hiuw.ACNT.hsapou8972rjk\",\n \"system\": \"Your_System\",\n \"originatingSystem_code\": \"Your_System_Id\",\n \"purchase_currency\": \"USD\",\n \"url\": \"https://{Your_Site}/add-user/new\",\n \"createFlag\": \"on\",\n \"web_version\": \"7\",\n

Current output:

* "{\n \"firstName\": \"\",\n \"lastName\": \"\",\n \"street1\": \"\",\n \"street2\": \"\",\n \"city\": \"\",\n \"state\": \"\",\n \"postalCode\": \"\",\n \"country\": \"\",\n \"domain\": \"http://{Your_Site}xample.org\",\n \"action\": \"addUser\",\n \"token\": \"\",\n \"transId\": \"1413290890.usr.209883490\",\n \"customerId\": \"145qjk345kl_908jkl.345\",\n  \"src_name\": \"Your_Application\",\n \"channel\": \"webpage\",\n \"accountId\": \"0097892hjke6987hiuw.ACNT.hsapou8972rjk\",\n \"system\": \"Your_System\",\n \"originatingSystem_code\": \"Your_System_Id\",\n \"purchase_currency\": \"USD\",\n \"url\": \"https://another-link-to-my-example.org/add-user/new\",\n \"createFlag\": \"on\",\n \"web_version\": \"7\",\n

Please ask for any other information I may have left out.

Tags: regex, sed

Solution


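Use the [:alnum:] character class (or a similar POSIX class) instead of \w inside a bracket expression. Inside brackets the backslash is a literal character, so [\w\.-] matches only a backslash, the letter w, a period, or a hyphen, not "any word character" as it does on regex101.com.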
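The difference can be checked in isolation. The snippet below is a minimal sketch, assuming GNU sed is reachable as `sed` (on macOS, run it with `SED=gsed`); the variable names and the test strings `a\wb` and `ab-c` are made up for illustration and do not come from the question.

```shell
# Assumption: GNU sed is installed as `sed`; override with SED=gsed on macOS.
SED=${SED:-sed}

# Inside brackets the backslash is literal: [\w] matches '\' or 'w' only.
bracket_out=$(printf '%s\n' 'a\wb' | "$SED" -E 's/[\w]/X/g')

# Outside brackets, \w is the GNU shorthand for a word character.
bare_out=$(printf '%s\n' 'ab-c' | "$SED" -E 's/\w/X/g')

# A POSIX class is the way to say "letters and digits" inside brackets.
class_out=$(printf '%s\n' 'ab-c' | "$SED" -E 's/[[:alnum:]]/X/g')

printf '%s\n' "$bracket_out" "$bare_out" "$class_out"
```

This should print `aXXb`, `XX-X`, and `XX-X`: the bracketed `\w` touches only the backslash and the `w`, while `[[:alnum:]]` behaves like the bare `\w` for letters and digits.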

I have highlighted the change below:

s/(ftp|http[s]?):\/\/([\w\.-]+)/\1{Your_Site}/gI
                       ^^

s/(ftp|http[s]?):\/\/([[:alnum:]\.-]+)/\1{Your_Site}/gI
                       ^^^^^^^^^
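As a quick check against the sample data, the sketch below applies a slightly adjusted form of the corrected expression to two fields taken from file1.txt. Three things here are my assumptions rather than part of the answer: `#` is used as the delimiter to avoid escaping the slashes; the backslash before the period is dropped, because inside brackets it would add a literal backslash to the set and swallow the `\` that follows each host name in this file; and `://` is repeated in the replacement, since `\1` holds only the scheme and the expected output keeps the separator. The names `fix_urls`, `sample`, and `result` are hypothetical.

```shell
# Assumption: GNU sed is installed as `sed`; override with SED=gsed on macOS.
SED=${SED:-sed}

# Replace the host part of ftp/http/https URLs read from stdin.
# \1 is the scheme only, so "://" is written back explicitly.
# The trailing I flag (case-insensitive matching) is a GNU sed extension.
fix_urls() {
  "$SED" -E 's#(ftp|https?)://([[:alnum:].-]+)#\1://{Your_Site}#gI'
}

# Two fields copied from the sample file, backslashes included.
sample='\"domain\": \"http://example.org\",\n \"url\": \"https://another-link-to-my-example.org/add-user/new\",'

result=$(printf '%s\n' "$sample" | fix_urls)
printf '%s\n' "$result"
```

With the sample line above, both host names become `{Your_Site}` while the scheme, the path `/add-user/new`, and the escaped quotes are left in place.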

Note that this RE is still too permissive and will also match invalid host names, in case that matters for your data.
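To illustrate how permissive the character-set approach is, the sketch below compares it with one possible tighter pattern that requires dot-separated labels which start and end with a letter or digit. This is only an assumption about what "valid" might mean here, not a full host name validator, and the names `label`, `host`, `loose_fix`, and `strict_fix` are hypothetical. GNU sed is again assumed to be available as `sed`.

```shell
# Assumption: GNU sed is installed as `sed`; override with SED=gsed on macOS.
SED=${SED:-sed}

# One label: starts and ends with a letter or digit, hyphens allowed inside.
label='[[:alnum:]]([[:alnum:]-]*[[:alnum:]])?'
# A host: one label followed by any number of ".label" parts.
host="${label}(\\.${label})*"

# Permissive version: any run of letters, digits, periods, and hyphens.
loose_fix() {
  "$SED" -E 's#(ftp|https?)://([[:alnum:].-]+)#\1://{Your_Site}#gI'
}

# Stricter version built from the label pattern above.
strict_fix() {
  "$SED" -E "s#(ftp|https?)://${host}#\\1://{Your_Site}#gI"
}

# A string that is clearly not a host name.
loose_out=$(printf '%s\n' 'http://..--..' | loose_fix)
strict_out=$(printf '%s\n' 'http://..--..' | strict_fix)

# A real URL from the sample file still gets rewritten.
url_out=$(printf '%s\n' 'https://another-link-to-my-example.org/add-user/new' | strict_fix)

printf '%s\n' "$loose_out" "$strict_out" "$url_out"
```

The permissive version rewrites `http://..--..` to `http://{Your_Site}`, while the stricter one leaves it untouched and still converts the sample URL to `https://{Your_Site}/add-user/new`.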

