How to use sed/awk to extract text between two patterns when a specific string must be present in the text block

Problem description

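I have found several answers on how to use sed/awk to extract text between two patterns, but I also need to find only the specific text blocks that contain a given string!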

Example text:

<requirement        id = "blabla.1"
                slogan = "Handling of blabla"
          work-package = "bla444.2"
          logical-node = "BLA-C"
                 level = "System"
>
Bla bla.
</requirement>
<requirement        id = "bla.2"
                slogan = "Reporting of blabla"
          work-package = "bla444.1"
          logical-node = "BLA-C"
                 level = "System"
>
Bla bla bla.
</requirement>

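So the goal is to get only the text blocks between <requirement and </requirement> that have bla444.1 in work-package! In the example, this should give me only the last text block. Of course, the file I want to run sed on has more requirements, and several of them have the wanted work-package, so it is not just the last text block that sed should find.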

sed -e 's/<requirement\(.*\)<\/requirement/\1/' file

The sed line above will give all text blocks (requirements).
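
Note that sed reads its input one line at a time, so a substitution like the one above only ever sees a single line of a block. As a minimal sketch of printing every requirement block with a sed address range instead, assuming each opening and closing tag starts at the beginning of a line as in the sample (the function name print_all_blocks is only illustrative):

```shell
# Hypothetical helper: print every <requirement>...</requirement> block of file $1.
# Assumes both tags start at the beginning of a line, as in the sample text.
print_all_blocks() {
  # -n suppresses the default output; p prints only the lines inside the range.
  sed -n '/^<requirement/,/^<\/requirement>/p' "$1"
}
```

This still returns all blocks; it does not filter on work-package, which is the part the question asks about.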

One thing to note: the text blocks do not have a fixed number of lines, but they all have a work-package!


Solution


Could you please try the following.

awk '
/^<requirement/{
  if(found && value){
    print value
  }
  found=value=""
}
{
  value=(value?value ORS:"")$0
}
/work-package.*bla444\.1\"$/{
  found=1
}
END{
  if(found && value){
    print value
  }
}
'  Input_file

Explanation: a detailed, commented version of the code above.

awk '                           ##Start of the awk program.
/^<requirement/{                ##If the line starts with <requirement, a new block begins.
  if(found && value){           ##If the previous block matched (found is set) and value is not empty, then:
    print value                 ##print the previous block collected in value.
  }
  found=value=""                ##Reset found and value for the new block.
}
{
  value=(value?value ORS:"")$0  ##Append the current line to value, separated by ORS (a newline by default).
}
/work-package.*bla444\.1\"$/{   ##If the line has work-package and ends with bla444.1 plus a double quote, then:
  found=1                       ##set found to 1 as a flag that this block is wanted.
}
END{                            ##END block, run after the last input line.
  if(found && value){           ##If the last block matched and value is not empty, then:
    print value                 ##print the last block.
  }
}
'  Input_file                   ##Name of the input file.
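
If the wanted work-package changes from run to run, the same idea can be wrapped in a small shell function that takes the value as a parameter. The following is only a sketch under that assumption: the function name extract_blocks and the variable names wp and block are illustrative and not part of the original answer. It compares the value as a literal string with index(), so the dot in bla444.1 is not treated as a regex wildcard.

```shell
# Hypothetical helper: print every <requirement> block whose work-package
# value equals the literal string $1, reading from the file $2.
extract_blocks() {
  awk -v wp="$1" '
    /^<requirement/ {                 # a new block starts: flush the previous one
      if (found) print block
      found = 0; block = ""
    }
    { block = (block == "" ? $0 : block ORS $0) }   # collect every line of the block
    /work-package/ {
      # literal match on the quoted value, so "." in wp is not a wildcard
      if (index($0, "\"" wp "\"")) found = 1
    }
    END { if (found) print block }    # flush the last block
  ' "$2"
}
```

It could then be called as: extract_blocks "bla444.1" Input_file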
