How can I join every block of 3 lines together while ignoring shorter runs of consecutive lines?

Problem description

I have a text file like the one below, made up of blocks of text. Each block is either a multiple of 3 lines long or a single line:

AAAAAAAAAAAAA
BBBBBBBBBBBBB
CCCCCCCCCCCCC
DDDDDDDDDDDDD
EEEEEEEEEEEEE
FFFFFFFFFFFFF

GGGGGGGGGGGGG

HHHHHHHHHHHHH
IIIIIIIIIIIII
JJJJJJJJJJJJJ

KKKKKKKKKKKKK

LLLLLLLLLLLLL
MMMMMMMMMMMMM
NNNNNNNNNNNNN
OOOOOOOOOOOOO
PPPPPPPPPPPPP
QQQQQQQQQQQQQ
RRRRRRRRRRRRR
SSSSSSSSSSSSS
TTTTTTTTTTTTT

UUUUUUUUUUUUU

VVVVVVVVVVVVV
WWWWWWWWWWWWW
XXXXXXXXXXXXX
YYYYYYYYYYYYY
ZZZZZZZZZZZZZ
1111111111111

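I want to join each group of 3 consecutive lines together, starting from the first line of each block, and leave alone any run of fewer than 3 consecutive lines. The characters and the line lengths always vary. (I made the lines the same length in the example only so that it looks less ugly.)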

So the output would be:

AAAAAAAAAAAAA BBBBBBBBBBBBB CCCCCCCCCCCCC
DDDDDDDDDDDDD EEEEEEEEEEEEE FFFFFFFFFFFFF

GGGGGGGGGGGGG

HHHHHHHHHHHHH IIIIIIIIIIIII JJJJJJJJJJJJJ

KKKKKKKKKKKKK

LLLLLLLLLLLLL MMMMMMMMMMMMM NNNNNNNNNNNNN
OOOOOOOOOOOOO PPPPPPPPPPPPP QQQQQQQQQQQQQ
RRRRRRRRRRRRR SSSSSSSSSSSSS TTTTTTTTTTTTT

UUUUUUUUUUUUU

VVVVVVVVVVVVV WWWWWWWWWWWWW XXXXXXXXXXXXX
YYYYYYYYYYYYY ZZZZZZZZZZZZZ 1111111111111

I tried using

xargs -n3

but I'm not sure how to make it ignore the single lines.

How can I do this?

Tags: awk

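Solution

Using GNU awk's gensub(). With RS set to the empty string, awk reads each blank-line-separated block as one record; $1=$1 rebuilds the record with single spaces between its fields, and gensub() then replaces the space after every third field with a newline. This relies on the lines themselves containing no whitespace, because any space inside a line would start a new field.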

$ awk -v RS= -v ORS='\n\n' '{$1=$1; print gensub(/(([^ ]+ ){2}[^ ]+) /,"\\1\n","g")}' file
AAAAAAAAAAAAA BBBBBBBBBBBBB CCCCCCCCCCCCC
DDDDDDDDDDDDD EEEEEEEEEEEEE FFFFFFFFFFFFF

GGGGGGGGGGGGG

HHHHHHHHHHHHH IIIIIIIIIIIII JJJJJJJJJJJJJ

KKKKKKKKKKKKK

LLLLLLLLLLLLL MMMMMMMMMMMMM NNNNNNNNNNNNN
OOOOOOOOOOOOO PPPPPPPPPPPPP QQQQQQQQQQQQQ
RRRRRRRRRRRRR SSSSSSSSSSSSS TTTTTTTTTTTTT

UUUUUUUUUUUUU

VVVVVVVVVVVVV WWWWWWWWWWWWW XXXXXXXXXXXXX
YYYYYYYYYYYYY ZZZZZZZZZZZZZ 1111111111111
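
For an awk without gensub(), the same idea can be sketched with a plain loop. This is a minimal variant, not the original answer's method: it additionally sets FS to a newline so that each input line is one field, which should also keep lines containing spaces intact. The function name join3 is a hypothetical helper chosen for illustration, and the sketch assumes, as the question states, that every block is either a single line or a multiple of 3 lines.

```shell
# join3 is a hypothetical helper name, not part of the original answer.
# Paragraph mode (RS = "") makes each blank-line-separated block one record;
# FS = "\n" makes every line of the block one field.
join3() {
  awk 'BEGIN { RS = ""; FS = "\n"; ORS = "\n\n" }
  {
    out = ""
    for (i = 1; i <= NF; i++) {
      # After every third line emit a newline, otherwise a space;
      # nothing is appended after the last line of the block.
      sep = (i == NF) ? "" : (i % 3 == 0 ? "\n" : " ")
      out = out $i sep
    }
    print out
  }' "$@"
}
```

It would be called as join3 file, or fed from a pipe. Like the gensub() version, it prints a blank line after every block, including the last one.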
