linux - Keep first duplicate and replace the rest with blank cell using Awk
Problem description
I have a TSV file with two columns, and the second column contains duplicate values. I would like to keep the first occurrence of each value and replace the remaining duplicates with blanks. For example:
Original TSV:
ahah.asd aha
ahsjd.asd aha
asdd.asda aha
ajd.asd aha
asdfk.lo abb
hasd.pou abb
hasd.asd jjj
asidh.09 kkk
asdhs.97 kkk
Expected output:
ahah.asd aha
ahsjd.asd
asdd.asda
ajd.asd
asdfk.lo abb
hasd.pou
hasd.asd jjj
asidh.09 kkk
asdhs.97
In addition, I would like to add a third column containing a counter that increments on every row and restarts at 1 whenever a new value appears in column 2. For example:
ahah.asd aha 1
ahsjd.asd 2
asdd.asda 3
ajd.asd 4
asdfk.lo abb 1
hasd.pou 2
hasd.asd jjj 1
asidh.09 kkk 1
asdhs.97 2
Is this possible? I would like to use awk...
Thanks
Solution
$ awk 'BEGIN{FS=OFS="\t"} {print $1, (cnt[$2]++ ? "" : $2), cnt[$2]}' file
ahah.asd aha 1
ahsjd.asd 2
asdd.asda 3
ajd.asd 4
asdfk.lo abb 1
hasd.pou 2
hasd.asd jjj 1
asidh.09 kkk 1
asdhs.97 2
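To make the one-liner easier to follow, here is a sketch of the same logic spread over several lines with comments. The file names sample.tsv and result.tsv and the variable name seen are made up for illustration; the sample data is the input from the question. The counter is keyed on the column 2 value, so this assumes duplicates are adjacent, as in the question. If a value reappeared later in the file, its count would continue rather than restart at 1.

```shell
# Recreate the question's sample input as a real tab-separated file
# (sample.tsv is a hypothetical name; printf reuses the format per pair).
printf '%s\t%s\n' \
  ahah.asd aha \
  ahsjd.asd aha \
  asdd.asda aha \
  ajd.asd aha \
  asdfk.lo abb \
  hasd.pou abb \
  hasd.asd jjj \
  asidh.09 kkk \
  asdhs.97 kkk > sample.tsv

awk '
BEGIN { FS = OFS = "\t" }      # read and write tab-separated fields
{
    seen = cnt[$2]++           # how often this value occurred before this row
    # Print column 1, then column 2 only on its first occurrence,
    # then the running count for this value (already incremented).
    print $1, (seen ? "" : $2), cnt[$2]
}' sample.tsv > result.tsv

cat result.tsv
```

Because OFS is a tab, the blanked rows still contain three fields, with an empty middle field, so the counter stays in the third column when the file is opened in a spreadsheet.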