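Add a new column with the number of times the same values are found in 2 columns

Problem description

Add a new column whose value is the number of times the exact same pair of values from column 1 and column 2 occurs in the file.

Input file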

46849,39785,2,012,023,351912.29,2527104.70,174.31
46849,39785,2,012,028,351912.45,2527118.70,174.30
46849,39785,3,06,018,351912.12,2527119.51,174.33
46849,39785,3,06,020,351911.80,2527105.83,174.40
46849,39797,2,012,023,352062.45,2527118.50,173.99
46849,39797,2,012,028,352062.51,2527105.51,174.04
46849,39797,3,06,020,352063.29,2527116.71,174.13,
46849,39809,2,012,023,352211.63,2527104.81,173.74
46849,39809,2,012,028,352211.21,2527117.94,173.69
46849,39803,2,012,023,352211.63,2527104.81,173.74
46849,39803,2,012,028,352211.21,2527117.94,173.69
46849,39801,2,012,023,352211.63,2527104.81,173.74

Expected output file:

4,46849,39785,2,012,023,351912.29,2527104.70,174.31
4,46849,39785,2,012,028,351912.45,2527118.70,174.30
4,46849,39785,3,06,018,351912.12,2527119.51,174.33
4,46849,39785,3,06,020,351911.80,2527105.83,174.40
3,46849,39797,2,012,023,352062.45,2527118.50,173.99
3,46849,39797,2,012,028,352062.51,2527105.51,174.04
3,46849,39797,3,06,020,352063.29,2527116.71,174.13,
2,46849,39809,2,012,023,352211.63,2527104.81,173.74
2,46849,39809,2,012,028,352211.21,2527117.94,173.69
2,46849,39803,2,012,023,352211.63,2527104.81,173.74
2,46849,39803,2,012,028,352211.21,2527117.94,173.69
1,46849,39801,2,012,023,352211.63,2527104.81,173.74

Attempt:

awk -F, '{x[$1 $2]++}END{ for(i in x) {print i,x[i]}}' file

4684939785 4
4684939797 3
4684939801 1
4684939803 2
4684939809 2

Tags: awk

Solution

Could you please try the following.

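awk '
BEGIN{
  FS=OFS=","          # comma-separated input and output
}
FNR==NR{              # first pass over the file: count only
  a[$1,$2]++          # occurrences of each (field 1, field 2) pair
  next
}
{                     # second pass: print the count, then the original line
  print a[$1,$2],$0
}
' Input_file Input_file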

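Explanation: Input_file is read twice. On the first pass, an array named a is built, indexed by the first and second fields, and its value is incremented each time that pair occurs. On the second pass, the total count for the line's first two fields is printed, followed by the line itself.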
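The attempt in the question builds its key as `$1 $2` (plain concatenation), while the answer uses `a[$1,$2]`, which joins the fields with awk's `SUBSEP`. With the sample data both happen to give the same counts, but concatenated keys can collide in general. The sketch below illustrates the difference; the function name `key_styles` and the two-line input are made up for illustration.

```shell
# Hypothetical helper: counts distinct keys under both key styles.
key_styles() {
  awk -F, '
  {
    joined[$1 $2]++    # plain concatenation: "1" "23" and "12" "3" both give 123
    paired[$1, $2]++   # SUBSEP-joined key: the two pairs stay distinct
  }
  END {
    nj = 0; for (k in joined) nj++
    np = 0; for (k in paired) np++
    print "concatenated keys:", nj
    print "comma-subscript keys:", np
  }'
}

# Two lines whose first two fields differ but concatenate to the same string.
printf '1,23,x\n12,3,y\n' | key_styles
```

On this input the concatenated style sees one key and the comma-subscript style sees two.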

One-liner form:

awk 'BEGIN{FS=OFS=","} FNR==NR{a[$1,$2]++;next} {print a[$1,$2],$0}' Input_file Input_file
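Reading the file twice requires a regular file that can be reopened, so it does not work on piped input. As a possible alternative (not part of the answer above), here is a minimal single-pass sketch that buffers every line in memory and prints in the `END` block. It assumes the input fits in memory; the function name `add_pair_count` is hypothetical.

```shell
# Hypothetical helper: prefixes each line with the count of its (field 1, field 2) pair.
# Single pass: lines are buffered in memory, so it also works on piped input.
add_pair_count() {
  awk '
  BEGIN { FS = OFS = "," }
  {
    count[$1, $2]++           # occurrences of this (field 1, field 2) pair
    line[NR] = $0             # keep the line for printing in END
    key[NR]  = $1 SUBSEP $2   # same key string that count[$1, $2] uses
  }
  END {
    for (i = 1; i <= NR; i++)
      print count[key[i]], line[i]
  }
  ' "$@"
}
```

Usage would be either `add_pair_count Input_file` or `some_command | add_pair_count`. The trade-off is memory: the two-pass version keeps only the counts, while this one keeps every line.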
