Fill in missing dates using awk

Problem description

Some dates are missing from my file. For example:

$ cat ifile.txt

20060805
20060807
20060808
20060809
20060810
20060813
20060815
20060829
20060901
20060903
20060904
20060905
20070712
20070713
20070716
20070717

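The dates are in YYYYMMDD format. I want to fill in the missing dates between two consecutive entries whenever the gap between them is at most 5 days, for example: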

20060805
20060806   ---- This was missed
20060807
20060808
20060809
20060810
20060811  ----- This was missed
20060812  ----- This was missed
20060813
20060814  ----- This was missed
20060815  
20060829
20060830 ------ This was missed
20060831 ------ This was missed
20060901  
20060902 ------ This was missed
20060903
20060904
20060905
20070712
20070713
20070714 ----- This was missed
20070715 ----- This was missed
20070716
20070717

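If the gap is more than 5 days, I don't need the dates in between. For example, I don't need to fill in the dates between 20060815 and 20060829, because the gap between them is more than 5 days.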

I am trying it the following way, but I get no output.

#!/bin/sh
awk BEGIN'{
          a[NR]=$1
          } {
          for(i=1; i<NR; i++)
          if ((a[NR+1]-a[NR]) <= 5)
             for (j=1; j<(a[NR+1]-a[NR]); j++)
             print a[j]
          }' ifile.txt

Desired output:

20060805
20060806 
20060807
20060808
20060809
20060810
20060811 
20060812 
20060813
20060814 
20060815  
20060829
20060830 
20060831 
20060901  
20060902 
20060903
20060904
20060905
20070712
20070713
20070714 
20070715 
20070716
20070717

Tags: bash, awk, difference, date-difference

Solution


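Could you please try the following, written and tested with the shown samples in GNU awk (mktime and strftime are GNU awk extensions):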

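awk '
# Convert a YYYYMMDD string to epoch seconds (GNU awk mktime).
# Local noon is used instead of midnight so a DST shift cannot move the date.
function to_epoch(d) {
  return mktime(substr(d,1,4) " " substr(d,5,2) " " substr(d,7,2) " 12 00 00")
}
FNR==1{
  print
  prev=to_epoch($0)
  next
}
{
  curr=to_epoch($0)
  # Whole days between the previous and the current date, rounded.
  diff=int((curr-prev)/86400 + 0.5)
  # Fill only gaps of at most 5 days, as the question asks
  # (assumption: "gap" means the difference between the two dates).
  if(diff>1 && diff<=5){
    for(i=1;i<diff;i++){ print strftime("%Y%m%d", prev+86400*i) }
  }
  print
  prev=curr
}
'  Input_file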
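
If GNU awk is not available, the same idea can be sketched in POSIX awk by stepping through the calendar one day at a time instead of calling mktime. This is only an illustrative sketch, not part of the original answer: the names fill_dates, mdays, next_day, and max are made up for this example, and it assumes the input is sorted, holds one valid YYYYMMDD date per line, and that a "gap" means the difference in days between two consecutive entries.

```shell
#!/bin/sh
# fill_dates FILE (hypothetical helper): print FILE with gaps of at most
# 5 days between consecutive YYYYMMDD lines filled in.
fill_dates() {
  awk -v max=5 '
  # Days in month m of year y (Gregorian leap-year rule).
  function mdays(y, m) {
    if (m == 2) return (y % 4 == 0 && (y % 100 != 0 || y % 400 == 0)) ? 29 : 28
    return (m == 4 || m == 6 || m == 9 || m == 11) ? 30 : 31
  }
  # Return the day after d, where d and the result are YYYYMMDD numbers.
  function next_day(d,    y, m, dd) {
    y = int(d / 10000); m = int(d / 100) % 100; dd = d % 100
    if (++dd > mdays(y, m)) { dd = 1; if (++m > 12) { m = 1; y++ } }
    return y * 10000 + m * 100 + dd
  }
  NF == 0 { next }                      # skip blank lines
  seen {
    # Walk forward from the previous date, at most max steps.
    n = 0; d = prev
    while (n < max && d != $1 + 0) { d = next_day(d); step[++n] = d }
    # Current date reached within max days: print the days in between.
    if (d == $1 + 0)
      for (i = 1; i < n; i++) print step[i]
  }
  { print $1; prev = $1 + 0; seen = 1 }
  ' "$1"
}
```

With the sample from the question it would be called as fill_dates ifile.txt. Walking the calendar day by day avoids time zones and DST entirely, at the cost of a small hand-written month-length table.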
