首页 > 解决方案 > 匹配两个文件中的值并替换选定列中的值

问题描述

目的是检查 file1 中第 3 列和第 4 列的值是否与 file2 中的第 1 列匹配。如果任何值匹配,则使用 file1 第 5 列和第 6 列的信息替换 file2 中第 2 列和第 3 列的值

另外,对于匹配的行,我需要将 file1 中的第 7 列和第 8 列的值添加到第 1 列和第 2 列中的 file2 中,将字符 R 用于替换的行,将 O 用于未替换的行,

文件 1

2,100,31431,37131,999991.70,0000000.30,11111,22222,3
3,100,31431,37471,111113.20,1111111.30,22222,33333,4

文件2

3143137113 318512.50 2334387.50 100
3143137131 318737.50 2334387.50 100
3143137201 319612.50 2334387.50 100
3143137471 322987.50 2334387.50 100
3143137491 323237.50 2334387.50 100

期望的输出:

31431,37113,318512.50,2334387.50,100,O
11111,22222,999991.70,0000000.30,100,R
31431,37201,319612.50,2334387.50,100,O
22222,33333,111113.20,1111111.30,100,R
31431,37491,323237.50,2334387.50,100,O

我试过这两个:

1)

awk '
BEGIN{
  OFS=","
}
FNR==NR{
  a[$3 $4]=$3 OFS $4
  b[$3 $4]=$5
  c[$3 $4]=$6
  d[$3 $4]=$7 OFS $8
  next
}
($1 in
 a){
  $4=d[$1]
  $3=c[$1]
  $2=b[$1]
  $1=a[$1]
  print
  next
}
{
  $1=$1
  sub(/^...../,"&,",$1)
  print
}
' FS=","  file1 FS=" "  file2

输出

31431,37113,318512.50,2334387.50,100
31431,37131,999991.70,0000000.30,11111,22222
31431,37201,319612.50,2334387.50,100
31431,37471,111113.20,1111111.30,22222,33333
31431,37491,323237.50,2334387.50,100

2)

awk -F, 'NR==FNR{a[$3 $4]=substr($0,length($3 FS)+1);next} $1 in a{print a[$1],$NF;next} {$1=substr($1,1,5) OFS substr($1,6,5);} 1' OFS=, file1 FS=' ' file2

输出

31431,37113,318512.50,2334387.50,100
31431,37131,999991.70,0000000.30,11111,22222,3,100
31431,37201,319612.50,2334387.50,100
31431,37471,111113.20,1111111.30,22222,33333,4,100
31431,37491,323237.50,2334387.50,100

两者都有效,但不完全。

提前致谢

标签: awk

解决方案


请您尝试以下操作。

awk '
FNR==NR{
  a[$3 $4]=$7 $8
  b[$3 $4]=$5
  c[$3 $4]=$6
  next
}
($1 in a){
  $2=b[$1]
  $3=c[$1]
  $1=a[$1]
  found=1
}
{
  $0=found==1?$0",R":$0",O"
  sub(/^...../,"&,")
  $1=$1
  found=""
}
1
' FS="," file1 FS=" " OFS="," file2

输出如下。

31431,37113,318512.50,2334387.50,100,O
11111,22222,999991.70,0000000.30,100,R
31431,37201,319612.50,2334387.50,100,O
22222,33333,111113.20,1111111.30,100,R
31431,37491,323237.50,2334387.50,100,O

推荐阅读