Conditional replacement with awk using two files

Problem description

Given these examples:

file1:

      rs12124819     1        0.020242          776546 A G
      rs28765502     1        0.022137          832918 T C
       rs7419119     1        0.022518          842013 T G
        rs950122     1        0.022720          846864 G C

file2:

1_752566    1   0   752566  G   A
1_776546    1   0   776546  A   G
1_832918    1   0   832918  T   C
1_842013    1   0   842013  T   G

I am trying to replace column 1 of file2 with the corresponding column 1 of file1 whenever the two files have the same value in column 4.

Expected output:

rs12124819  1   0   752566  G   A
rs28765502  1   0   776546  A   G
rs7419119   1   0   832918  T   C
rs950122    1   0   842013  T   G

I tried to create two arrays, but could not find the right way to use them:

awk 'FNR==NR{a[$4],b[$1];next} ($4) in a{$1=b[FNR]}1' file1 file2  > out.txt 

Thank you very much!

Tags: awk

Solution


With the samples you have shown, could you please try the following. Written and tested with GNU awk.

awk 'FNR==NR{a[$4]=$1;next} ($4 in a){$1=a[$4]} 1' file1 file2

Explanation: a detailed, line-by-line explanation of the command above.

awk '            ##Starting awk program from here.
FNR==NR{         ##Checking condition if FNR==NR which will be TRUE when file1 is being read.
  a[$4]=$1       ##Creating array a whose index is $4 and value is $1.
  next           ##next will skip all further statements from here.
}
($4 in a){       ##Checking condition if 4th field is present in a then do following.
  $1=a[$4]       ##Setting value of 1st field of file2 as array a value with index of 4th column
}
1                ##1 will print edited/non-edited line.
' file1 file2    ##mentioning Input_file names here.
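
To try the command end to end, here is a minimal sketch that recreates the two sample files from the question in a temporary directory and runs the same awk program on them. The variable names `workdir` and `result` are only illustrative assumptions and are not part of the original answer.

```shell
#!/bin/sh
# Recreate the sample input in a scratch directory (hypothetical location).
workdir=$(mktemp -d)

cat > "$workdir/file1" <<'EOF'
      rs12124819     1        0.020242          776546 A G
      rs28765502     1        0.022137          832918 T C
       rs7419119     1        0.022518          842013 T G
        rs950122     1        0.022720          846864 G C
EOF

cat > "$workdir/file2" <<'EOF'
1_752566    1   0   752566  G   A
1_776546    1   0   776546  A   G
1_832918    1   0   832918  T   C
1_842013    1   0   842013  T   G
EOF

# First pass (file1): remember column 1 keyed by column 4.
# Second pass (file2): if column 4 is a known key, overwrite column 1.
result=$(awk 'FNR==NR{a[$4]=$1;next} ($4 in a){$1=a[$4]} 1' \
  "$workdir/file1" "$workdir/file2")

printf '%s\n' "$result"
# Prints:
# 1_752566    1   0   752566  G   A
# rs12124819 1 0 776546 A G
# rs28765502 1 0 832918 T C
# rs7419119 1 0 842013 T G

rm -rf "$workdir"
```

On this sample, 752566 does not occur in column 4 of file1, so the first line of file2 is printed unchanged, and 846864 (rs950122) has no partner in file2. Lines whose first field is replaced are rebuilt by awk with the default output separator (a single space), while untouched lines keep their original spacing.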
