awk command to change letters in a file with multiple outputs

Problem description

I have an input file that looks like this:

input.txt

THISISANEXAMPLEOFANINPUTFILEWITHALONGSTRINGOFTEXT

I have another file that lists the letter positions to change and the letter to change each one to, for example:

textpos.txt

Position    Text_Change
1           A
2           B
3           X

(In reality there will be about 10,000 letter changes.)

I want a separate output file for each text change, which should look like this:

output1.txt

AHISISANEXAMPLEOFANINPUTFILEWITHALONGSTRINGOFTEXT

Next:

output2.txt

TBISISANEXAMPLEOFANINPUTFILEWITHALONGSTRINGOFTEXT

Next:

output3.txt

THXSISANEXAMPLEOFANINPUTFILEWITHALONGSTRINGOFTEXT

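I would like to learn how to do this both with an awk command and in a Pythonic way, and I would like to know what the best and fastest way to do it is.

Thanks in advance.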

Tags: python, parsing, awk

Solution


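Could you please try the following (assuming your actual input files contain the same kind of data)? This solution should also avoid the "Too many open files" error when running the awk command, because the awk code closes each output file before it opens the next one.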

awk '
FNR==NR{
   a[++count]=$0
   next
}
FNR>1{
   close(file)
   file="output"(FNR-1)".txt"
   for(i=1;i<=count;i++){
      if($1==1){
         print $2 substr(a[i],2) > file
      }
      else{
         print substr(a[i],1,$1-1) $2 substr(a[i],$1+1) > file
      }
   }
}'  input.txt  textpos.txt

This creates 3 output files named output1.txt, output2.txt and output3.txt, with the following contents.

cat output1.txt
AHISISANEXAMPLEOFANINPUTFILEWITHALONGSTRINGOFTEXT
cat output2.txt
TBISISANEXAMPLEOFANINPUTFILEWITHALONGSTRINGOFTEXT
cat output3.txt
THXSISANEXAMPLEOFANINPUTFILEWITHALONGSTRINGOFTEXT
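
Since the question also asks for a Pythonic way, the splice that the awk script performs with substr can be sketched in Python as follows. This is only a minimal illustration, not part of the original answer: the function name replace_at is hypothetical, and positions are assumed to be 1-based, as in textpos.txt.

```python
def replace_at(text, position, letter):
    """Return text with the character at 1-based `position` replaced by `letter`."""
    index = position - 1  # convert the 1-based position to a 0-based Python index
    # Everything before the position, then the new letter, then everything after it.
    return text[:index] + letter + text[index + 1:]


line = "THISISANEXAMPLEOFANINPUTFILEWITHALONGSTRINGOFTEXT"
for position, letter in [(1, "A"), (2, "B"), (3, "X")]:
    print(replace_at(line, position, letter))
```

Because slicing with an empty prefix simply yields an empty string, this sketch needs no special case for position 1.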

Explanation: here is a commented version of the code above.

awk '
FNR==NR{                                                       ##FNR==NR is true only while the first file, input.txt, is being read.
   a[++count]=$0                                               ##Store each line of input.txt in array a, indexed by the running counter count.
   next                                                        ##Skip the remaining statements and move on to the next line.
}
FNR>1{                                                         ##Runs for each line of the second file, textpos.txt, except its header.
   close(file)                                                 ##Close the previous output file (the variable file holds its name) to avoid too many open files.
   file="output"(FNR-1)".txt"                                  ##Build the output file name from the line number minus 1, e.g. output1.txt.
   for(i=1;i<=count;i++){                                      ##Loop over every stored line of input.txt.
      if($1==1){                                               ##If the position in the 1st field is 1, then do the following.
         print $2 substr(a[i],2) > file                        ##Print the new letter $2 followed by a[i] from its 2nd character to the end, into the output file.
      }
      else{
         print substr(a[i],1,$1-1) $2 substr(a[i],$1+1) > file ##Print the first $1-1 characters of a[i], then $2, then a[i] from position $1+1 to the end.
      }
   }
}'  input.txt  textpos.txt                                     ##The input file names are given here.
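
For the Pythonic half of the question, the same flow (read input.txt once, then write and close one output file per row of textpos.txt) might look like the sketch below. It is not from the original answer and makes no claim about which version is faster. The function name write_outputs and its out_dir parameter are hypothetical, the file layout is assumed to match the samples in the question, and blank or malformed rows are skipped, which is an assumption of this sketch rather than something the awk script does. The demo at the end runs on the sample data in a temporary directory.

```python
import tempfile
from pathlib import Path


def write_outputs(input_path, changes_path, out_dir):
    """Write one output<N>.txt per change row and return the paths written."""
    lines = Path(input_path).read_text().splitlines()
    written = []
    with open(changes_path) as changes:
        next(changes, None)  # skip the "Position  Text_Change" header row
        for n, row in enumerate(changes, start=1):
            fields = row.split()
            if len(fields) < 2:
                continue  # assumption: ignore blank or malformed rows
            position, letter = int(fields[0]), fields[1]
            out_path = Path(out_dir) / f"output{n}.txt"
            # The with-block closes each file before the next one is opened,
            # so only one output file is open at any time.
            with open(out_path, "w") as out:
                for line in lines:
                    out.write(line[:position - 1] + letter + line[position:] + "\n")
            written.append(out_path)
    return written


# Demo on the sample data from the question, in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    (tmp / "input.txt").write_text(
        "THISISANEXAMPLEOFANINPUTFILEWITHALONGSTRINGOFTEXT\n"
    )
    (tmp / "textpos.txt").write_text(
        "Position    Text_Change\n1           A\n2           B\n3           X\n"
    )
    for path in write_outputs(tmp / "input.txt", tmp / "textpos.txt", tmp):
        print(path.name, path.read_text().strip())
```

Opening each output file in a with-block plays the same role as close(file) in the awk script, which matters when there are around 10,000 change rows.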
