awk: add a variable to part of a column string

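Problem description

Objective

Append "67" to column 1 of the output file, where 67 is the value of a shell variable ($iv) computed from the difference between two dates.

file1.csv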

display,dc,client,20572431,5383594
display,dc,client,20589101,4932821
display,dc,client,23030494,4795549
display,dc,client,22973424,5844194
display,dc,client,21489000,4251031
display,dc,client,23150347,3123945
display,dc,client,23194965,2503875
display,dc,client,20578983,1522448
display,dc,client,22243554,920166
display,dc,client,20572149,118865
display,dc,client,23077785,28077
display,dc,client,21811100,5439

Current output 3_file1.csv

BOB-UK-,display,dc,client,20572431,5383594,0.05,269.18
BOB-UK-,display,dc,client,20589101,4932821,0.05,246.641
BOB-UK-,display,dc,client,23030494,4795549,0.05,239.777
BOB-UK-,display,dc,client,22973424,5844194,0.05,292.21
BOB-UK-,display,dc,client,21489000,4251031,0.05,212.552
BOB-UK-,display,dc,client,23150347,3123945,0.05,156.197
BOB-UK-,display,dc,client,23194965,2503875,0.05,125.194
BOB-UK-,display,dc,client,20578983,1522448,0.05,76.1224
BOB-UK-,display,dc,client,22243554,920166,0.05,46.0083
BOB-UK-,display,dc,client,20572149,118865,0.05,5.94325
BOB-UK-,display,dc,client,23077785,28077,0.05,1.40385
BOB-UK-,display,dc,client,21811100,5439,0.05,0.27195
TOTAL,,,,,33430004,,1671.5

Desired output 3_file1.csv

BOB-UK-67,display,dc,client,20572431,5383594,0.05,269.18
BOB-UK-67,display,dc,client,20589101,4932821,0.05,246.641
BOB-UK-67,display,dc,client,23030494,4795549,0.05,239.777
BOB-UK-67,display,dc,client,22973424,5844194,0.05,292.21
BOB-UK-67,display,dc,client,21489000,4251031,0.05,212.552
BOB-UK-67,display,dc,client,23150347,3123945,0.05,156.197
BOB-UK-67,display,dc,client,23194965,2503875,0.05,125.194
BOB-UK-67,display,dc,client,20578983,1522448,0.05,76.1224
BOB-UK-67,display,dc,client,22243554,920166,0.05,46.0083
BOB-UK-67,display,dc,client,20572149,118865,0.05,5.94325
BOB-UK-67,display,dc,client,23077785,28077,0.05,1.40385
BOB-UK-67,display,dc,client,21811100,5439,0.05,0.27195
TOTAL,,,,,33430004,,1671.5

Current code

#!/bin/sh

set -eu

de=$(date +"%d-%m-%Y" -d "1 month ago")
ds="15-04-2014"
iv=$(awk -vdate1=$de -vdate2=$ds 'BEGIN{split(date1, A,"-");split(date2, B,"-");year_diff=A[3]-B[3];if(year_diff){months_diff=A[2] + 12 * year_diff - B[2] + 1;} else {months_diff=A[2]>B[2]?A[2]-B[2]+1:B[2]-A[2]+1};print months_diff}')



for f in $(find *.csv); do

    awk -F"," -v OFS=',' '{print "BOB-UK-"$iv,$0,0.05}' $f > "1_$f.csv" ##PROBLEM LINE##
    awk -F"," -v OFS=',' '{print $0,$6*$7/1000}' "1_$f.csv" > "2_$f.csv" ##calculate price
    awk -F"," -v OFS=',' '{print $0}; {sum+=$6}{sum2+=$8} END {print "TOTAL,,,,," (sum)",,"(sum2)}' "2_$f.csv" > "3_$f.csv" ##calculate total

done

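Question

When I run the first awk line (marked "##PROBLEM LINE##"), the loop does not change column $1 to include "67" after "BOB-UK-". This should be done by print "BOB-UK-"$iv, but it does nothing. I suspect this is due to how print works in awk, but I cannot find a way to handle it within this one line. Does anyone know whether this is possible, or whether I need a separate line to achieve it?

Tags: shell, csv, awk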

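Solution

You have to pass the variable's value to awk. awk does not see ordinary (unexported) shell variables, and it does not expand $variable the way the shell does. Because the awk program is in single quotes, the shell does not expand $iv either. awk is a separate tool with its own language.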

awk -v iv="$iv" -F"," -v OFS=',' '{print "BOB-UK-"iv,$0,0.05}' "$f"

Tested in a repl with the provided input.
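As a minimal sketch of the point above, the snippet below passes the value into awk in two ways: with -v, as in the answer, and through the exported environment via ENVIRON, which is an alternative the answer does not mention. The value iv=67 and the single sample row are stand-ins taken from the question; the names with_v and with_env are only for illustration.

```shell
iv=67   # stand-in for the value the script computes from the two dates
line='display,dc,client,20572431,5383594'

# -v copies the shell value into an awk variable named iv (no $ inside awk).
with_v=$(printf '%s\n' "$line" |
    awk -v iv="$iv" -F',' -v OFS=',' '{print "BOB-UK-" iv, $0, 0.05}')

# Alternative: export the variable and read it from awk's ENVIRON array.
export iv
with_env=$(printf '%s\n' "$line" |
    awk -F',' -v OFS=',' '{print "BOB-UK-" ENVIRON["iv"], $0, 0.05}')

printf '%s\n%s\n' "$with_v" "$with_env"
```

Both variants should print the same line. One difference worth knowing: -v interprets backslash escapes in the value, while ENVIRON passes the string through unchanged, which only matters if the value can contain backslashes.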

for f in $(find *.csv)

is a useless use of find; it serves no purpose. Just write

for f in *.csv

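Also note that inside the loop you are creating the 1_$f.csv, 2_$f.csv and 3_$f.csv files in the current directory, so the next time you run the script there will be 4 times as many .csv files to iterate over. I don't know whether that matters to you.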
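One possible way to avoid re-reading generated files is to write them into a separate directory. The sketch below assumes this is acceptable for the workflow; the directory name out and the scratch-directory setup are hypothetical and exist only to make the example self-contained. It shows only the first awk pass.

```shell
# Demo setup only: a scratch directory with one sample file
# (not part of the original script).
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'display,dc,client,20572431,5383594\n' > file1.csv

iv=67        # stand-in for the computed month difference
outdir=out   # hypothetical output directory name
mkdir -p "$outdir"

for f in *.csv; do
    [ -e "$f" ] || continue   # nothing matched: the pattern stays literal
    # Results go to $outdir, so *.csv in the current directory is unchanged.
    awk -v iv="$iv" -F',' -v OFS=',' '{print "BOB-UK-" iv, $0, 0.05}' "$f" \
        > "$outdir/1_$f"
done
```

Because the glob only looks at the current directory, running the loop again processes the same input files and simply overwrites the files in out.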

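How does $iv work in awk?

In awk, $<number> is field number <number> of the current line. For example, $1 is the first field of the line and $2 is the second field. $0 is special: it is the whole line.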
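A small illustration of field references, using the default whitespace field separator and a made-up three-word input:

```shell
# Print the first field, the second field, and the whole line.
fields=$(echo a b c |
    awk '{print "first=" $1; print "second=" $2; print "whole=" $0}')

printf '%s\n' "$fields"
# first=a
# second=b
# whole=a b c
```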

$iv expands to $ followed by the value of iv. For example:

echo a b c | awk '{iv=2; print $iv}'

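will output b, because $iv expands to $2, and $2 then expands to the second field of the input, i.e. b.

Uninitialized variables in awk evaluate to 0 in a numeric context (and to the empty string otherwise). So in your line $iv is treated as $0, which expands to the whole line.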
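This can be checked directly. In the sketch below iv is never assigned inside awk, so $iv and $0 should produce identical output; the sample row comes from the question and the names unset_ref and whole_ref are only for illustration.

```shell
line='display,dc,client,20572431,5383594'

# iv is never set inside awk, so $iv is evaluated as $0 (the whole line).
unset_ref=$(printf '%s\n' "$line" | awk -F',' '{print "BOB-UK-" $iv}')

# The same program with $0 written explicitly.
whole_ref=$(printf '%s\n' "$line" | awk -F',' '{print "BOB-UK-" $0}')

printf '%s\n%s\n' "$unset_ref" "$whole_ref"
```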

