python-3.x - awk does not print the complete column value from a CSV file
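Problem description
I'm new to AWK and need help with the following. I have the code below, which prints the value of column 9 of a CSV when a request has failed (column 8 is "false"). The column 9 value spans 7 lines, but only the first line is printed. Can someone tell me how to print the complete column 9 value?
It only prints "Test failed: text expected to equal /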
FILES=$*
for f in $FILES
do
echo "${f##*/}"
echo "------------------------------------------------"
awk -F "," 'BEGIN{print $f} $8 == "false" {print $9}' $f
echo
done
My input CSV:
timeStamp,elapsed,label,responseCode,responseMessage,threadName,dataType,success,failureMessage,bytes,sentBytes,grpThreads,allThreads,Latency,IdleTime,Conne$
1583830716746,1202,HTTP Request- Authorization TC01,200,OK,ZH 1-1,text,true,,530,354,1,1,1202,0,1124
1583830717967,59,ID_001_Wrong_PNR,500,Internal Error,ZH 1-1,text,false,"Test failed: text expected to equal /
****** received : [[[
{
""status"": ""500"",
""code"": ""500"",
...]]]
****** comparison: [[[{""seatReservations"":[{""passengerKey"":""PAX1"",""success"":""false"",""seatCode"":""50C"",""segmentKey"":""SEG1"",""...]]]
/",322,1023,1,1,58,0,0
Output I get:
"Test failed: text expected to equal /
Expected output:
"Test failed: text expected to equal /
****** received : [[[
{
""status"": ""500"",
""code"": ""500"",
...]]]
****** comparison: [[[{""seatReservations"":[{""passengerKey"":""PAX1"",""success"":""false"",""seatCode"":""50C"",""segmentKey"":""SEG1"",""...]]]
/"
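Solution
Since awk is not CSV-savvy, you can't use it on data like this (quoted fields containing commas and newlines) without writing some kind of CSV parser for it. There are some available online. You could also hack something together for your specific problem. Something like this (as two awk invocations, well, because I was too lazy to combine them), for GNU awk (which is needed for FPAT):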
$ gawk '
BEGIN {
FPAT = "([^,]*)|(\"[^\"]+\")+" # using FPAT instead of FS, look it up.
OFS=","
} # if record has 16 fields (this is uncertain
NF==16 { # define the condition better to suit data)
$0=$0 "\r\n" # use different newline
}1' file | gawk ' # pipe this to another awk
BEGIN {
FPAT = "([^,]*)|(\"[^\"]+\")+"
RS="\r\n" # that uses \r\n as RS
}
$8=="false" {
print $9
}'
Output:
"Test failed: text expected to equal /
****** received : [[[
{
""status"": ""500"",
""code"": ""500"",
...]]]
****** comparison: [[[{""seatReservations"":[{""passengerKey"":""PAX1"",""success"":""false"",""seatCode"":""50C"",""segmentKey"":""SEG1"",""...]]]
/"
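The first awk expects the data to use \n as the record separator. For "whole" records (those with 16 fields) it changes the line ending to \r\n, while the lines of the "malformed" data keep their plain \n. The second awk then uses \r\n to separate records, so the lines of a split record are read as a single record. The condition used to tell "good" records from "bad" ones is not sufficient and needs to be defined better for your data. This is only a sample, and it may mess up the next record: the last line of a split record is not marked with \r\n, so whatever follows it is read as part of the same record.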
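One way to define the "good"/"bad" condition more robustly is to count double quotes instead of fields: a record is complete once it contains an even number of quote characters, because quotes inside a quoted field are doubled (""). The following is only a minimal sketch of that idea in Python, not part of the original answer; the function name join_logical_records and the shortened sample lines are made up for illustration.

```python
def join_logical_records(lines):
    """Group physical lines into logical CSV records.

    Assumption: quotes inside quoted fields are escaped by doubling (""),
    so a complete record always holds an even number of '"' characters.
    """
    records = []
    pending = []        # physical lines of the record being assembled
    quote_count = 0     # running number of '"' seen in the pending lines
    for line in lines:
        pending.append(line)
        quote_count += line.count('"')
        if quote_count % 2 == 0:
            # Quotes are balanced, so the record ends on this line.
            records.append("\n".join(pending))
            pending = []
            quote_count = 0
    if pending:
        # Input ended inside an open quote: keep what was collected.
        records.append("\n".join(pending))
    return records


# Hypothetical, shortened sample in the same shape as the question's data.
sample = [
    'a,b,true,,1',
    'c,d,false,"Test failed: text expected to equal /',
    '""status"": ""500"",',
    '/",2',
    'e,f,true,,3',
]

for record in join_logical_records(sample):
    print(repr(record))
```

Unlike the field-count test, this does not depend on the number of columns, but it still only finds record boundaries. Splitting a record into fields still needs something like FPAT or a real CSV parser.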
It's a hack, so treat it as one. Hack the planet!
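For comparison, since the question is tagged python-3.x, here is a rough sketch of the "CSV parser" route using Python's standard csv module, which understands quoted fields that contain commas and newlines. It is not part of the original answer. The column names success and failureMessage come from the header in the question; the function name failure_messages and the shortened sample data are assumptions for illustration.

```python
import csv
import io


def failure_messages(csv_file):
    """Return the failureMessage of every row whose success column is "false"."""
    reader = csv.DictReader(csv_file)
    return [row["failureMessage"] for row in reader if row["success"] == "false"]


# Hypothetical, shortened sample: only four of the original columns are kept.
SAMPLE_TEXT = (
    'label,success,failureMessage,bytes\n'
    'TC01,true,,530\n'
    'ID_001,false,"Test failed: text expected to equal /\n'
    '""status"": ""500"",\n'
    '/",322\n'
)

for message in failure_messages(io.StringIO(SAMPLE_TEXT)):
    print(message)
```

Note that the parser returns the decoded field value: the surrounding quotes are removed and each doubled "" becomes a single ", so the text differs slightly from the raw output of the awk version. For a real file, the csv documentation recommends opening it with open(path, newline="") so that embedded newlines are passed through unchanged.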