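linux - Merge multiple CSV files into one CSV file and create a super schema in the final CSV file using a UNIX shell script or awk
Problem description
I want to merge multiple CSV files into a single CSV file using a UNIX shell script. The final CSV file should contain the super schema, that is, the union of the columns from every input file. Where a row's source file lacks a column, the value in the final CSV file should be empty (null) or a space.
Example: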
file111.csv:
"E_TYPE","TIMESTAMP","EXEC_TIME","DBT_TIME","CALLOUT_TIME","CLIENT_IP"
"BBCout","20191011000022.423","95","0","2019-01-11T00:00:05.300Z","200.50.000.333"
"BBCout","20200403122024.123","96","1","2020-04-03T00:00:05.300Z","300.50.000.333"
"BBCout","20210102083426.543","92","0","2021-01-02T00:00:05.300Z","400.50.000.333"
file222.csv:
"E_TYPE","TIMESTAMP","TYPE","METHOD","TIME","RT_SIZE","URL","UID_DERIVED","CLIENT_IP"
"AACallout","20210215000030.815","REST","POST","61","71","""https://st.aaa.xxx.net/n1/yyy/zzz""","0055QAQ","200.50.000.333"
"AACallout","20201210000012.800","REST","GET","67","75","""https://st.aaa.xxx.net/n1/yyy/zzz""","0055BBBQ","300.00.000.111"
The final merged CSV should contain all columns, with columns that are not available for a row left empty or set to a space.
final CSV file:
"E_TYPE","TIMESTAMP","CLIENT_IP","EXEC_TIME","DBT_TIME","CALLOUT_TIME","TYPE","METHOD","TIME","RT_SIZE","URL","UID_DERIVED"
"BBCout","20191011000022.423","200.50.000.333","95","0","2019-01-11T00:00:05.300Z",,,,,,
"BBCout","20200403122024.123","300.50.000.333","96","1","2020-04-03T00:00:05.300Z",,,,,,
"BBCout","20210102083426.543","400.50.000.333","92","0","2021-01-02T00:00:05.300Z",,,,,,
"AACallout","20210215000030.815","200.50.000.333",,,,"REST","POST","61","71","""https://st.aaa.xxx.net/n1/yyy/zzz""","0055QAQ"
"AACallout","20201210000012.800","300.00.000.111",,,,"REST","GET","67","75","""https://st.aaa.xxx.net/n1/yyy/zzz""","0055BBBQ"
Solution
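One possible approach, offered as a minimal sketch rather than a definitive answer, is a two-pass awk script wrapped in a shell function. The first pass reads only the header lines to build the super schema; the second pass prints each row against that schema, leaving missing columns empty. The function name merge_csv is hypothetical. The sketch assumes that no field contains an embedded comma or newline (so splitting on "," is safe), that every input file is non-empty and starts with a header line, and that column names are unique within a file. The column order (columns present in every file first, then the rest in first-seen order) is inferred from the expected output in the question, not stated there.

```shell
# merge_csv FILE... : hypothetical helper that prints the merged CSV to stdout.
# Assumption: fields never contain embedded commas or newlines.
merge_csv() {
    [ "$#" -gt 0 ] || { echo "usage: merge_csv FILE..." >&2; return 1; }
    # The file list is passed twice so that awk reads every file in two passes.
    awk -F',' -v OFS=',' -v nfiles="$#" '
        # Count how many files have been opened so far (across both passes).
        FNR == 1 { seen_files++ }

        # Pass 1: collect column names from header lines, skip data rows.
        seen_files <= nfiles {
            if (FNR == 1) {
                for (i = 1; i <= NF; i++) {
                    if (!($i in count)) order[++ncols] = $i
                    count[$i]++
                }
            }
            next
        }

        # Pass 2, header line of each file.
        FNR == 1 {
            if (seen_files == nfiles + 1) {
                # Build the super schema: columns common to all files first,
                # then the remaining columns in first-seen order.
                n = 0
                for (i = 1; i <= ncols; i++)
                    if (count[order[i]] == nfiles) final[++n] = order[i]
                for (i = 1; i <= ncols; i++)
                    if (count[order[i]] != nfiles) final[++n] = order[i]
                line = final[1]
                for (i = 2; i <= n; i++) line = line OFS final[i]
                print line
            }
            # Remember where each column sits in the current file.
            split("", pos)
            for (i = 1; i <= NF; i++) pos[$i] = i
            next
        }

        # Pass 2, blank lines are skipped.
        NF == 0 { next }

        # Pass 2, data rows: emit values in super-schema order, empty if absent.
        {
            line = ""
            for (i = 1; i <= n; i++) {
                val = (final[i] in pos) ? $(pos[final[i]]) : ""
                line = line (i > 1 ? OFS : "") val
            }
            print line
        }
    ' "$@" "$@"
}
```

With the two sample files from the question, it could be called as merge_csv file111.csv file222.csv > final.csv. Reading the files twice keeps memory use small, at the cost of extra I/O. If fields may contain commas inside quotes, plain comma splitting is not enough; GNU awk's FPAT variable or a dedicated CSV parser would be a safer choice.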