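powershell - Editing a column in a tab-delimited text file with PowerShell
Problem description
I have a very large (~250k rows and 171 columns) tab-delimited text file that I need to edit. I need to prepend the letter "H" to the third column of every row.
So I need it to go from 03/20/2020 09:00 03/20/2020 10:00 1269805 ... to 03/20/2020 09:00 03/20/2020 10:00 H1269805 ... I actually got this working with the following code: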
$source = Get-ChildItem "C:\test\input\*.txt"
$target = "C:\test\test.txt"
$data = Get-Content -Path $source | ConvertFrom-Csv -Delimiter "`t" -Header Column1, Column2, Column3, Column4, Column5, Column6, Column7, Column8, Column9, Column10, Column11, Column12, Column13, Column14, Column15, Column16, Column17, Column18, Column19, Column20,
Column21, Column22, Column23, Column24, Column25, Column26, Column27, Column28, Column29, Column30, Column31, Column32, Column33, Column34, Column35, Column36, Column37, Column38, Column39, Column40,
Column41, Column42, Column43, Column44, Column45, Column46, Column47, Column48, Column49, Column50, Column51, Column52, Column53, Column54, Column55, Column56, Column57, Column58, Column59, Column60,
Column61, Column62, Column63, Column64, Column65, Column66, Column67, Column68, Column69, Column70, Column71, Column72, Column73, Column74, Column75, Column76, Column77, Column78, Column79, Column80,
Column81, Column82, Column83, Column84, Column85, Column86, Column87, Column88, Column89, Column90, Column91, Column92, Column93, Column94, Column95, Column96, Column97, Column98, Column99, Column100,
Column101, Column102, Column103, Column104, Column105, Column106, Column107, Column108, Column109, Column110, Column111, Column112, Column113, Column114, Column115, Column116, Column117, Column118, Column119, Column120,
Column121, Column122, Column123, Column124, Column125, Column126, Column127, Column128, Column129, Column130, Column131, Column132, Column133, Column134, Column135, Column136, Column137, Column138, Column139, Column140,
Column141, Column142, Column143, Column144, Column145, Column146, Column147, Column148, Column149, Column150, Column151, Column152, Column153, Column154, Column155, Column156, Column157, Column158, Column159, Column160,
Column161, Column162, Column163, Column164, Column165, Column166, Column167, Column168, Column169, Column170, Column171
$data | % {
If ($_.Column3) {
#import ID
$_.Column3 = "H$($_.Column3)"
} }
$data | Select Column1, Column2, Column3, Column4, Column5, Column6, Column7, Column8, Column9, Column10, Column11, Column12, Column13, Column14, Column15, Column16, Column17, Column18, Column19, Column20,
Column21, Column22, Column23, Column24, Column25, Column26, Column27, Column28, Column29, Column30, Column31, Column32, Column33, Column34, Column35, Column36, Column37, Column38, Column39, Column40,
Column41, Column42, Column43, Column44, Column45, Column46, Column47, Column48, Column49, Column50, Column51, Column52, Column53, Column54, Column55, Column56, Column57, Column58, Column59, Column60,
Column61, Column62, Column63, Column64, Column65, Column66, Column67, Column68, Column69, Column70, Column71, Column72, Column73, Column74, Column75, Column76, Column77, Column78, Column79, Column80,
Column81, Column82, Column83, Column84, Column85, Column86, Column87, Column88, Column89, Column90, Column91, Column92, Column93, Column94, Column95, Column96, Column97, Column98, Column99, Column100,
Column101, Column102, Column103, Column104, Column105, Column106, Column107, Column108, Column109, Column110, Column111, Column112, Column113, Column114, Column115, Column116, Column117, Column118, Column119, Column120,
Column121, Column122, Column123, Column124, Column125, Column126, Column127, Column128, Column129, Column130, Column131, Column132, Column133, Column134, Column135, Column136, Column137, Column138, Column139, Column140,
Column141, Column142, Column143, Column144, Column145, Column146, Column147, Column148, Column149, Column150, Column151, Column152, Column153, Column154, Column155, Column156, Column157, Column158, Column159, Column160,
Column161, Column162, Column163, Column164, Column165, Column166, Column167, Column168, Column169, Column170, Column171 | ConvertTo-Csv -Delimiter "`t" -NoTypeInformation | % { $_ -replace '"', "" } | Select-Object -Skip 1 | Set-Content -Path $target
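My problem is that it takes a very long time. I know it's a large file, but is there another way to do this faster? I suspect that converting to and from CSV is the most time-consuming part, but I may be wrong. The whole process takes about 25 minutes to complete. Any help would be appreciated.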
Solution
To speed up processing, avoid the pipeline, use .NET types for file I/O, and use plain-text operations:
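# Output path from the question. Use a full path: .NET resolves relative
# paths against its own working directory, not PowerShell's current location.
$target = "C:\test\test.txt"
# Create the output file.
$outFile = [IO.File]::CreateText($target)
try {
    # Loop over all input files.
    foreach ($file in Get-ChildItem C:\test\input\*.txt) {
        # Loop over a given file's lines; ReadLines streams them lazily.
        foreach ($line in [IO.File]::ReadLines($file.FullName)) {
            # Prepend 'H' to the 3rd column and append the line to the output file.
            $outFile.WriteLine(($line -replace '^.*?\t.*?\t', '$&H'))
        }
    }
}
finally {
    # Close the writer even if an error occurs, so buffered output is flushed.
    $outFile.Close()
}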
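Notes:
The regex ^.*?\t.*?\t matches the first two fields of a line together with their trailing tabs, and $& in the replacement string stands for that match, so H is inserted directly before the third column. A line with fewer than three columns does not match and is written out unchanged.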
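To check the substitution on a single line before running it over the whole file, a minimal sketch like the following may help. The sample lines are hypothetical: the first is a three-column excerpt built from the question's example values, the second is a made-up short line, and the variable names are for illustration only.

```powershell
# Hypothetical sample line with three tab-separated fields (values from the question).
$line = "03/20/2020 09:00`t03/20/2020 10:00`t1269805"

# Same substitution as in the loop: keep the first two fields ($&) and add 'H' after them.
$result = $line -replace '^.*?\t.*?\t', '$&H'

# Split on tabs to inspect the individual fields.
$fields = $result -split "`t"
$fields[2]    # H1269805

# Hypothetical line with only two columns: the pattern needs two tabs, so nothing changes.
$short = "only`tone tab"
$shortResult = $short -replace '^.*?\t.*?\t', '$&H'
$shortResult -eq $short    # True
```

Because the pattern is anchored at the start of the line and both quantifiers are lazy, only the first two tabs are consumed, so the remaining columns are passed through untouched.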