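regex - PowerShell - Find and replace multiple patterns on the same line and store the mapping in a separate file
Problem description
I have a very large file with thousands of lines, some of them extremely long, containing all kinds of data. I need to find and replace multiple strings in that file, and several of the strings to replace can sit on the same line. The replacement value should increment with each new distinct value (a repeated value reuses its earlier replacement, as the expected output shows). In a separate file, $tmp, I need to keep only the unique pairs of original value and corresponding replacement, in case the original values ever need to be restored. With a lot of help from Doug Maurer I arrived at the script below, which does most of the job, but I still can't work out how to replace the second, third, etc. string on the same line, or how to keep only the unique pairs. Any ideas?
Input: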
<requestId>qwerty-qwer12-qwer56</requestId>something here.,. reportId>plmkjh8765FGH4rt6As</msg:reportId
<requestId>zxcvbn-zxcv12-zxcv56</requestId>
<requestId>qwerty-qwer12-qwer56</requestId>something else.,.reportId>poGd56Hnm9q3Dfer6Jh</msg:reportId>
Expected output:
<requestId>RequestId-1</requestId>something here.,. reportId>Report-1</msg:reportId
<requestId>RequestId-2</requestId>
<requestId>RequestId-1</requestId>something else.,.reportId>Report-2</msg:reportId>
Expected output of $tmp:
qwerty-qwer12-qwer56 : RequestId-1
plmkjh8765FGH4rt6As : Report-1
zxcvbn-zxcv12-zxcv56 : RequestId-2
poGd56Hnm9q3Dfer6Jh : Report-2
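Restoring the original values from a file in this format could look roughly like the sketch below. It is only an illustration: the names $pairLines, $restore, $anonymized and $restored are made up here, the pairs are hard-coded instead of being read from $tmp, and it assumes that placeholder-like text such as Report-1 never occurs naturally in the data.

```powershell
# Pairs in the "$tmp" format shown above: "original : replacement".
# In practice these lines would come from: Get-Content $tmp
$pairLines = @(
    'qwerty-qwer12-qwer56 : RequestId-1'
    'plmkjh8765FGH4rt6As : Report-1'
    'zxcvbn-zxcv12-zxcv56 : RequestId-2'
    'poGd56Hnm9q3Dfer6Jh : Report-2'
)

# Build the reverse lookup: replacement -> original.
$restore = @{}
foreach ($pair in $pairLines) {
    $original, $replacement = $pair -split ' : ', 2
    $restore[$replacement] = $original
}

# One anonymized line, taken from the expected output above.
$anonymized = '<requestId>RequestId-1</requestId>something here.,. reportId>Report-1</msg:reportId'

# Match each placeholder as a whole token, so RequestId-1 is never
# restored inside a longer placeholder such as RequestId-12.
$evaluator = [System.Text.RegularExpressions.MatchEvaluator]{
    param($m)
    if ($restore.ContainsKey($m.Value)) { $restore[$m.Value] } else { $m.Value }
}
$restored = [regex]::Replace($anonymized, '\b(?:RequestId|Report|keyId)-\d+\b', $evaluator)
$restored
```

Matching the whole placeholder and looking it up in a table avoids running one -replace per pair, which matters when the mapping file grows large.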
$log = ".\log.txt"   # assumed path: the snippet as posted never defines $log
$tmp = ".\tmp.txt"
@'
Order: Q2we45-Uj87f6-gh65De
reportId>plmkjh8765FGH4rt6As</msg:reportId>
<requestId>qwerty-qwer12-qwer56</requestId>Ace of Base Order: Q2we45-Uj87f6-gh65De<something else...
<requestId>zxcvbn-zxcv12-zxcv56</requestId>
<requestId>1234qw-12qw12-123456</requestId>kljsldjslddsdfdsdsdfff <messageId>1234qw-12qw12-123456</msg
<requestId>1234qw-12qw12-123456</requestId>something here.,. reportId>plmkjh8765FGH4rt6As</msg:reportId
<requestId>1234qw-12qw12-123456</requestId>something else.,.reportId>poGd56Hnm9q3Dfer6Jh</msg:reportId> uraaa 123 <keyID>poU6Ghk89edfTG78Jk45GrRt23HzW4pl</msgdc
<requestId>zxcvbn-zxcv12-zxcv56</requestId>
<requestId>1234qw-12qw12-123456</requestId> abcdef ole ole Order: zxcvbn-zxcv12-zxcv56 abracadabra <keyID>poU6Ghk89edfTG78Jk45GrRt23HzW4pl</msgdc
reportId>plmkjh8765FGH4rt6As</msg:reportId>
<requestId>1234qw-12qw12-12qw56</requestId>
keyId>Qwd84lPhjutf7Nmwr56hJndcsjy34imNQwd84lPhjutZ7Nmwr56hJndcsjy34imNPozDr5</
keyId>Qwd84lPhjutf7Nmwr56hJndcsjy34imNQwd84lPhjutZ7Nmwr56hJndcsjy34imNPozDr5</
keyId>Zdjgi76Gho3sQw0ib5Mjk3sDyoq9zmGdZdjgi76Gho3sQw0ib5Mjk3sDyoq9zmGdLkJpQw</
reportId>plmkjh8765FGH4rt6As</msg:reportId>
reportId>plmkjh8765FGH4rt6As</msg:reportId>
reportId>poGd56Hnm9q3Dfer6Jh</msg:reportId>
'@ | Set-Content $log -Encoding UTF8
$requestId = @{
    Count   = 1
    Matches = @()
}
$keyId = @{
    Count   = 1
    Matches = @()
}
$reportId = @{
    Count   = 1
    Matches = @()
}
$output = switch -Regex -File $log {
    '(\w{6}-\w{6}-\w{6})' {
        if(!$requestId.matches.($matches.1))
        {
            $req = $requestId.matches += @{$matches.1 = "RequestId-$($requestId.count)"}
            $requestId.count++
            $req.keys | %{ Add-Content $tmp "$_ : $($req.$_)" }
        }
        $_ -replace $matches.1,$requestId.matches.($matches.1)
    }
    'keyId>(\w{70})</' {
        if(!$keyId.matches.($matches.1))
        {
            $kid = $keyId.matches += @{$matches.1 = "keyId-$($keyId.count)"}
            $keyId.count++
            $kid.keys | %{ Add-Content $tmp "$_ : $($kid.$_)" }
        }
        $_ -replace $matches.1,$keyId.matches.($matches.1)
    }
    'reportId>(\w{19})</msg:reportId>' {
        if(!$reportId.matches.($matches.1))
        {
            $repid = $reportId.matches += @{$matches.1 = "Report-$($reportId.count)"}
            $reportId.count++
            $repid.keys | %{ Add-Content $tmp "$_ : $($repid.$_)" }
        }
        $_ -replace $matches.1,$reportId.matches.($matches.1)
    }
    default {$_}
}
$output | Set-Content $log -Encoding UTF8
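Solution
Since each line can contain a different combination of data, I recommend this approach: test every pattern with its own if statement and update $line after each replacement, so that each replacement builds on the previous one.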
# $log is the path of the input file used in the question's script.
$requestId = @{
    Count   = 1
    Matches = @{}   # original ID -> replacement
}
$keyId = @{
    Count   = 1
    Matches = @{}
}
$reportId = @{
    Count   = 1
    Matches = @{}
}
$text = Get-Content $log
$tmp = ".\tmp.txt"
$output = foreach($line in $text)
{
    if($line -match '<requestID>(\w{6}-\w{6}-\w{6})</requestID>')
    {
        $id = $matches.1
        if(!$requestId.Matches.ContainsKey($id))
        {
            # First time this ID is seen: number it and write only the new pair,
            # so $tmp never receives the same pair twice.
            $requestId.Matches[$id] = "RequestId-$($requestId.Count)"
            $requestId.Count++
            Add-Content $tmp "$id : $($requestId.Matches[$id])"
        }
        $line = $line -replace $id,$requestId.Matches[$id]
    }
    if($line -match 'reportId>(\w{19})</msg:reportId>')
    {
        $id = $matches.1
        if(!$reportId.Matches.ContainsKey($id))
        {
            $reportId.Matches[$id] = "Report-$($reportId.Count)"
            $reportId.Count++
            Add-Content $tmp "$id : $($reportId.Matches[$id])"
        }
        $line = $line -replace $id,$reportId.Matches[$id]
    }
    if($line -match 'keyId>(\w{70})</')
    {
        $id = $matches.1
        if(!$keyId.Matches.ContainsKey($id))
        {
            $keyId.Matches[$id] = "keyId-$($keyId.Count)"
            $keyId.Count++
            Add-Content $tmp "$id : $($keyId.Matches[$id])"
        }
        $line = $line -replace $id,$keyId.Matches[$id]
    }
    # Emit the line with every replacement applied.
    $line
}
$output | Set-Content $log -Encoding UTF8
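The -match operator returns only the first match, so the script above replaces at most one ID per pattern on each line. One possible way to cover a second or third ID of the same kind on a line is Regex.Replace with a MatchEvaluator script block, which is called once for every match. The following is a sketch under assumptions, not part of the original answer: the names $rules, $map, $pairs and $evaluator are made up, the patterns are lookaround variants of the ones above (the reportId one deliberately drops the trailing >), the first three sample lines come from the question, and the fourth is a hypothetical line with two request IDs.

```powershell
# Sample lines 1-3 are from the question; line 4 is hypothetical.
$lines = @(
    '<requestId>qwerty-qwer12-qwer56</requestId>something here.,. reportId>plmkjh8765FGH4rt6As</msg:reportId'
    '<requestId>zxcvbn-zxcv12-zxcv56</requestId>'
    '<requestId>qwerty-qwer12-qwer56</requestId>something else.,.reportId>poGd56Hnm9q3Dfer6Jh</msg:reportId>'
    '<requestId>zxcvbn-zxcv12-zxcv56</requestId> <requestId>asdfgh-asdf12-asdf56</requestId>'
)

# One rule per ID type. The lookarounds keep the surrounding tags out of the
# match, so the match value is exactly the ID. (?i) mirrors -match, which
# ignores case by default.
$rules = @(
    @{ Prefix = 'RequestId'; Next = 1; Regex = [regex]'(?i)(?<=<requestId>)\w{6}-\w{6}-\w{6}(?=</requestId>)' }
    @{ Prefix = 'Report';    Next = 1; Regex = [regex]'(?i)(?<=reportId>)\w{19}(?=</msg:reportId)' }
    @{ Prefix = 'keyId';     Next = 1; Regex = [regex]'(?i)(?<=keyId>)\w{70}(?=</)' }
)

# Original ID -> replacement. A Dictionary compares keys case-sensitively,
# unlike a @{} hashtable, so IDs differing only in case stay distinct.
$map   = [System.Collections.Generic.Dictionary[string,string]]::new()
# New "original : replacement" pairs in first-seen order; each is added once.
$pairs = [System.Collections.Generic.List[string]]::new()

$output = foreach ($line in $lines) {
    foreach ($rule in $rules) {
        # Called once per match, so every ID on the line is handled.
        $evaluator = [System.Text.RegularExpressions.MatchEvaluator]{
            param($m)
            $id = $m.Value
            if (-not $map.ContainsKey($id)) {
                $map[$id] = '{0}-{1}' -f $rule.Prefix, $rule.Next
                $rule.Next++
                $pairs.Add("$id : $($map[$id])")
            }
            $map[$id]   # the only output: the replacement text
        }
        $line = $rule.Regex.Replace($line, $evaluator)
    }
    $line
}

$output
$pairs
```

To apply this to the real files, $lines could be filled with Get-Content $log, and the results written back with $output | Set-Content $log -Encoding UTF8 and $pairs | Set-Content $tmp. Unlike -replace with the bare ID, this variant only touches IDs that sit between the expected tags.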