PowerShell - find and replace multiple patterns on the same line and store the mapping in a separate file

Problem description

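I have a very large file with thousands of lines, some of them extremely long, containing all kinds of data. I need to find and replace several strings in that file, and more than one of the strings to replace can appear on the same line. The replacement value should be incremented each time a new value is encountered; a repeated value reuses its earlier replacement. In a separate file, $tmp, I only want to keep the unique pairs of original value and corresponding replacement value, in case the original values need to be restored. With a lot of help from Doug Maurer I arrived at the script below, which does most of the work, but I still don't know how to replace the second, third, etc. string on the same line, or how to keep only the unique pairs. Any ideas?

Input: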

<requestId>qwerty-qwer12-qwer56</requestId>something here.,. reportId>plmkjh8765FGH4rt6As</msg:reportId
<requestId>zxcvbn-zxcv12-zxcv56</requestId>
<requestId>qwerty-qwer12-qwer56</requestId>something else.,.reportId>poGd56Hnm9q3Dfer6Jh</msg:reportId>

Desired output:

<requestId>RequestId-1</requestId>something here.,. reportId>Report-1</msg:reportId
<requestId>RequestId-2</requestId>
<requestId>RequestId-1</requestId>something else.,.reportId>Report-2</msg:reportId>

Desired output for $tmp:

qwerty-qwer12-qwer56 : RequestId-1
plmkjh8765FGH4rt6As : Report-1
zxcvbn-zxcv12-zxcv56 : RequestId-2
poGd56Hnm9q3Dfer6Jh : Report-2

Script so far:

$log = ".\log.txt"   # assumed path; the post does not show where $log is defined
$tmp = ".\tmp.txt"
@'
Order: Q2we45-Uj87f6-gh65De
reportId>plmkjh8765FGH4rt6As</msg:reportId>
<requestId>qwerty-qwer12-qwer56</requestId>Ace of Base Order: Q2we45-Uj87f6-gh65De<something else...
<requestId>zxcvbn-zxcv12-zxcv56</requestId>
<requestId>1234qw-12qw12-123456</requestId>kljsldjslddsdfdsdsdfff <messageId>1234qw-12qw12-123456</msg
<requestId>1234qw-12qw12-123456</requestId>something here.,. reportId>plmkjh8765FGH4rt6As</msg:reportId
<requestId>1234qw-12qw12-123456</requestId>something else.,.reportId>poGd56Hnm9q3Dfer6Jh</msg:reportId> uraaa 123 <keyID>poU6Ghk89edfTG78Jk45GrRt23HzW4pl</msgdc
<requestId>zxcvbn-zxcv12-zxcv56</requestId>
<requestId>1234qw-12qw12-123456</requestId> abcdef ole ole Order: zxcvbn-zxcv12-zxcv56 abracadabra <keyID>poU6Ghk89edfTG78Jk45GrRt23HzW4pl</msgdc
reportId>plmkjh8765FGH4rt6As</msg:reportId>
<requestId>1234qw-12qw12-12qw56</requestId>
keyId>Qwd84lPhjutf7Nmwr56hJndcsjy34imNQwd84lPhjutZ7Nmwr56hJndcsjy34imNPozDr5</
keyId>Qwd84lPhjutf7Nmwr56hJndcsjy34imNQwd84lPhjutZ7Nmwr56hJndcsjy34imNPozDr5</
keyId>Zdjgi76Gho3sQw0ib5Mjk3sDyoq9zmGdZdjgi76Gho3sQw0ib5Mjk3sDyoq9zmGdLkJpQw</
reportId>plmkjh8765FGH4rt6As</msg:reportId>
reportId>plmkjh8765FGH4rt6As</msg:reportId>
reportId>poGd56Hnm9q3Dfer6Jh</msg:reportId>
'@ | Set-Content $log -Encoding UTF8

$requestId = @{
    Count   = 1
    Matches = @()
}
$keyId  = @{
    Count   = 1
    Matches = @()
}
$reportId  = @{
    Count   = 1
    Matches = @()
}

$output = switch -Regex -File $log {
    '(\w{6}-\w{6}-\w{6})' {
        if(!$requestId.matches.($matches.1))
        {
            $req = $requestId.matches += @{$matches.1 = "RequestId-$($requestId.count)"}
            $requestId.count++
            $req.keys | %{ Add-Content $tmp "$_ : $($req.$_)" }
        }
        $_ -replace $matches.1,$requestId.matches.($matches.1)               
    }
    'keyId>(\w{70})</' {
        if(!$keyId.matches.($matches.1))
        {
            $kid = $keyId.matches += @{$matches.1 = "keyId-$($keyId.count)"} 
            $keyId.count++
            $kid.keys | %{ Add-Content $tmp "$_ : $($kid.$_)" }
        }
        $_ -replace $matches.1,$keyId.matches.($matches.1)        
    }
    'reportId>(\w{19})</msg:reportId>' {
        if(!$reportId.matches.($matches.1))
        {
            $repid = $reportId.matches += @{$matches.1 = "Report-$($reportId.count)"}
            $reportId.count++
            $repid.keys | %{ Add-Content $tmp "$_ : $($repid.$_)" }
        }
        $_ -replace $matches.1,$reportId.matches.($matches.1)
    } 
    default {$_}
}

$output | Set-Content $log -Encoding UTF8

Tags: regex, powershell

Solution


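Because each line can contain a different combination of data, I recommend this approach: test every line against each pattern with its own if statement, so that every pattern that matches gets replaced.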

$requestId = @{
    Count   = 1
    Matches = @()
}
$keyId  = @{
    Count   = 1
    Matches = @()
}
$reportId  = @{
    Count   = 1
    Matches = @()
}

$text = Get-Content $log

$tmp = ".\tmp.txt"

$output = foreach($line in $text)
{
    if($line -match '<requestID>(\w{6}-\w{6}-\w{6})</requestID>')
    {
        if(!$requestId.matches.($matches.1))
        {
            # Write only the new pair, so $tmp gets one line per unique value
            $replacement = "RequestId-$($requestId.count)"
            $requestId.matches += @{$matches.1 = $replacement}
            $requestId.count++
            Add-Content $tmp "$($matches.1) : $replacement"
        }
        $line = $line -replace $matches.1,$requestId.matches.($matches.1)
    }
    if($line -match 'reportId>(\w{19})</msg:reportId>')
    {
        if(!$reportId.matches.($matches.1))
        {
            # Write only the new pair, so $tmp gets one line per unique value
            $replacement = "Report-$($reportId.count)"
            $reportId.matches += @{$matches.1 = $replacement}
            $reportId.count++
            Add-Content $tmp "$($matches.1) : $replacement"
        }
        $line = $line -replace $matches.1,$reportId.matches.($matches.1)
    }
    if($line -match 'keyId>(\w{70})</')
    {
        if(!$keyId.matches.($matches.1))
        {
            # Write only the new pair, so $tmp gets one line per unique value
            $replacement = "keyId-$($keyId.count)"
            $keyId.matches += @{$matches.1 = $replacement}
            $keyId.count++
            Add-Content $tmp "$($matches.1) : $replacement"
        }
        $line = $line -replace $matches.1,$keyId.matches.($matches.1)
    }
    $line
}

$output | Set-Content $log -Encoding UTF8
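
Note that -match only reports the first hit of a pattern on a line. If one line can hold several different values of the same kind, a possible alternative (a minimal sketch, not the approach above) is [regex]::Replace with a script block as the match evaluator, which is called once per match. The names $rules, $map, $counters and $pairs are made up for this illustration, the lookaround patterns are my assumption about which tags surround each value, and the three sample lines are taken from the question.

```powershell
# Sample lines from the question, standing in for Get-Content $log
$lines = @(
    '<requestId>qwerty-qwer12-qwer56</requestId>something here.,. reportId>plmkjh8765FGH4rt6As</msg:reportId'
    '<requestId>zxcvbn-zxcv12-zxcv56</requestId>'
    '<requestId>qwerty-qwer12-qwer56</requestId>something else.,.reportId>poGd56Hnm9q3Dfer6Jh</msg:reportId>'
)

# One rule per kind of value (assumed patterns). The lookarounds keep the tags
# out of the match, and (?i) mirrors the case-insensitive behaviour of -match.
$rules = @(
    @{ Prefix = 'RequestId'; Pattern = '(?i)(?<=<requestId>)\w{6}-\w{6}-\w{6}(?=</requestId>)' }
    @{ Prefix = 'Report';    Pattern = '(?i)(?<=reportId>)\w{19}(?=</msg:reportId)' }
    @{ Prefix = 'keyId';     Pattern = '(?i)(?<=keyId>)\w{70}(?=</)' }
)

# original value -> replacement, in order of first appearance.
# Created with New-Object so the keys are case-sensitive (the IDs are mixed case).
$map = New-Object System.Collections.Specialized.OrderedDictionary
$counters = @{}   # prefix -> last number handed out

$output = foreach ($line in $lines) {
    foreach ($rule in $rules) {
        $prefix = $rule.Prefix
        # The script block runs once for every match on the line
        $line = [regex]::Replace($line, $rule.Pattern, {
            param($m)
            $value = $m.Value
            if (-not $map.Contains($value)) {
                # First time this value is seen: hand out the next number
                $counters[$prefix] = 1 + [int]$counters[$prefix]
                $map[$value] = "$prefix-$($counters[$prefix])"
            }
            $map[$value]   # a value seen before reuses its replacement
        })
    }
    $line
}

# One "original : replacement" line per unique value
$pairs = foreach ($key in $map.Keys) { "$key : $($map[$key])" }

$output
$pairs
```

To use it on the real file, read the lines with Get-Content $log and pipe $output and $pairs to Set-Content for $log and $tmp. Because the pairs are written once at the end from $map, each original value appears in $tmp only once.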
