Windows PowerShell: How do I parse a log file?

Problem description

I have an input file with the following content:

27/08/2020  02:47:37.365 (-0516)  hostname12    ult_licesrv       ULT  5  LiceSrv Main[108                    00000  Session 'session1' (from 'vmpms1\app1@pmc21app20.pm.com') request for 1 additional licenses for module 'SA-XT' - 1 licenses have been allocated by concurrent usage category 'Unlimited' (session module usage now 1, session category usage now 1, total module concurrent usage now 1, total category usage now 1)
27/08/2020  02:47:37.600 (-0516)  hostname13    ult_licesrv       ULT  5  LiceSrv Main[108                    00000  Session 'sssion2' (from 'vmpms2\app1@pmc21app20.pm.com') request for 1 additional licenses for module 'SA-XT-Read' - 1 licenses have been allocated by concurrent usage category 'Floating' (session module usage now 2, session category usage now 2, total module concurrent usage now 1, total category usage now 1)
27/08/2020  02:47:37.115 (-0516)  hostname141    ult_licesrv       CMN  5  Logging Housekee                    00000  Deleting old log file 'C:\Program Files\PMCOM Global\License Server\diag_ult_licesrv_20200824_011130.log.gz' as it exceeds the purge threashold of 72 hours
27/08/2020  02:47:37.115 (-0516)  hostname141    ult_licesrv       CMN  5  Logging Housekee                    00000  Deleting old log file 'C:\Program Files\PMCOM Global\License Server\diag_ult_licesrv_20200824_021310.log.gz' as it exceeds the purge threashold of 72 hours
27/08/2020  02:47:37.625 (-0516)  hostname150    ult_licesrv       ULT  5  LiceSrv Main[108                    00000  Session 'session1' (from 'vmpms1\app1@pmc21app20.pm.com') request for 1 additional licenses for module 'SA-XT' - 1 licenses have been allocated by concurrent usage category 'Unlimited' (session module usage now 2, session category usage now 1, total module concurrent usage now 2, total category usage now 1)

I need to generate an output file like this:

Date,time,hostname,session_module_usage,session_category_usage,module_concurrent_usage,total_category_usage
27/08/2020,02:47:37.365 (-0516),hostname12,1,1,1,1
27/08/2020,02:47:37.600 (-0516),hostname13,2,2,1,1
27/08/2020,02:47:37.115 (-0516),hostname141,0,0,0,0
27/08/2020,02:47:37.115 (-0516),hostname141,0,0,0,0
27/08/2020,02:47:37.625 (-0516),hostname150,2,1,2,1

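The output columns are, in order: date, time, hostname, session_module_usage, session_category_usage, module_concurrent_usage, total_category_usage.

If a line has no such entries, put 0,0,0,0 for session_module_usage, session_category_usage, module_concurrent_usage, and total_category_usage.

I need to read the content from the input file and write the output to another file.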

Update

I created a file input.txt on the F: drive and pasted the log details into it. Then I built an array by splitting the file content on newlines, as follows.

$myList = (Get-Content -Path F:\input.txt) -split '\n'

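Now my array myList has 5 items. Next I collapse each run of whitespace into a single space and split each element on spaces to form a new array. Then I print array elements 0 through 3. What I still need to add are the final values (session_module_usage, session_category_usage, module_concurrent_usage, total_category_usage).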

PS C:\Users\user> $myList = (Get-Content -Path F:\input.txt) -split '\n'
PS C:\Users\user> $myList.Length
5
PS C:\Users\user> for ($i = 0; $i -le ($myList.length - 1); $i += 1) {
>> $newList = ($myList[$i] -replace '\s+', ' ') -split ' '
>> $newList[0]+','+$newList[1]+' '+$newList[2]+','+$newList[3]
>>  }
27/08/2020,02:47:37.365 (-0516),hostname12
27/08/2020,02:47:37.600 (-0516),hostname13
27/08/2020,02:47:37.115 (-0516),hostname141
27/08/2020,02:47:37.115 (-0516),hostname141
27/08/2020,02:47:37.625 (-0516),hostname150

Tags: powershell

Solution


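If you really need to extract values at the granularity you describe, you will probably need regular expressions to filter the lines.

This assumes that every line carries the same label text immediately before each value you are looking for (for example, "session module usage now "), so keep that in mind.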
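To make that assumption concrete, here is a minimal sketch using one shortened sample line. The `$sample`, `$m`, and `$miss` variables are introduced here for illustration only. The lookbehind `(?<=...)` requires the literal label but leaves it out of the match, so the match is just the digits that follow; on a line without the label the match fails and `Value` is an empty string.

```powershell
# $sample is a hypothetical, shortened log line used only for illustration.
$sample = "27/08/2020  02:47:37.365 (-0516)  hostname12  (session module usage now 1, session category usage now 1)"

# The lookbehind requires the label text but excludes it from the match.
$m = [regex]::Match($sample, '(?<=session module usage now )\d+')
$m.Success   # True
$m.Value     # 1

# A line without the label gives an unsuccessful match with an empty Value.
$miss = [regex]::Match('Deleting old log file', '(?<=session module usage now )\d+')
$miss.Success   # False
```

If the label wording ever changes in the log, these patterns silently stop matching and the value falls back to the default, so it is worth spot-checking the output.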

# Collect one object per log line.
[System.Collections.ArrayList]$filteredRows = @()
$log = Get-Content -Path C:\logfile.log
foreach ($row in $log) {
    # Date, time with UTC offset, and hostname come from the start of the line.
    $date = ([regex]::Match($row,'^\d+\/\d+\/\d+')).Value
    $time = ([regex]::Match($row,'\d+:\d+:\d+\.\d+\s\(\S+\)')).Value
    $hostname = ([regex]::Match($row,'(?<=\d\d\d\d\)  )\w+')).Value
    # Each usage counter follows a fixed label; \d+ allows multi-digit values.
    # When the label is absent the match is empty, so default to 0.
    $sessionModuleUsage = ([regex]::Match($row,'(?<=session module usage now )\d+')).Value
    if (!$sessionModuleUsage) {
        $sessionModuleUsage = 0
    }
    $sessionCategoryUsage = ([regex]::Match($row,'(?<=session category usage now )\d+')).Value
    if (!$sessionCategoryUsage) {
        $sessionCategoryUsage = 0
    }
    $moduleConcurrentUsage = ([regex]::Match($row,'(?<=total module concurrent usage now )\d+')).Value
    if (!$moduleConcurrentUsage) {
        $moduleConcurrentUsage = 0
    }
    $totalCategoryUsage = ([regex]::Match($row,'(?<=total category usage now )\d+')).Value
    if (!$totalCategoryUsage) {
        $totalCategoryUsage = 0
    }
    # An ordered hashtable keeps the CSV columns in the requested order.
    $hash = [ordered]@{
        Date = $date
        time = $time
        hostname = $hostname
        session_module_usage = $sessionModuleUsage
        session_category_usage = $sessionCategoryUsage
        module_concurrent_usage = $moduleConcurrentUsage
        total_category_usage = $totalCategoryUsage
    }
    $rowData = New-Object -TypeName 'psobject' -Property $hash
    $filteredRows.Add($rowData) > $null
}
# ConvertTo-Csv quotes every field; strip the quotes to match the requested output.
$csv = $filteredRows | ConvertTo-Csv -NoTypeInformation -Delimiter "," | ForEach-Object {$_ -replace '"',''}
$csv | Out-File C:\results.csv

Essentially, what needs to happen is that we Get-Content the log, which returns an array in which each item is one line of the file.
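As a small illustration of that behavior, the sketch below writes three lines to a temporary file and reads them back. The file name `gc-demo.txt` and the variable names are made up for this example. Because Get-Content already yields one string per line, the extra `-split '\n'` used in the question is not needed.

```powershell
# Hypothetical scratch file in the temp directory, created only for this demo.
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'gc-demo.txt'
Set-Content -Path $path -Value 'line one', 'line two', 'line three'

# Get-Content returns one string per line, without the line terminators.
$lines = Get-Content -Path $path
$lines.Count   # 3
$lines[1]      # line two

Remove-Item -Path $path
```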

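Once we have the lines, we pull the values out with regular expressions. Because you want zeros for the lines where those values are absent, I have if statements that assign 0 whenever a regex returns nothing.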
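The match-or-default step repeats four times, so it could be folded into a small helper. The sketch below is one possible way to do that; `Get-UsageValue` and its parameters are hypothetical names, not part of the answer above, and it uses a capture group rather than a lookbehind.

```powershell
# Get-UsageValue is a hypothetical helper, not part of the original answer.
function Get-UsageValue {
    param(
        [string]$Line,   # one log line
        [string]$Label   # literal text that precedes the number
    )
    # Escape the label so it is matched literally, then capture the digits after it.
    $m = [regex]::Match($Line, [regex]::Escape($Label) + '\s*(\d+)')
    if ($m.Success) { [int]$m.Groups[1].Value } else { 0 }
}

Get-UsageValue -Line 'session module usage now 12, ...' -Label 'session module usage now'   # 12
Get-UsageValue -Line 'Deleting old log file' -Label 'session module usage now'              # 0
```

Returning an `[int]` instead of a string keeps the default 0 and the parsed values the same type, which matters if the numbers are later summed or compared.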

Finally, we put each line's filtered values into a PSObject and, on each iteration, append that object to an array of objects.

Then we export to CSV.
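For reference, here is a minimal sketch of that last step on two made-up rows. `$rows` stands in for the collected objects and has fewer columns than the real output. ConvertTo-Csv quotes every field by default, and stripping the quotes is only safe here because none of the values contains a comma or a double quote.

```powershell
# Hypothetical sample rows standing in for the objects built from the log.
$rows = @(
    [pscustomobject]@{ Date = '27/08/2020'; time = '02:47:37.365 (-0516)'; hostname = 'hostname12'; session_module_usage = 1 }
    [pscustomobject]@{ Date = '27/08/2020'; time = '02:47:37.115 (-0516)'; hostname = 'hostname141'; session_module_usage = 0 }
)

# Convert to CSV text, then remove the double quotes around each field.
$csv = $rows | ConvertTo-Csv -NoTypeInformation | ForEach-Object { $_ -replace '"', '' }
$csv
# Date,time,hostname,session_module_usage
# 27/08/2020,02:47:37.365 (-0516),hostname12,1
# 27/08/2020,02:47:37.115 (-0516),hostname141,0
```

If a field could ever contain a comma, it would be safer to leave the quotes in place and write the file with Export-Csv instead.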

