Windows PowerShell: How to parse a log file



I have an input file with the following content:

27/08/2020  02:47:37.365 (-0516)  hostname12    ult_licesrv       ULT  5  LiceSrv Main[108                    00000  Session 'session1' (from 'vmpms1app1@pmc21app20.pm.com') request for 1 additional licenses for module 'SA-XT' - 1 licenses have been allocated by concurrent usage category 'Unlimited' (session module usage now 1, session category usage now 1, total module concurrent usage now 1, total category usage now 1)
27/08/2020  02:47:37.600 (-0516)  hostname13    ult_licesrv       ULT  5  LiceSrv Main[108                    00000  Session 'sssion2' (from 'vmpms2app1@pmc21app20.pm.com') request for 1 additional licenses for module 'SA-XT-Read' - 1 licenses have been allocated by concurrent usage category 'Floating' (session module usage now 2, session category usage now 2, total module concurrent usage now 1, total category usage now 1)
27/08/2020  02:47:37.115 (-0516)  hostname141    ult_licesrv       CMN  5  Logging Housekee                    00000  Deleting old log file 'C:Program FilesPMCOM GlobalLicense Serverdiag_ult_licesrv_20200824_011130.log.gz' as it exceeds the purge threashold of 72 hours
27/08/2020  02:47:37.115 (-0516)  hostname141    ult_licesrv       CMN  5  Logging Housekee                    00000  Deleting old log file 'C:Program FilesPMCOM GlobalLicense Serverdiag_ult_licesrv_20200824_021310.log.gz' as it exceeds the purge threashold of 72 hours
27/08/2020  02:47:37.625 (-0516)  hostname150    ult_licesrv       ULT  5  LiceSrv Main[108                    00000  Session 'session1' (from 'vmpms1app1@pmc21app20.pm.com') request for 1 additional licenses for module 'SA-XT' - 1 licenses have been allocated by concurrent usage category 'Unlimited' (session module usage now 2, session category usage now 1, total module concurrent usage now 2, total category usage now 1)

I need to generate an output file like this:

Date,time,hostname,session_module_usage,session_category_usage,module_concurrent_usage,total_category_usage
27/08/2020,02:47:37.365 (-0516),hostname12,1,1,1,1
27/08/2020,02:47:37.600 (-0516),hostname13,2,2,1,1
27/08/2020,02:47:37.115 (-0516),hostname141,0,0,0,0
27/08/2020,02:47:37.115 (-0516),hostname141,0,0,0,0
27/08/2020,02:47:37.625 (-0516),hostname150,2,1,2,1

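The output columns are, in order: date, time, hostname, session_module_usage, session_category_usage, module_concurrent_usage, and total_category_usage.

If a line has no entries for session_module_usage, session_category_usage, module_concurrent_usage, and total_category_usage, put 0,0,0,0 instead.

I need to read the content from the input file and write the output to another file.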

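Update

I created an input.txt file on the F: drive and pasted the log details into it. Then I formed an array by splitting the file content on newlines, as shown below.

$myList = (Get-Content -Path F:\input.txt) -split '\n'

Now my array myList has 5 items. Next, I replace runs of whitespace with a single space and split each element on spaces to form a new array. Then I print array elements 0 to 3. Now I need to append the trailing values (session_module_usage, session_category_usage, module_concurrent_usage, and total_category_usage).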

PS C:\Users\user> $myList = (Get-Content -Path F:\input.txt) -split '\n'
PS C:\Users\user> $myList.Length
5
PS C:\Users\user> for ($i = 0; $i -le ($myList.length - 1); $i += 1) {
>> $newList = ($myList[$i] -replace '\s+', ' ') -split ' '
>> $newList[0]+','+$newList[1]+' '+$newList[2]+','+$newList[3]
>>  }
27/08/2020,02:47:37.365 (-0516),hostname12
27/08/2020,02:47:37.600 (-0516),hostname13
27/08/2020,02:47:37.115 (-0516),hostname141
27/08/2020,02:47:37.115 (-0516),hostname141
27/08/2020,02:47:37.625 (-0516),hostname150

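If you really need to filter at the granularity you describe, you will probably want to use regex to pick the values out of each line.

This assumes that every line has the same marker text in front of the values you are looking for, so keep that in mind.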

[System.Collections.ArrayList]$filteredRows = @()
$log = Get-Content -Path C:\logfile.log
foreach ($row in $log) {
    # Date, time and hostname from the start of the line
    $date = ([regex]::Match($row, '^\d+/\d+/\d+')).Value
    $time = ([regex]::Match($row, '\d+:\d+:\d+\.\d+\s\(\S+\)')).Value
    $hostname = ([regex]::Match($row, '(?<=\d\d\d\d\)  )\w+')).Value
    # Usage counters: the lookbehind anchors on the marker text,
    # and a missing match falls back to 0
    $sessionModuleUsage = ([regex]::Match($row, '(?<=session module usage now )\d+')).Value
    if (!$sessionModuleUsage) {
        $sessionModuleUsage = 0
    }
    $sessionCategoryUsage = ([regex]::Match($row, '(?<=session category usage now )\d+')).Value
    if (!$sessionCategoryUsage) {
        $sessionCategoryUsage = 0
    }
    $moduleConcurrentUsage = ([regex]::Match($row, '(?<=total module concurrent usage now )\d+')).Value
    if (!$moduleConcurrentUsage) {
        $moduleConcurrentUsage = 0
    }
    $totalCategoryUsage = ([regex]::Match($row, '(?<=total category usage now )\d+')).Value
    if (!$totalCategoryUsage) {
        $totalCategoryUsage = 0
    }
    # Ordered hashtable keeps the columns in the requested order
    $hash = [ordered]@{
        Date = $date
        time = $time
        hostname = $hostname
        session_module_usage = $sessionModuleUsage
        session_category_usage = $sessionCategoryUsage
        module_concurrent_usage = $moduleConcurrentUsage
        total_category_usage = $totalCategoryUsage
    }
    $rowData = New-Object -TypeName 'psobject' -Property $hash
    $filteredRows.Add($rowData) > $null
}
# Convert to CSV and strip the quotes that ConvertTo-Csv adds
$csv = $filteredRows | ConvertTo-Csv -NoTypeInformation -Delimiter "," | ForEach-Object { $_ -replace '"','' }
$csv | Out-File C:\results.csv

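Essentially, we use Get-Content to read the log, which returns an array with one item per line.

Once we have the rows, we grab the values via regex. Because you want zeros in some fields when the values are absent, I have if statements that assign 0 whenever a regex returns nothing.

Finally, we put each row's extracted values into a PSObject and append that object to an array of objects on every iteration.

Then we export to CSV.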
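The lookbehind-plus-default step described above can be tried on single lines in isolation. This is only a minimal sketch and not part of the answer's script: the helper name `Get-UsageValue` and the two shortened sample strings are assumptions made up for illustration.

```powershell
# Hypothetical helper (the name is an assumption): return the number that
# follows a marker text, or 0 when the marker is absent from the line.
function Get-UsageValue {
    param(
        [string]$Line,
        [string]$Marker
    )
    # Escape the marker so it is matched literally inside the lookbehind
    $pattern = '(?<=' + [regex]::Escape($Marker) + ' )\d+'
    $match = [regex]::Match($Line, $pattern)
    if ($match.Success) { [int]$match.Value } else { 0 }
}

# Shortened sample lines, made up for illustration
$withUsage    = '(session module usage now 12, session category usage now 1)'
$withoutUsage = 'Deleting old log file as it exceeds the purge threshold'

$a = Get-UsageValue -Line $withUsage    -Marker 'session module usage now'
$b = Get-UsageValue -Line $withoutUsage -Marker 'session module usage now'
"$a,$b"
```

Using `\d+` rather than a single `\d` matters once a counter reaches two digits, which is why the first sample uses 12.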

You can probably split the lines fairly easily with a regular expression and substrings. Basically something like this:

# Iterate over the lines of the input file
Get-Content F:\input.txt |
  ForEach-Object {
    # Extract the individual fields (fixed column positions)
    $Date = $_.Substring(0, 10)
    $Time = $_.Substring(12, $_.IndexOf(')') - 11)
    $Hostname = $_.Substring(34, $_.IndexOf(' ', 34) - 34)
    # Default the usage counters to 0 for lines that do not carry them
    $session_module_usage = 0
    $session_category_usage  = 0
    $module_concurrent_usage = 0
    $total_category_usage = 0
    if ($_ -match 'session module usage now (\d+), session category usage now (\d+), total module concurrent usage now (\d+), total category usage now (\d+)') {
      $session_module_usage = $Matches[1]
      $session_category_usage  = $Matches[2]
      $module_concurrent_usage = $Matches[3]
      $total_category_usage = $Matches[4]
    }
    # Create custom object with those properties
    New-Object PSObject -Property @{
      Date = $Date
      time = $Time
      hostname = $Hostname
      session_module_usage = $session_module_usage
      session_category_usage = $session_category_usage
      module_concurrent_usage = $module_concurrent_usage
      total_category_usage = $total_category_usage
    }
  } |
  # Ensure column order in output
  Select-Object Date,time,hostname,session_module_usage,session_category_usage,module_concurrent_usage,total_category_usage |
  # Write as CSV - without quotes
  ConvertTo-Csv -NoTypeInformation |
  ForEach-Object { $_ -replace '"' } |
  Out-File F:\output.csv

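Whether you extract date, time, and hostname from the line with substrings or with a regular expression is probably a matter of taste. The same goes for how strictly the pattern has to match, although for me that mostly depends on how rigid the log format itself is. For more free-form input, where different lines match different regular expressions or several lines make up one record, I also quite like switch -Regex for iterating over the lines.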
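As a rough illustration of the switch -Regex idea, the sketch below classifies lines by whichever pattern matches first. It is not the answerer's code: the sample lines are shortened and made up, the property names (`Kind`, `SessionModuleUsage`, `SessionCategoryUsage`) are hypothetical, and `[pscustomobject]` assumes PowerShell 3.0 or later.

```powershell
# Shortened sample lines, made up for illustration
$lines = @(
    "27/08/2020  02:47:37.365 (-0516)  hostname12  (session module usage now 1, session category usage now 1)",
    "27/08/2020  02:47:37.115 (-0516)  hostname141  Deleting old log file"
)

# switch -Regex tests every element of $lines against each pattern in turn;
# 'continue' skips the remaining patterns and moves on to the next line.
$records = switch -Regex ($lines) {
    # Lines that carry usage counters; $Matches holds the capture groups
    'session module usage now (\d+), session category usage now (\d+)' {
        [pscustomobject]@{
            Kind                 = 'usage'
            SessionModuleUsage   = [int]$Matches[1]
            SessionCategoryUsage = [int]$Matches[2]
        }
        continue
    }
    # Housekeeping lines without counters
    'Deleting old log file' {
        [pscustomobject]@{
            Kind                 = 'housekeeping'
            SessionModuleUsage   = 0
            SessionCategoryUsage = 0
        }
        continue
    }
    # Anything that matched none of the patterns above
    default {
        [pscustomobject]@{
            Kind                 = 'other'
            SessionModuleUsage   = 0
            SessionCategoryUsage = 0
        }
    }
}

$records | Format-Table -AutoSize
```

To read straight from a file instead of an in-memory array, switch also accepts a -File parameter followed by the path.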
