PowerShell text search - multiple matches



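I have a set of .txt files that each contain one or two of the following strings.

"red", "blue", "green", "orange", "purple", ... there are many more (50+) possibilities in the list.

If it helps, I can tell whether a .txt file contains one item or two, but not which one(s). The string pattern is always on a line of its own.

I want the script to tell me specifically which one or two strings from the master list it matched, and the order in which it found them (which one came first).

Since I have a lot of text files to search, I would like to write the results to a CSV file as I search.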

FILENAME1,first_match,second_match
file1.txt,blue,red
file2.txt,red, blue
file3.txt,orange,
file4.txt,purple,red
file5.txt,purple,
...

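I have tried using many individual Select-String calls that return Boolean results to set variables for any matches found, but with this many possible strings it gets ugly fast. My searches on this problem have not given me any new ideas to try. (I am sure I am not asking in the right way.)

Do I need to loop through every line of text in every file?

Am I stuck with a process of elimination, checking for the existence of each search string?

I am looking for a more elegant way to approach this problem (if one exists).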

Not very intuitive, but elegant...

The following switch statement

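# One alternation with a capture group; switch -Regex fills $Matches for each matching line
$regex = "(purple|blue|red)"
Get-ChildItem $env:TEMP\test\*.txt | ForEach-Object {
    # Start the output line with the full path of the file
    $result = $_.FullName
    # switch -File reads the file line by line, so matches are appended in file order
    switch -Regex -File $_.FullName
    {
        $regex { $result = "$($result),$($Matches[1])" }
    }
    $result
}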

returns

C:\Users\Lieven Keersmaekers\AppData\Local\Temp\test\file1.txt,blue,red
C:\Users\Lieven Keersmaekers\AppData\Local\Temp\test\file2.txt,red,blue

where

  • file1 contains blue first, then red
  • file2 contains red first, then blue
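
To get from these comma-joined lines to the CSV layout asked for in the question, the same switch technique could be wrapped in a small function. What follows is only a sketch under assumptions: the function name Get-OrderedMatch, its parameters, and the FileName column are made up for illustration, the regex is anchored because the question says each pattern sits on a line of its own, and only the first two matches are kept.

```powershell
# Hypothetical helper (not from the original answer): one output row per file,
# holding the first two matches in the order they appear in the file.
function Get-OrderedMatch {
    param(
        [Parameter(Mandatory)] [string]   $Path,
        [Parameter(Mandatory)] [string[]] $Words
    )

    # Escape each word and build an anchored alternation, e.g. ^\s*(red|blue)\s*$
    $escaped = $Words | ForEach-Object { [regex]::Escape($_) }
    $regex   = '^\s*(' + ($escaped -join '|') + ')\s*$'

    $found = [System.Collections.Generic.List[string]]::new()

    # switch -File reads the file line by line; each matching line fills $Matches
    switch -Regex -File $Path {
        $regex { $found.Add($Matches[1]) }
    }

    [pscustomobject]@{
        FileName     = Split-Path -Path $Path -Leaf
        first_match  = if ($found.Count -gt 0) { $found[0] } else { '' }
        second_match = if ($found.Count -gt 1) { $found[1] } else { '' }
    }
}

# Example call (the folder and output file are assumptions):
# $words = 'red', 'blue', 'green', 'orange', 'purple'
# Get-ChildItem -Path "$env:TEMP\test\*.txt" |
#     ForEach-Object { Get-OrderedMatch -Path $_.FullName -Words $words } |
#     Export-Csv -NoTypeInformation -Path Results.csv -Encoding UTF8
```

Returning objects instead of a hand-built string lets Export-Csv handle the header and quoting. If the same word can occur twice in a file, both occurrences are recorded, which may or may not be what is wanted.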

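You can search with a regex to get the index (the start position within the line). Combine that with Select-String, which returns the line number, and you are good to go.

Select-String supports an array as the value of -Pattern, but unfortunately it stops on a line after the first match, even when you use -AllMatches (a bug?). So we have to search once per word/pattern. Try: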

# List of words. Had to escape them because Select-String doesn't return Matches-objects (with Index/location) for SimpleMatch
$words = "purple","blue","red" | ForEach-Object { [regex]::Escape($_) }
# Can also use a list with word/sentence per line using $words = Get-Content patterns.txt | % { [regex]::Escape($_.Trim()) }
# Get all files to search
Get-ChildItem -Filter "test.txt" -Recurse | ForEach-Object {
    # Has to loop words because Select-String -Pattern "blue","red" won't return match for both pattern. It stops on a line after first match
    foreach ($word in $words) {
        $_ | Select-String -Pattern $word |
            # Select the properties we care about
            Select-Object Path, Line, Pattern, LineNumber, @{n="Index";e={$_.Matches[0].Index}}
    }
} |
# Sort by File (to keep file-matches together), then LineNumber and Index to get the order of matches
Sort-Object Path, LineNumber, Index |
Export-Csv -NoTypeInformation -Path Results.csv -Encoding UTF8

Results.csv

"Path","Line","Pattern","LineNumber","Index"
"C:\Users\frode\Downloads\test.txt","file1.txt,blue,red","blue","3","10"
"C:\Users\frode\Downloads\test.txt","file1.txt,blue,red","red","3","15"
"C:\Users\frode\Downloads\test.txt","file2.txt,red, blue","red","4","10"
"C:\Users\frode\Downloads\test.txt","file2.txt,red, blue","blue","4","15"
"C:\Users\frode\Downloads\test.txt","file4.txt,purple,red","purple","6","10"
"C:\Users\frode\Downloads\test.txt","file4.txt,purple,red","red","6","17"
"C:\Users\frode\Downloads\test.txt","file5.txt,purple,","purple","7","10"

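A possible variation on the per-word loop is to join the escaped words into a single alternation and let -AllMatches report every hit on a line in one pass. The sketch below assumes that approach; the function name Get-WordHit, its parameters, and the Word column are hypothetical and not part of the answer above. Note that with overlapping words (say "red" and "reddish") the order of the alternation decides which one is reported.

```powershell
# Hypothetical helper: one pass per file with a combined pattern.
function Get-WordHit {
    param(
        [Parameter(Mandatory)] [string[]] $Path,
        [Parameter(Mandatory)] [string[]] $Words
    )

    # Escape each word, then join into a single regex such as purple|blue|red
    $pattern = ($Words | ForEach-Object { [regex]::Escape($_) }) -join '|'

    # With one pattern and -AllMatches, Matches holds every hit on the line
    Select-String -Path $Path -Pattern $pattern -AllMatches | ForEach-Object {
        $hit = $_
        foreach ($m in $hit.Matches) {
            [pscustomobject]@{
                Path       = $hit.Path
                LineNumber = $hit.LineNumber
                Index      = $m.Index
                Word       = $m.Value
            }
        }
    }
}

# Example call (file names are assumptions):
# Get-WordHit -Path (Get-ChildItem -Filter "test.txt" -Recurse).FullName -Words "purple","blue","red" |
#     Sort-Object Path, LineNumber, Index |
#     Export-Csv -NoTypeInformation -Path Results.csv -Encoding UTF8
```

Each file is read once instead of once per word, which may matter with 50+ words and many files.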