How to find a value based on the result of a first match on another line (regex)



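Updated question:

I'm opening a new question thread. My previous questions were about deleting lines after a match that ends with a specific letter, and about using a regex to select a block of text when a specific line matches. Unlike those questions, here I need to select a line based on the result of a match on another line.

I need help selecting the block of text that starts at the SCHEDULE line, matching the last word of that line, and then finding inside the block the word that starts with that last word and ends with the letter E. The SCHEDULE line always contains a # sign.

The block of text starts at the SCHEDULE line and ends at the END line (the block is copied to another file).

In every case the SCHEDULE line contains a # sign: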

SCHEDULE MANAGER_XA#KGDIVAGBLR 
or
SCHEDULE MANAGER_XA#KGICROBLR_2 
or
SCHEDULE MASTERAGENTS#KGICRO741_AABB
or
SCHEDULE MANAGER_XA#/XAAA/KAAA/KGICROZZZ
or
SCHEDULE MANAGER_XA#/XAAA/KAAA/KGFLABUR_4
or
SCHEDULE MASTERAGENTS#/KA0H/KA0HM00_FACT/KA0HM00_FACT 
END
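To make the "last word" rule concrete, here is a minimal PowerShell sketch that pulls that word out of each of the SCHEDULE lines listed above. The pattern in `$keyPattern` is my own assumption about the rule (take whatever follows the final `/` or `#`, and stop at the first underscore); it is not taken from the question.

```powershell
# Sample SCHEDULE lines copied from the examples above
$scheduleLines = @(
    'SCHEDULE MANAGER_XA#KGDIVAGBLR',
    'SCHEDULE MANAGER_XA#KGICROBLR_2',
    'SCHEDULE MASTERAGENTS#KGICRO741_AABB',
    'SCHEDULE MANAGER_XA#/XAAA/KAAA/KGICROZZZ',
    'SCHEDULE MANAGER_XA#/XAAA/KAAA/KGFLABUR_4',
    'SCHEDULE MASTERAGENTS#/KA0H/KA0HM00_FACT/KA0HM00_FACT'
)

# Assumed rule: the key is the text after the last '/' or '#',
# cut at the first underscore; an optional _suffix is ignored.
$keyPattern = '^SCHEDULE\s+\S*[/#]([^_/#\s]+)(?:_[^/#\s]*)?\s*$'

$keys = foreach ($line in $scheduleLines) {
    if ($line -match $keyPattern) {
        $Matches[1]   # capture group 1 holds the key
    }
}

$keys
```

With these six lines the keys should come out as KGDIVAGBLR, KGICROBLR, KGICRO741, KGICROZZZ, KGFLABUR and KA0HM00.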

For example, a block of text (it starts at the SCHEDULE line and finishes at the END line):

SCHEDULE MANAGER_XA#/XAAA/KAAA/KGICROZZZ
DESCRIPTION "Added by default."
:
S89COLENG2#/KG34/KG34G43CR3/KG34G1086
NOP
FOLLOWS KG34G1085
S89COLENG2#/KG34/KG34G43CR3/KGICROZZZE
FOLLOWS KG34G493
FOLLOWS KG34G522
S89COLENG2#/KG34/KG34G43CR3/KG34G1086
NOP
FOLLOWS KG34G1085
END

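Here the word KGICROZZZE is the match, because it starts with the last word of the SCHEDULE line and ends with the letter E.

If the last word of the SCHEDULE line ends with an underscore plus another word, as in KGFLABUR_4, the match uses the part before the underscore, so the block may contain KGFLABURE: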

SCHEDULE MANAGER_XA#/XAAA/KAAA/KGFLABUR_4
S89COLENG2#/KG34/KG34G43CR3/KGFLABURE
or 
S89COLENG2#/KG34/KG34G43CR3/KGFLABURE_4
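The underscore rule above could be sketched in PowerShell like this, working line by line on a single block. The sample block is assembled from the lines shown above, and the two patterns are my assumptions: the key is the last word before any underscore, and the line to find ends in that key plus E, optionally followed by an underscore suffix.

```powershell
# Hypothetical sample block assembled from the lines shown above
$block = @'
SCHEDULE MANAGER_XA#/XAAA/KAAA/KGFLABUR_4
DESCRIPTION "Added by default."
:
S89COLENG2#/KG34/KG34G43CR3/KG34G1086
NOP
FOLLOWS KG34G1085
S89COLENG2#/KG34/KG34G43CR3/KGFLABURE_4
FOLLOWS KG34G493
END
'@

# Split the block into lines (handles both LF and CRLF)
$lines = $block -split '\r?\n'

$key = $null
$jobLine = $null

# Step 1: take the last word of the SCHEDULE line, without any _suffix
if ($lines[0] -cmatch '^SCHEDULE\s+\S*[/#]([^_/#\s]+)(?:_[^/#\s]*)?\s*$') {
    $key = $Matches[1]

    # Step 2: look for a line whose last word is <key>E or <key>E_<suffix>
    $jobPattern = '^\S*[/#]' + [regex]::Escape($key) + 'E(?:_[^/#\s]*)?\s*$'
    $jobLine = $lines | Where-Object { $_ -cmatch $jobPattern }
}

$key
$jobLine
```

`-cmatch` is used so that only an uppercase E counts, and `[regex]::Escape` keeps any special characters in the key from being read as regex syntax.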

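I need two regexes:

  • One that identifies, within its related SCHEDULE block, the line that begins with the last word of the SCHEDULE line and ends with the letter E.

Example block of text: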

SCHEDULE MANAGER_XA#/XAAA/KAAA/KGICROZZZ
DESCRIPTION "Added by default."
:
S89COLENG2#/KG34/KG34G43CR3/KG34G1086
NOP
FOLLOWS KG34G1085

S89COLENG2#/KG34/KG34G43CR3/KGICROZZZE
FOLLOWS KG34G493
FOLLOWS KG34G522
NOP

S89COLENG2#/KG34/KG34G43CR3/KG34G1086
NOP
FOLLOWS KG34G1085

END

Or, in this example, the SCHEDULE line ends in KAAABBB_CCC, and the match uses the part before the underscore, KAAABBB:

SCHEDULE MANAGER_XA#/XAAA/KAAA/KAAABBB_CCC
DESCRIPTION "Added by default."
:
S89COLENG2#/KG34/KG34G43CR3/KG34G1086
NOP
FOLLOWS KG34G1085

S89COLENG2#/KG34/KG34G43CR3/KAAABBBE_CCC
FOLLOWS KG34G493
FOLLOWS KG34G522

S89COLENG2#/KG34/KG34G43CR3/KG34G1086
NOP
FOLLOWS KG34G1085

END


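  • One that identifies the blocks of text that have no line beginning with the last word of the SCHEDULE line and ending with the letter E.

Example block of text: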

SCHEDULE MANAGER_XA#/XAAA/KAAA/KXXXYYYY
DESCRIPTION "Added by default."
:
S89COLENG2#/KG34/KG34G43CR3/KG34G1086
NOP
FOLLOWS KG34G1085

S89COLENG2#/KG34/KG34G43CR3/KG34G1020
NOP
FOLLOWS KG34G1085

END
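For this second case, one possible approach is a single regex that walks the block line by line and refuses any line ending in the key plus E. The sketch below is only an illustration under my own assumptions about the rule (key is the last word before any underscore, and the E-line may carry an underscore suffix); the two sample blocks are shortened versions of the examples above.

```powershell
# Two hypothetical sample blocks: the first has a KGICROZZZE line, the second has no <key>E line
$text = @'
SCHEDULE MANAGER_XA#/XAAA/KAAA/KGICROZZZ
DESCRIPTION "Added by default."
:
S89COLENG2#/KG34/KG34G43CR3/KG34G1086
NOP
S89COLENG2#/KG34/KG34G43CR3/KGICROZZZE
FOLLOWS KG34G493
END
SCHEDULE MANAGER_XA#/XAAA/KAAA/KXXXYYYY
DESCRIPTION "Added by default."
:
S89COLENG2#/KG34/KG34G43CR3/KG34G1086
NOP
FOLLOWS KG34G1085
END
'@

# Part 1: the SCHEDULE line, capturing the key (last word, before any underscore) in group 1
# Part 2: any number of lines that are neither END nor a line ending in <key>E[_suffix]
# Part 3: the closing END line
$pattern = '(?m)^SCHEDULE[^\r\n]*[/#]([^_/#\s]+)(?:_[^/#\s]*)?[ \t]*\r?\n' +
           '(?:(?!END[ \t]*\r?$)(?![^\r\n]*[/#]\1E(?:_[^/#\s]*)?[ \t]*\r?$)[^\r\n]*\r?\n)*' +
           'END[ \t]*\r?$'

# [regex]::Matches is case-sensitive by default, so only an uppercase E counts
$unmatched = [regex]::Matches($text, $pattern)

$unmatched.Count
$unmatched[0].Groups[1].Value
```

On this sample only the KXXXYYYY block should be returned, because the first block is rejected as soon as the KGICROZZZE line is reached.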

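I'm sorry if the text is too long, but I had to include examples in order to explain myself better. I also tried to shorten it. If you need more information, please let me know and I will update the question.

Regards.

As I mentioned in the comments, here is an example (in PowerShell) of how I first get all the individual SCHEDULE <--> END blocks and then use a regex that checks for a match to split them into matched and not-matched groups.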

# Read the whole file into the $text variable as a single string (-Raw)
$text = Get-Content -Raw -Path c:\temp\powershell\schedule.log
# Use the regex class to find all SCHEDULE <--> END blocks in $text
$scheduleBlockMatches = [regex]::Matches($text, '(?sm)SCHEDULE.*?END')
# Define the matching pattern in a variable called $matchPattern
$matchPattern = '(?m)SCHEDULE.*[/#]([^_\n]+)(?=(?:[\n]|.)+\1E)(?:\n|.)+?(^.*?\1E.*)(?:\n|.)+?END'
# For each SCHEDULE <--> END block in $scheduleBlockMatches, use Where() to see if it matches the pattern.
# Passing 'Split' as the second argument to Where() returns both the true and the false sets,
# which are placed in the variables $matched and $notmatched
$matched, $notmatched = $scheduleBlockMatches.Value.Where({ $_ -match $matchPattern }, 'Split')
# Create a simple object to display the counts and the first example of each collection
[PSCustomObject]@{
    TotalLinesInLog     = ($text -split "`n").Count
    TotalScheduleBlocks = $scheduleBlockMatches.Count
    MatchedCount        = $matched.Count
    NotMatchedCount     = $notmatched.Count
    FirstMatched        = $matched[0]
    FirstNotMatched     = $notmatched[0]
}

The output of the custom object looks like this:

TotalLinesInLog     : 228696
TotalScheduleBlocks : 15120
MatchedCount        : 9450
NotMatchedCount     : 5670
FirstMatched        : SCHEDULE MASTERAGENTS#KA96G01
DESCRIPTION "Added by composer."
:
S89COLENG2#/KA96/KA96G01/KA96G065
FOLLOWS KA96G030
S89COLENG2#/KA96/KA96G01/KA96G01E
FOLLOWS KA96G036
FOLLOWS KA96G038
MASTERAGENTS#SBP_KA96G114_KA96G09_KA96G112
FOLLOWS KA96G114
END
FirstNotMatched     : SCHEDULE MASTERAGENTS#KA96GAA_5
DESCRIPTION "Added by composer."
:
S89COLENG2#/KA96/KA96G02/KA96G091
FOLLOWS KA96G090

S89COLENG2#/KA96/KA96G02/KA96G096
FOLLOWS KA96G060
END
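If you also want the matching line itself rather than just the whole block, the second capture group of the pattern is the line that contains the key followed by E. A minimal sketch, using the FirstMatched block from the output above as sample input (the variable names `$key` and `$jobLine` are just for illustration):

```powershell
# Sample block copied from the FirstMatched output above
$block = @'
SCHEDULE MASTERAGENTS#KA96G01
DESCRIPTION "Added by composer."
:
S89COLENG2#/KA96/KA96G01/KA96G065
FOLLOWS KA96G030
S89COLENG2#/KA96/KA96G01/KA96G01E
FOLLOWS KA96G036
FOLLOWS KA96G038
MASTERAGENTS#SBP_KA96G114_KA96G09_KA96G112
FOLLOWS KA96G114
END
'@

# Same pattern as in the script above
$matchPattern = '(?m)SCHEDULE.*[/#]([^_\n]+)(?=(?:[\n]|.)+\1E)(?:\n|.)+?(^.*?\1E.*)(?:\n|.)+?END'

$key = $null
$jobLine = $null

$m = [regex]::Match($block, $matchPattern)
if ($m.Success) {
    # Group 1 is the last word of the SCHEDULE line (up to any underscore)
    $key = $m.Groups[1].Value.Trim()
    # Group 2 is the whole line that contains <key>E; Trim() drops a trailing CR if present
    $jobLine = $m.Groups[2].Value.Trim()
}

$key
$jobLine
```

Note that `[regex]::Match` is case-sensitive by default, whereas `-match` is not, so the two can differ if the log mixes upper and lower case.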
