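Updated question:
I am opening a new question thread. My previous questions were about deleting lines after a match that ends with a specific letter, and about using a regex to select a block of text when a specific line matches. Unlike those questions, here I need to select a line based on the match result from another line.
I need help selecting the block of text that begins at the SCHEDULE line, matching the last word of that line, and then finding that word again inside the block, this time ending with the letter E. The line that contains it always has a # symbol.
The block of text starts at the SCHEDULE line and ends at the END line (it will be copied to another file).
The SCHEDULE line always has a # symbol: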
SCHEDULE MANAGER_XA#KGDIVAGBLR
or
SCHEDULE MANAGER_XA#KGICROBLR_2
or
SCHEDULE MASTERAGENTS#KGICRO741_AABB
or
SCHEDULE MANAGER_XA#/XAAA/KAAA/KGICROZZZ
or
SCHEDULE MANAGER_XA#/XAAA/KAAA/KGFLABUR_4
or
SCHEDULE MASTERAGENTS#/KA0H/KA0HM00_FACT/KA0HM00_FACT
END
For example, a block of text (starting at the SCHEDULE line and finishing at the END line):
SCHEDULE MANAGER_XA#/XAAA/KAAA/KGICROZZZ
DESCRIPTION "Added by default."
:
S89COLENG2#/KG34/KG34G43CR3/KG34G1086
NOP
FOLLOWS KG34G1085
S89COLENG2#/KG34/KG34G43CR3/KGICROZZZE
FOLLOWS KG34G493
FOLLOWS KG34G522
S89COLENG2#/KG34/KG34G43CR3/KG34G1086
NOP
FOLLOWS KG34G1085
END
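The word KGICROZZZE is a match because it starts with the last word of the SCHEDULE line and ends with the letter E.
If the last word in the SCHEDULE line ends with an underscore plus another word, as in KGFLABUR_4, the match uses the part before the underscore, so KGFLABURE can be found in the block of text: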
SCHEDULE MANAGER_XA#/XAAA/KAAA/KGFLABUR_4
S89COLENG2#/KG34/KG34G43CR3/KGFLABURE
or
S89COLENG2#/KG34/KG34G43CR3/KGFLABURE_4
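I need 2 regexes:
- One that identifies, within the related SCHEDULE block of text, the line that starts with the name of the last word in the SCHEDULE line and ends with the letter E.
Example blocks of text follow: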
SCHEDULE MANAGER_XA#/XAAA/KAAA/KGICROZZZ
DESCRIPTION "Added by default."
:
S89COLENG2#/KG34/KG34G43CR3/KG34G1086
NOP
FOLLOWS KG34G1085
S89COLENG2#/KG34/KG34G43CR3/KGICROZZZE
FOLLOWS KG34G493
FOLLOWS KG34G522
NOP
S89COLENG2#/KG34/KG34G43CR3/KG34G1086
NOP
FOLLOWS KG34G1085
END
Or, in this example, the SCHEDULE line ends in KAAABBB_CCC and the match uses KAAABBB, the part before the underscore:
SCHEDULE MANAGER_XA#/XAAA/KAAA/KAAABBB_CCC
DESCRIPTION "Added by default."
:
S89COLENG2#/KG34/KG34G43CR3/KG34G1086
NOP
FOLLOWS KG34G1085
S89COLENG2#/KG34/KG34G43CR3/KAAABBBE_CCC
FOLLOWS KG34G493
FOLLOWS KG34G522
S89COLENG2#/KG34/KG34G43CR3/KG34G1086
NOP
FOLLOWS KG34G1085
END
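- One that identifies the blocks of text that have no line starting with the name of the last word in the SCHEDULE line and ending with the letter E.
An example block of text follows: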
SCHEDULE MANAGER_XA#/XAAA/KAAA/KXXXYYYY
DESCRIPTION "Added by default."
:
S89COLENG2#/KG34/KG34G43CR3/KG34G1086
NOP
FOLLOWS KG34G1085
S89COLENG2#/KG34/KG34G43CR3/KG34G1020
NOP
FOLLOWS KG34G1085
END
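I am sorry if the text is too long, but I had to include examples to explain myself better. I also tried to shorten it. If you need more information, please let me know and I will update the question.
Regards.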
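As I mentioned in the comments, here is an example (in PowerShell) of how I first get all of the individual SCHEDULE <--> END blocks and then use a regular expression that checks for a match to split them into matched and not-matched groups.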
# Read the whole file into the $text variable as a single string
$text = Get-Content -Raw -Path c:\temp\powershell\schedule.log
# Use the regex class to find all SCHEDULE <--> END blocks in $text
$scheduleBlockMatches = [regex]::Matches($text, '(?sm)SCHEDULE.*?END')
# Define the matching pattern in a variable called $matchPattern
$matchPattern = '(?m)SCHEDULE.*[/#]([^_\n]+)(?=(?:[\n]|.)+\1E)(?:\n|.)+?(^.*?\1E.*)(?:\n|.)+?END'
# For each SCHEDULE <--> END block in $scheduleBlockMatches use Where() to see if it matches the pattern.
# Specifying 'split' as an argument to Where() gives us both the true and the false sets,
# which are placed in our specified variables $matched and $notmatched
$matched, $notmatched = $scheduleBlockMatches.Value.Where({ $_ -match $matchPattern }, 'split')
# Create a simple object to display the counts and the first example of each collection
[PSCustomObject]@{
    TotalLinesInLog     = ($text -split '\n').Count
    TotalScheduleBlocks = $scheduleBlockMatches.Count
    MatchedCount        = $matched.Count
    NotMatchedCount     = $notmatched.Count
    FirstMatched        = $matched[0]
    FirstNotMatched     = $notmatched[0]
}
The output of the custom object looks like this:
TotalLinesInLog : 228696
TotalScheduleBlocks : 15120
MatchedCount : 9450
NotMatchedCount : 5670
FirstMatched : SCHEDULE MASTERAGENTS#KA96G01
DESCRIPTION "Added by composer."
:
S89COLENG2#/KA96/KA96G01/KA96G065
FOLLOWS KA96G030
S89COLENG2#/KA96/KA96G01/KA96G01E
FOLLOWS KA96G036
FOLLOWS KA96G038
MASTERAGENTS#SBP_KA96G114_KA96G09_KA96G112
FOLLOWS KA96G114
END
FirstNotMatched : SCHEDULE MASTERAGENTS#KA96GAA_5
DESCRIPTION "Added by composer."
:
S89COLENG2#/KA96/KA96G02/KA96G091
FOLLOWS KA96G090
S89COLENG2#/KA96/KA96G02/KA96G096
FOLLOWS KA96G060
END
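If the single large pattern is hard to follow, the same check can be sketched in two explicit steps: derive the base name from the SCHEDULE line, then look for a job line whose last segment is that name plus E. This is only a rough sketch and not a drop-in replacement. The function name Test-ScheduleBlock and the $sample text are made up for illustration, and it assumes that the job name is always the last '/' or '#' segment on its line and that the comparison should be case-sensitive.

```powershell
# Hypothetical helper (not part of the code above): returns $true when a
# SCHEDULE <--> END block contains a job named <base name>E.
function Test-ScheduleBlock {
    param([string]$Block)

    # Split the block into lines, tolerating both LF and CRLF line endings.
    $lines = $Block -split '\r?\n'

    # Base name = text after the last '/' or '#' on the SCHEDULE line,
    # without any '_suffix' (KGFLABUR_4 -> KGFLABUR).
    if ($lines[0] -cmatch '^SCHEDULE\s.*[/#]([^/#_\s]+)(?:_\S*)?\s*$') {
        $target = [regex]::Escape($Matches[1]) + 'E'
    }
    else {
        return $false
    }

    # Assumption: a job line matches when its last segment is <base>E,
    # optionally followed by '_suffix' (KGFLABURE or KGFLABURE_4).
    $jobPattern = '[/#]' + $target + '(?:_\S*)?\s*$'
    $jobLines = $lines | Select-Object -Skip 1
    foreach ($line in $jobLines) {
        if ($line -cmatch $jobPattern) { return $true }
    }
    return $false
}

# Two small sample blocks assembled from the examples in the question.
$sample = @'
SCHEDULE MANAGER_XA#/XAAA/KAAA/KGFLABUR_4
:
S89COLENG2#/KG34/KG34G43CR3/KGFLABURE_4
 FOLLOWS KG34G493
END
SCHEDULE MANAGER_XA#/XAAA/KAAA/KXXXYYYY
:
S89COLENG2#/KG34/KG34G43CR3/KG34G1020
 NOP
END
'@

# Here the block regex is anchored to line starts, which is stricter than the
# pattern used above; that is a choice made for this sketch only.
$blocks = [regex]::Matches($sample, '(?sm)^SCHEDULE\b.*?^END\b').Value
$matched, $notMatched = $blocks.Where({ Test-ScheduleBlock $_ }, 'Split')

$matched.Count      # 1 with the sample above
$notMatched.Count   # 1 with the sample above
```

Because the base name is extracted once and then escaped, this version cannot fall back to a shorter capture (for example a single K followed by E) the way a backtracking backreference can. The two resulting collections could then be written to separate files, for example with Set-Content, if the blocks need to be copied elsewhere.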