Regex to match full lines containing "error" or "warn" (case-insensitive) in Go



I want to print to the user the full line of every line in a log file that contains WARN or ERROR (case-insensitive).

Given:

[01-17|18:53:38.179] INFO server/server.go:381 this would be skipped
[01-17|18:53:38.280] INFO server/server.go:620 this also
[01-17|18:53:41.180] WARN server/server.go:388 Something is warned, so show this
[01-17|18:53:41.394] WARN server/server.go:188 Something reported an ->error<-
[01-17|18:53:41.395] ERROR server/server.go:191 Blabla
[01-17|18:53:41.395] DEBUG server/server.go:196 Obviously skipped
[01-17|18:53:41.395] DEBUG server/server.go:196 This debug contains an ->error<- so match this
[01-17|18:53:41.395] WARN server/server.go:198 You get the idea

I want:

[01-17|18:53:41.180] WARN server/server.go:388 Something is warned, so show this
[01-17|18:53:41.394] WARN server/server.go:188 Something reported an ->error<-
[01-17|18:53:41.395] ERROR server/server.go:191 Blabla
[01-17|18:53:41.395] DEBUG server/server.go:196 This debug contains an ->error<- so match this
[01-17|18:53:41.395] WARN server/server.go:198 You get the idea

I naively started with:

errorRegEx := regexp.MustCompile(`(?is)error|warn`)

which only prints (output from a different run, so it may not exactly match the example above):

WARN
error

Then I thought I should change this to match a bit more:

errorRegEx := regexp.MustCompile(`(?is).*error.*|.*warn.*`)

But this printed nothing at all.

How do I get the full line, for every line in which WARN or ERROR (case-insensitive) matches?

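PS: This is not the same as the suggested duplicate, "Regex match line containing string", because this is specifically about Go, which does not seem to use quite the same standard engine.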

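Given that this question has been marked as a dupe, and given the OP's comment below:

This question was marked as a duplicate, and the linked post has many answers we can use to try to piece together an answer to the OP's question, but still not a complete one, because those answers seem to be about PCRE, whereas Go uses RE2.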

var logs = `
[01-17|18:53:38.179] INFO server/server.go:381 this would be skipped
[01-17|18:53:38.280] INFO server/server.go:620 this also
[01-17|18:53:41.180] Warn server/server.go:388 Something is warned, so show this
[01-17|18:53:41.394] warn server/server.go:188 Something reported an ->error<-
[01-17|18:53:41.395] Error server/server.go:191 Blabla
[01-17|18:53:41.395] DEBUG server/server.go:196 Obviously skipped
[01-17|18:53:41.395] DEBUG server/server.go:196 This debug contains an ->error<- so match this
[01-17|18:53:41.395] WARN server/server.go:198 You get the idea
`
func init() {
logs = strings.TrimSpace(logs)
}

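First, I don't understand why the OP got nothing printed:

Then I thought I should change this to match a bit more:

errorRegEx := regexp.MustCompile(`(?is).*error.*|.*warn.*`)

But this printed nothing at all

because it should have printed everything: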

fmt.Println("Original regexp:")
reOriginal := regexp.MustCompile(`(?is).*error.*|.*warn.*`)
lines := reOriginal.FindAllString(logs, -1)
fmt.Println("match\t\tentry")
fmt.Println("=====\t\t=====")
for i, line := range lines {
	fmt.Printf("%d\t\t%q\n", i+1, line)
}

Original regexp:
match           entry
=====           =====
1               "[01-17|18:53:38.179] INFO server/server.go:381 this would be skipped\n[01-17|18:53:38.280] INFO server/server.go:620 this also\n[01-17|18:53:41.180] Warn server/server.go:388 Something is warned, so show this\n[01-17|18:53:41.394] warn server/server.go:188 Something reported an ->error<-\n[01-17|18:53:41.395] Error server/server.go:191 Blabla\n[01-17|18:53:41.395] DEBUG server/server.go:196 Obviously skipped\n[01-17|18:53:41.395] DEBUG server/server.go:196 This debug contains an ->error<- so match this\n[01-17|18:53:41.395] WARN server/server.go:198 You get the idea"

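The s flag in (?is)... means the dot (.) also matches newlines^1, and because your stars (*) are greedy^2, they match everything in the entire string as soon as either "error" or "warn" is found anywhere in it.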
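To see the effect of the s flag in isolation, here is a minimal sketch. The helper name firstMatch and the two-line input are made up for illustration and are not part of the original answer.

```go
package main

import (
	"fmt"
	"regexp"
)

// firstMatch (a hypothetical helper) returns the leftmost match of pattern in text.
func firstMatch(pattern, text string) string {
	return regexp.MustCompile(pattern).FindString(text)
}

func main() {
	text := "one\ntwo" // illustrative input, not from the logs

	// Without s, the dot does not match "\n", so the greedy .* stops at the line end.
	fmt.Printf("%q\n", firstMatch(`.*`, text)) // "one"

	// With s, the dot matches "\n" too, so the greedy .* runs to the end of the text.
	fmt.Printf("%q\n", firstMatch(`(?s).*`, text)) // "one\ntwo"
}
```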

The real fix is to not let the dot match "\n": drop the s flag and you get what you want:

fmt.Println("Whole text:")
reWholeText := regexp.MustCompile(`(?i).*error.*|.*warn.*`)
lines = reWholeText.FindAllString(logs, -1)
fmt.Println("match\t\tentry")
fmt.Println("=====\t\t=====")
for i, line := range lines {
	fmt.Printf("%d\t\t%q\n", i+1, line)
}
Whole text:
match           entry
=====           =====
1               "[01-17|18:53:41.180] Warn server/server.go:388 Something is warned, so show this"
2               "[01-17|18:53:41.394] warn server/server.go:188 Something reported an ->error<-"
3               "[01-17|18:53:41.395] Error server/server.go:191 Blabla"
4               "[01-17|18:53:41.395] DEBUG server/server.go:196 This debug contains an ->error<- so match this"
5               "[01-17|18:53:41.395] WARN server/server.go:198 You get the idea"

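Now we are matching between instances of "\n" (effectively, lines), and because we are using the All form, which only finds non-overlapping matches:

If 'All' is present, the routine matches successive non-overlapping matches of the entire expression.^3

we get complete and distinct lines.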

You can tighten the regexp up a bit:

`(?i).*(?:error|warn).*` // "anything before either "error" or "warn" and anything after (for a line)"

(?:...) is a non-capturing group^1, since you don't seem to care about the individual instances of "error" or "warn".
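As a rough sketch of the tightened pattern in use: the names reTight and matchingLines and the three-line sample are assumptions for illustration. NumSubexp reports the number of capturing groups, which stays at zero because (?:...) does not capture.

```go
package main

import (
	"fmt"
	"regexp"
)

// reTight is the tightened pattern; the variable name is hypothetical.
var reTight = regexp.MustCompile(`(?i).*(?:error|warn).*`)

// matchingLines (a hypothetical helper) returns every full line of text
// that contains "error" or "warn", ignoring case.
func matchingLines(text string) []string {
	return reTight.FindAllString(text, -1)
}

func main() {
	sample := "INFO fine\nDEBUG an Error here\nWARN careful" // illustrative input

	for _, line := range matchingLines(sample) {
		fmt.Printf("%q\n", line)
	}
	// The group is non-capturing, so the pattern has no submatches.
	fmt.Println(reTight.NumSubexp())
}
```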

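Still, I want to show that splitting by line before trying to match gives you more control and precision, and makes the regexp very easy to reason about: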

r := strings.NewReader(logs)
scanner := bufio.NewScanner(r)
fmt.Println("Line-by-line:")
reLine := regexp.MustCompile(`(?i)error|warn`)
fmt.Println("match\tline\tentry")
fmt.Println("=====\t====\t=====")
var matchNo, lineNo, match = 1, 1, ""
for scanner.Scan() {
	line := scanner.Text()
	match = reLine.FindString(line)
	if match != "" {
		fmt.Printf("%d\t%d\t%q\n", matchNo, lineNo, line)
		matchNo++
	}
	lineNo++
}

Line-by-line:
match   line    entry
=====   ====    =====
1       3       "[01-17|18:53:41.180] Warn server/server.go:388 Something is warned, so show this"
2       4       "[01-17|18:53:41.394] warn server/server.go:188 Something reported an ->error<-"
3       5       "[01-17|18:53:41.395] Error server/server.go:191 Blabla"
4       7       "[01-17|18:53:41.395] DEBUG server/server.go:196 This debug contains an ->error<- so match this"
5       8       "[01-17|18:53:41.395] WARN server/server.go:198 You get the idea"

All three examples are in this Playground.

Look for the ERROR or WARN token right after the first space:

errorRegEx := regexp.MustCompile(`^[^ ]* (?:ERROR|WARN) .*`)
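A hedged note on using this pattern: as written, ^ only anchors at the start of the whole input, so it fits best when applied to one line at a time. The sketch below assumes you instead want to run it over the full log text, and therefore adds the m flag so that ^ matches at the start of every line. It stays case-sensitive and only looks at the level token, so a DEBUG line that merely mentions "error" is not returned. The names reLevelToken and levelLines are hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// reLevelToken is the pattern above plus the m flag (an assumption made here
// so that it can be applied to multi-line text).
var reLevelToken = regexp.MustCompile(`(?m)^[^ ]* (?:ERROR|WARN) .*`)

// levelLines (a hypothetical helper) returns the lines whose second
// space-separated field is exactly ERROR or WARN.
func levelLines(text string) []string {
	return reLevelToken.FindAllString(text, -1)
}

func main() {
	// A few lines taken from the question's sample log.
	logs := strings.Join([]string{
		"[01-17|18:53:38.179] INFO server/server.go:381 this would be skipped",
		"[01-17|18:53:41.180] WARN server/server.go:388 Something is warned, so show this",
		"[01-17|18:53:41.395] ERROR server/server.go:191 Blabla",
		"[01-17|18:53:41.395] DEBUG server/server.go:196 This debug contains an ->error<- so match this",
	}, "\n")

	for _, line := range levelLines(logs) {
		fmt.Println(line)
	}
}
```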
