我正在尝试在我的k8s集群中实现EFK堆栈(使用Fluent Bit(。我想解析的日志文件有时是单行的,有时是多行的:
2022-03-13 13:27:04 [-][-][-][error][craftdbConnection::open] SQLSTATE[HY000] [2002] php_network_getaddresses: getaddrinfo failed: Name or service not known
2022-03-13 13:27:04 [-][-][-][info][application] $_GET = []
$_POST = []
$_FILES = []
$_COOKIE = [
'__test1' => 'x'
'__test2' => 'x2'
]
$_SERVER = [
'__test3' => 'x3'
'__test2' => 'x3'
]
当我在Kibana中检查捕获的日志时,我发现所有多行日志都被分隔成单行,这当然不是我们想要的。我试图在fluent bit config中配置一个解析器,它将把多行日志解释为一个条目,但遗憾的是没有成功。
我试过这个:
[PARSER]
Name MULTILINE_MATCH
Format regex
Regex ^d{4}-d{1,2}-d{1,2} d{1,2}:d{1,2}:d{1,2} [-][-][-][(?<level>.*)][(?<where>.*)] (?<message>[sS]*)
Time_Key time
Time_Format %b %d %H:%M:%S
在k8s中,所有fluent位配置都存储在配置映射中。以下是我对fluent bit的全部配置(多行解析器在最后(:
kind: ConfigMap
metadata:
name: fluent-bit
namespace: efk
labels:
app: fluent-bit
data:
# Configuration files: server, input, filters and output
# ======================================================
fluent-bit.conf: |
[SERVICE]
Flush 1
Log_Level info
Daemon off
Parsers_File parsers.conf
HTTP_Server On
HTTP_Listen 0.0.0.0
HTTP_Port 2020
@INCLUDE input-kubernetes.conf
@INCLUDE filter-kubernetes.conf
@INCLUDE output-elasticsearch.conf
input-kubernetes.conf: |
[INPUT]
Name tail
Tag kube.*
Path /var/log/containers/*.log
Parser docker
DB /var/log/flb_kube.db
Mem_Buf_Limit 5MB
Skip_Long_Lines On
Refresh_Interval 10
filter-kubernetes.conf: |
[FILTER]
Name kubernetes
Match kube.*
Kube_URL https://kubernetes.default.svc:443
Kube_CA_File /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
Kube_Token_File /var/run/secrets/kubernetes.io/serviceaccount/token
Kube_Tag_Prefix kube.var.log.containers.
Merge_Log On
Merge_Log_Key log_processed
K8S-Logging.Parser On
K8S-Logging.Exclude Off
output-elasticsearch.conf: |
[OUTPUT]
Name es
Match *
Host ${FLUENT_ELASTICSEARCH_HOST}
Port ${FLUENT_ELASTICSEARCH_PORT}
Logstash_Format On
Replace_Dots On
Retry_Limit False
parsers.conf: |
[PARSER]
Name apache
Format regex
Regex ^(?<host>[^ ]*) [^ ]* (?<user>[^ ]*) [(?<time>[^]]*)] "(?<method>S+)(?: +(?<path>[^"]*?)(?: +S*)?)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^"]*)" "(?<agent>[^"]*)")?$
Time_Key time
Time_Format %d/%b/%Y:%H:%M:%S %z
[PARSER]
Name apache2
Format regex
Regex ^(?<host>[^ ]*) [^ ]* (?<user>[^ ]*) [(?<time>[^]]*)] "(?<method>S+)(?: +(?<path>[^ ]*) +S*)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^"]*)" "(?<agent>[^"]*)")?$
Time_Key time
Time_Format %d/%b/%Y:%H:%M:%S %z
[PARSER]
Name apache_error
Format regex
Regex ^[[^ ]* (?<time>[^]]*)] [(?<level>[^]]*)](?: [pid (?<pid>[^]]*)])?( [client (?<client>[^]]*)])? (?<message>.*)$
[PARSER]
Name nginx
Format regex
Regex ^(?<remote>[^ ]*) (?<host>[^ ]*) (?<user>[^ ]*) [(?<time>[^]]*)] "(?<method>S+)(?: +(?<path>[^"]*?)(?: +S*)?)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^"]*)" "(?<agent>[^"]*)")?$
Time_Key time
Time_Format %d/%b/%Y:%H:%M:%S %z
[PARSER]
Name json
Format json
Time_Key time
Time_Format %d/%b/%Y:%H:%M:%S %z
[PARSER]
Name docker
Format json
Time_Key time
Time_Format %Y-%m-%dT%H:%M:%S.%L
Time_Keep On
[PARSER]
Name syslog
Format regex
Regex ^<(?<pri>[0-9]+)>(?<time>[^ ]* {1,2}[^ ]* [^ ]*) (?<host>[^ ]*) (?<ident>[a-zA-Z0-9_/.-]*)(?:[(?<pid>[0-9]+)])?(?:[^:]*:)? *(?<message>.*)$
Time_Key time
Time_Format %b %d %H:%M:%S
[PARSER]
Name MULTILINE_MATCH
Format regex
Regex ^d{4}-d{1,2}-d{1,2} d{1,2}:d{1,2}:d{1,2} [-][-][-][(?<level>.*)][(?<where>.*)] (?<message>[sS]*)
Time_Key time
Time_Format %b %d %H:%M:%S
从Fluent Bit v1.8开始,您可以使用multiline.parser
选项,如下所示。docker和cri多行解析器是用fluent bit预定义的。
[INPUT]
Name tail
Path /var/log/containers/*.log
multiline.parser docker, cri
Tag kube.*
Mem_Buf_Limit 5MB
Skip_Long_Lines On
https://docs.fluentbit.io/manual/pipeline/inputs/tail#multiline-和容器-v1.8