After processing data with input | filter | output > ElasticSearch, the stored documents look something like this:
{
    "_index": "logstash-2012.07.02",
    "_type": "stdin",
    "_id": "JdRaI5R6RT2do_WhCYM-qg",
    "_score": 0.30685282,
    "_source": {
        "@source": "stdin://dist/",
        "@type": "stdin",
        "@tags": [
            "tag1",
            "tag2"
        ],
        "@fields": {},
        "@timestamp": "2012-07-02T06:17:48.533000Z",
        "@source_host": "dist",
        "@source_path": "/",
        "@message": "test"
    }
}
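I filter and store most of the important information in specific fields. Is it possible to leave out the default fields, such as @source_path and @source_host? In the near future this setup will store 8 billion logs per month, and I would like to run some performance tests with these default fields excluded (I simply don't use them).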
This removes fields from the output:
filter {
    mutate {
        # remove duplicate fields
        # this leaves timestamp from message and source_path for source
        remove => ["@timestamp", "@source"]
    }
}
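On newer Logstash releases the same idea is usually written with the common remove_field option rather than remove. A minimal sketch, assuming a Logstash version that supports remove_field and events that still carry the @source_host and @source_path fields shown in the question (later event schemas name these fields differently, so adjust accordingly):

```conf
filter {
  mutate {
    # Assumption: field names are taken from the question's event layout.
    # Replace them with whatever your events actually contain.
    remove_field => [ "@source_host", "@source_path" ]
  }
}
```

Fields that are not present on an event are simply skipped, so the same filter can be applied to mixed inputs.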
Some of that will depend on which web interface you use to view your logs. I'm using Kibana and a custom logger (C#) that indexes the following:
{
    "_index": "logstash-2013.03.13",
    "_type": "logs",
    "_id": "n3GzIC68R1mcdj6Wte6jWw",
    "_version": 1,
    "_score": 1,
    "_source": {
        "@source": "File",
        "@message": "Shalom",
        "@fields": {
            "tempor": "hit"
        },
        "@tags": [
            "tag1"
        ],
        "level": "Info",
        "@timestamp": "2013-03-13T21:47:51.9838974Z"
    }
}
This shows up in Kibana, and the source fields are not there.
To exclude certain fields, you can use the prune filter plugin.
filter {
    prune {
        blacklist_names => [ "@timestamp", "@source" ]
    }
}
The prune filter is not a default Logstash plugin and must be installed first:
bin/logstash-plugin install logstash-filter-prune
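If it is easier to list the fields you want to keep than the ones you want to drop, prune also accepts a whitelist. A rough sketch, assuming the prune filter is installed and that only the three fields named here are needed (the field selection is illustrative and not taken from the posts above). Names are matched as regular expressions, hence the anchors:

```conf
filter {
  prune {
    # Keep only top-level fields whose names match one of these patterns;
    # every other field is dropped from the event.
    whitelist_names => [ "^@timestamp$", "^@message$", "^@fields$" ]
  }
}
```

The same regular-expression matching applies to blacklist_names, so an unanchored entry such as "@source" would also match @source_host and @source_path.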