Display IP addresses and per-IP request counts per hour from an Apache log



How can I use awk to parse an Apache access log file so that it displays the information in the following format?

Date     Time  Count   IP Address
2016-05-26  00:00  200    192.168.1.x
2016-05-26  00:00  152    172.17.100.x
2016-05-26  00:01   43    192.168.1.x

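Let me be clear. I do not want the total number of requests per hour, and I do not want the total number of requests per minute. I know how to write basic awk scripts for both of those tasks.

I want to see how many requests each unique IP address sends per minute. I don't know awk well enough to do that.

Apache log format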

LogFormat "%h %l %u %{%F %T %z}t \"%r\" %>s %O \"%{Referer}i\" \"%{User-Agent}i\""

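Sample

I tailed the end of the log file. Here is a small sample of what it contains. (We have over 100,000 entries for today, so sharing them all here is not feasible. If you need more lines, please ask.)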

54.213.236.39 - - 2016-05-26 14:38:51 -0400 "GET /p1077921.html HTTP/1.0" 403 400 "-" "Apache-HttpClient/4.5.2 (Java/1.8.0_77)"
54.213.236.39 - - 2016-05-26 14:38:51 -0400 "GET /p1060432.html HTTP/1.0" 403 398 "-" "Apache-HttpClient/4.5.2 (Java/1.8.0_77)"
54.213.254.166 - - 2016-05-26 14:38:51 -0400 "GET /p819757.html HTTP/1.0" 403 400 "-" "Apache-HttpClient/4.5.2 (Java/1.8.0_77)"
54.213.236.39 - - 2016-05-26 14:38:51 -0400 "GET /p1084269.html HTTP/1.0" 403 400 "-" "Apache-HttpClient/4.5.2 (Java/1.8.0_77)"
107.23.252.229 - - 2016-05-26 14:38:51 -0400 "GET /p305987.html HTTP/1.0" 403 399 "-" "Apache-HttpClient/4.5.2 (Java/1.8.0_77)"

Example 1:

grep '2016-05-26' access.log | awk '{print $1}' | sort | uniq -c | sort -n | tail -40 | awk '{print $2,$2,$1}' | logresolve | awk '{printf "%6d %s (%s)\n",$3,$1,$2}'

This produces the following output:

307 135-23-174-138.cpe.pppoe.ca (135.23.174.138)
313 5265DCE5.cm-8.dynamic.ziggo.nl (82.101.220.229)
378 92-108-204-76.dynamic.upc.nl (92.108.204.76)
405 0191301456.0.fullrate.ninja (90.185.180.167)
632 ec2-52-58-151-132.eu-central-1.compute.amazonaws.com (52.58.151.132)
798 187.228.212.148 (187.228.212.148)
877 207.246.75.253 (207.246.75.253)
966 ec2-54-213-177-120.us-west-2.compute.amazonaws.com (54.213.177.120)
1116 ec2-54-186-148-0.us-west-2.compute.amazonaws.com (54.186.148.0)
1224 ppp121-44-247-209.bras2.syd2.internode.on.net (121.44.247.209)
1369 ec2-54-187-239-46.us-west-2.compute.amazonaws.com (54.187.239.46)
1584 45.55.189.64 (45.55.189.64)
2658 50-77-47-70-static.hfc.comcastbusiness.net (50.77.47.70)

Example 2:

grep "2016-05-26" access.log | awk '{ print $4, $5, $1}' | cut -f2 | awk -F: '{ print $1":"$2 }' | sort -nk1 -nk2 | uniq -c | awk '{ if ($1 > 10) print $0 }'

This produces the following output:

560 2016-05-26 00:00
534 2016-05-26 00:01
538 2016-05-26 00:02
554 2016-05-26 00:03
566 2016-05-26 00:04
534 2016-05-26 00:05
559 2016-05-26 00:06
531 2016-05-26 00:07
540 2016-05-26 00:08
435 2016-05-26 00:09
312 2016-05-26 00:10

Any help is greatly appreciated.

Here is one way to do it.

First, transform this:

54.213.236.39 - - 2016-05-26 14:38:51 -0400 "GET /p1077921.html HTTP/1.0" 403 400 "-" "Apache-HttpClient/4.5.2 (Java/1.8.0_77)"

into this:

54.213.236.39 2016-05-26 14  # <- 14th hour
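
The reduction can be checked on a single line before running it over the whole log. The following is a minimal sketch using one sample line from the question; the variable names `line` and `reduced` are only for illustration.

```shell
# One sample line from the question, stored in a variable for the demo.
line='54.213.236.39 - - 2016-05-26 14:38:51 -0400 "GET /p1077921.html HTTP/1.0" 403 400 "-" "Apache-HttpClient/4.5.2 (Java/1.8.0_77)"'

# tr turns "14:38:51" into "14 38 51", so the hour becomes field 5.
reduced=$(printf '%s\n' "$line" | tr ':' ' ' | awk '{print $1, $4, $5}')
printf '%s\n' "$reduced"   # prints: 54.213.236.39 2016-05-26 14
```

After the `tr` step the minute sits in field 6, so printing `$6` as well would keep per-minute resolution.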

Then count the identical lines with uniq -c:

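grep '2016-05-26' access.log |   # keep only that day's requests
tr ':' ' ' |                     # split HH:MM:SS into separate fields
awk '{print $1,$4,$5}' |         # IP, date, hour
sort |                           # group identical lines together
uniq -c |                        # prefix each distinct line with its count
sort -n                          # order by count, ascending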
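
The pipeline above groups by hour. Since the question asks for counts per IP address per minute in Date, Time, Count, IP order, here is a possible single-pass awk variant. It is a sketch rather than the only way to do it: the function name `per_minute_counts` is hypothetical, and it assumes the log layout shown in the question (IP in field 1, date in field 4, time in field 5).

```shell
# Hypothetical helper: reads log lines on stdin, prints "date HH:MM count ip".
per_minute_counts() {
    awk '
        {
            minute = substr($5, 1, 5)        # "14:38:51" -> "14:38"
            count[$4 " " minute " " $1]++    # key: date, minute, IP
        }
        END {
            for (key in count) {
                split(key, part, " ")
                printf "%s  %s  %5d  %s\n", part[1], part[2], count[key], part[3]
            }
        }
    ' | sort -k1,1 -k2,2 -k3,3nr   # by date, then minute, then busiest IP first
}

# Demo on the five sample lines from the question (written to a temp file).
sample=$(mktemp)
cat > "$sample" <<'EOF'
54.213.236.39 - - 2016-05-26 14:38:51 -0400 "GET /p1077921.html HTTP/1.0" 403 400 "-" "Apache-HttpClient/4.5.2 (Java/1.8.0_77)"
54.213.236.39 - - 2016-05-26 14:38:51 -0400 "GET /p1060432.html HTTP/1.0" 403 398 "-" "Apache-HttpClient/4.5.2 (Java/1.8.0_77)"
54.213.254.166 - - 2016-05-26 14:38:51 -0400 "GET /p819757.html HTTP/1.0" 403 400 "-" "Apache-HttpClient/4.5.2 (Java/1.8.0_77)"
54.213.236.39 - - 2016-05-26 14:38:51 -0400 "GET /p1084269.html HTTP/1.0" 403 400 "-" "Apache-HttpClient/4.5.2 (Java/1.8.0_77)"
107.23.252.229 - - 2016-05-26 14:38:51 -0400 "GET /p305987.html HTTP/1.0" 403 399 "-" "Apache-HttpClient/4.5.2 (Java/1.8.0_77)"
EOF
per_minute_counts < "$sample"
rm -f "$sample"
```

On those five lines, 54.213.236.39 is counted three times for 14:38 and the other two addresses once each. Against the real log the call would be `grep '2016-05-26' access.log | per_minute_counts`. No header row is printed, so that the output stays easy to sort or filter further.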
