Suppose I have a file like this:
    13.03.2013 12:13:01 | STRING1 | NUMBER1 | 1 | NUMBER3
    13.03.2013 12:13:08 | STRING1 | NUMBER1 | 12 | NUMBER3
    13.03.2013 12:13:09 | STRING3 | NUMBER1 | 13 | NUMBER3
    13.03.2013 12:13:12 | STRING2 | NUMBER1 | 21 | NUMBER3
    13.03.2013 12:13:15 | STRING2 | NUMBER1 | 11 | NUMBER3
    13.03.2013 12:13:18 | STRING1 | NUMBER1 | 13 | NUMBER3
    13.03.2013 12:13:20 | STRING2 | NUMBER1 | 21 | NUMBER3
    13.03.2013 12:13:25 | STRING3 | NUMBER1 | 51 | NUMBER3
    13.03.2013 12:13:38 | STRING2 | NUMBER1 | 71 | NUMBER3
    13.03.2013 12:13:40 | STRING1 | NUMBER1 | 21 | NUMBER3
    13.03.2013 12:13:42 | STRING1 | NUMBER1 | 11 | NUMBER3
    13.03.2013 12:13:55 | STRING3 | NUMBER1 | 71 | NUMBER3
    13.03.2013 12:14:02 | STRING1 | NUMBER1 | 11 | NUMBER3
    13.03.2013 12:14:07 | STRING1 | NUMBER1 | 13 | NUMBER3
    13.03.2013 12:14:08 | STRING3 | NUMBER1 | 13 | NUMBER3
    13.03.2013 12:14:15 | STRING2 | NUMBER1 | 21 | NUMBER3
    13.03.2013 12:14:16 | STRING2 | NUMBER1 | 11 | NUMBER3
    13.03.2013 12:14:16 | STRING1 | NUMBER1 | 1 | NUMBER3
    13.03.2013 12:14:20 | STRING2 | NUMBER1 | 21 | NUMBER3
    13.03.2013 12:14:25 | STRING3 | NUMBER1 | 51 | NUMBER3
    13.03.2013 12:14:37 | STRING2 | NUMBER1 | 71 | NUMBER3
    13.03.2013 12:14:42 | STRING1 | NUMBER1 | 1 | NUMBER3
    13.03.2013 12:14:45 | STRING1 | NUMBER1 | 11 | NUMBER3
    13.03.2013 12:14:58 | STRING3 | NUMBER1 | 51 | NUMBER3
    13.03.2013 12:15:06 | STRING2 | NUMBER1 | 11 | NUMBER3
    13.03.2013 12:15:13 | STRING1 | NUMBER1 | 43 | NUMBER3
    13.03.2013 12:15:22 | STRING2 | NUMBER1 | 21 | NUMBER3
    13.03.2013 12:15:26 | STRING3 | NUMBER1 | 51 | NUMBER3
    13.03.2013 12:15:35 | STRING2 | NUMBER1 | 71 | NUMBER3
    13.03.2013 12:15:40 | STRING1 | NUMBER1 | 1 | NUMBER3
    13.03.2013 12:15:42 | STRING1 | NUMBER1 | 21 | NUMBER3
    13.03.2013 12:15:53 | STRING3 | NUMBER1 | 71 | NUMBER3
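I want to find the per-minute average of the 4th column (the one after the third "|"), but only for the lines that match a variable X. For example, if $X="STRING1", the result should be:

    13.03.2013 12:13 | STRING1 | 11.6
    13.03.2013 12:14 | STRING1 | 7.4
    13.03.2013 12:15 | STRING1 | 21.666

So for each minute we pick out the lines that contain $X and average the 4th column over those lines. How can I do this?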
You can use the following awk program:

example.awk:
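    # Process only the lines that contain the search string
    $0 ~ SEARCH {
        split($1, time, ":")          # "13.03.2013 12:13:01 " -> "13.03.2013 12", "13", "01 "
        min = time[1] ":" time[2]     # key on date, hour and minute so different hours never collide
        total[min] += $4
        count[min]++
    }
    END {
        # for-in visits keys in an unspecified order, so the lines may not come out chronologically
        for (m in total) {
            printf "%s|%s|%s\n", m, SEARCH, total[m] / count[m]
        }
    }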
Run it:

    awk -F'|' -v SEARCH=STRING1 -f example.awk your.log

Output:

    13.03.2013 12:13|STRING1|11.6
    13.03.2013 12:14|STRING1|7.4
    13.03.2013 12:15|STRING1|21.6667
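Two details of this approach may matter on a real log: a regex match against the whole line also hits lines where the search string appears in another column, and awk does not define the order in which a for-in loop visits array keys. The following is a minimal sketch of one way around both, not the answer's own method. It assumes the layout of the sample file, compares the second column exactly, and prints the minutes in the order they first appear. The function name avg_per_minute and the two-decimal output format are illustrative choices.

```shell
# avg_per_minute is a hypothetical helper: $1 is the string to match, $2 is the log file.
avg_per_minute() {
    awk -v X="$1" '
        BEGIN { FS = " *[|] *" }                     # split on "|" and drop the spaces around it
        $2 != X { next }                             # exact comparison on the second column only
        {
            min = substr($1, 1, 16)                  # "DD.MM.YYYY HH:MM"
            if (!(min in count)) order[++k] = min    # remember the order of first appearance
            total[min] += $4
            count[min]++
        }
        END {
            for (i = 1; i <= k; i++) {
                m = order[i]
                printf "%s | %s | %.2f\n", m, X, total[m] / count[m]
            }
        }
    ' "$2"
}
```

It would be called as avg_per_minute STRING1 your.log. For a log that is already in chronological order, the order of first appearance is also the chronological order.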
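Alternatively, if the log is in chronological order, as in the sample, each minute can be printed as soon as the next one begins:

    awk -v X="STRING1" '
        BEGIN { FS = " *[|] *"; OFS = " | " }   # split on "|" and drop the surrounding spaces
        $2 != X { next }                        # keep only lines whose second column equals X
        { min = substr($1, 1, 16) }             # e.g. "13.03.2013 12:13"
        min != prev {                           # a new minute begins
            if (n) print prev, X, total / n     # flush the previous minute, if there was one
            total = n = 0
            prev = min
        }
        { n++; total += $4 }
        END { if (n) print prev, X, total / n } # flush the last minute; nothing to print if no line matched
    ' file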