Split a ~200 MB log4j log file by day



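I have a log file formatted as shown below, and I want to split it into multiple files by day (i.e. log-2017-10-2, log-2017-10-3, and so on). I have seen people use awk for this, but I'm not sure how to handle stack traces, since the java.io.IOException line starts on a new line of its own. Is there a convenient way to do this?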

2017-10-02 04:26:02,534 INFO XXXXXXXXXXXXXXXXX
2017-10-03 04:26:02,543 INFO XXXXXXXXXXXX
2017-10-04 04:26:02,544 INFO XXXXXXXXX
2017-10-04 04:26:02,546 INFO XXXXXXXXXXXXX
2017-10-04 04:26:02,549 INFO XXXXXXXXXXX
2017-10-04 04:53:02,787 WARN class.class.class: [FetcherXXXXXX], Error in fetch XXXXXXXXXXXXXXXXXXXXXX
java.io.IOException: Connection to X was disconnected before the response was read
at XXXXXXXXXXXXXXXXXXXX
at XXXXXXXXXXXXXXXXXXXX
at XXXXXXXXXXXXXXXXXXXXX
at XXXXXXXXXXXXXXXX
at XXXXXXXXXXXXXXXX
2017-10-05 04:26:02,549 INFO XXXXXXXXXXX

The resulting files would contain:

log-2017-10-2:
2017-10-02 04:26:02,534 INFO XXXXXXXXXXXXXXXXX

log-2017-10-3:
2017-10-03 04:26:02,543 INFO XXXXXXXXXXXX
log-2017-10-4:
2017-10-04 04:26:02,544 INFO XXXXXXXXX
2017-10-04 04:26:02,546 INFO XXXXXXXXXXXXX
2017-10-04 04:26:02,549 INFO XXXXXXXXXXX
2017-10-04 04:53:02,787 WARN class.class.class: [FetcherXXXXXX], Error in fetch XXXXXXXXXXXXXXXXXXXXXX
java.io.IOException: Connection to X was disconnected before the response was read
at XXXXXXXXXXXXXXXXXXXX
at XXXXXXXXXXXXXXXXXXXX
at XXXXXXXXXXXXXXXXXXXXX
at XXXXXXXXXXXXXXXX
at XXXXXXXXXXXXXXXX
log-2017-10-5:
2017-10-05 04:26:02,549 INFO XXXXXXXXXXX

awk to the rescue!

$ awk --posix 'BEGIN{f="log-header"} 
$1~/^[0-9]{4}-[0-9]{2}-[0-9]{2}$/{f="log-"$1} {print > f}' log

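If there are too many distinct dates (and therefore too many open files), you may need to close files at some point. For a few hundred dates it should work as is.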

The BEGIN block sets an initial output file (log-header) in case the log does not start with a line matching the checked regex.
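If the number of dates ever did become a problem, one possible variant closes the previous file whenever the date changes and appends with >> so that a date that reappears later does not truncate its file. This is only a sketch, not the answer's original command: sample.log and the temporary directory are made-up names for illustration, and the interval expressions are spelled out as [0-9][0-9]... so that the script does not depend on --posix.

```shell
# Work in a throwaway directory; "sample.log" is a hypothetical input file.
workdir=$(mktemp -d) && cd "$workdir" || exit 1
cat > sample.log <<'EOF'
preamble line without a timestamp
2017-10-02 04:26:02,534 INFO first
2017-10-04 04:53:02,787 WARN Error in fetch
java.io.IOException: Connection was disconnected
at some.frame
2017-10-05 04:26:02,549 INFO last
EOF

awk '
BEGIN { f = "log-header" }          # target for lines seen before any date
$1 ~ /^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]$/ {
    target = "log-" $1
    if (target != f) {              # date changed: release the old file
        close(f)
        f = target
    }
}
{ print >> f }                      # append, so reopening never truncates
' sample.log

ls
```

Stack trace lines carry no date in $1, so they simply follow the most recent dated line into the same file. Because >> appends, any log-* files left over from an earlier run should be removed before running it again.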

awk solution:

awk '/^[0-9]{4}-[0-9]{2}-[0-9]{2} /{ 
if (fn && !a[$1]++) close(fn);
fn="log-"$1 
}{ print > fn }' logfile
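  • /^[0-9]{4}-[0-9]{2}-[0-9]{2} / - matches a line that starts with a date string
  • if (fn && !a[$1]++) close(fn) - closes the file that was open for the previous date the first time a new date is seen
  • fn="log-"$1 - builds the output file name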
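Note that "log-"$1 yields zero-padded names such as log-2017-10-02, while the question asked for log-2017-10-2. If the unpadded form matters, the file name could be built from the date's components instead. The following is a rough sketch under that assumption; sample.log, the temporary directory, and the variables d and name are made up for illustration.

```shell
# Work in a throwaway directory; "sample.log" is a hypothetical input file.
workdir=$(mktemp -d) && cd "$workdir" || exit 1
printf '%s\n' \
  '2017-10-02 04:26:02,534 INFO first' \
  '2017-10-04 04:53:02,787 WARN Error in fetch' \
  'java.io.IOException: Connection was disconnected' \
  'at some.frame' > sample.log

awk '
/^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9] / {
    split($1, d, "-")               # d[1]=year, d[2]=month, d[3]=day
    # Adding 0 converts "02" to the number 2, which drops the padding.
    name = "log-" d[1] "-" (d[2] + 0) "-" (d[3] + 0)
    if (name != fn) {
        if (fn) close(fn)           # release the previous date file
        fn = name
    }
}
fn { print >> fn }                  # lines before the first date are skipped
' sample.log

ls
```

Unlike the original, this variant silently drops any lines that appear before the first dated line, and it appends rather than overwrites, so old output files should be cleared first.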

Viewing the results:

$ head log-*
==> log-2017-10-02 <==
2017-10-02 04:26:02,534 INFO XXXXXXXXXXXXXXXXX
==> log-2017-10-03 <==
2017-10-03 04:26:02,543 INFO XXXXXXXXXXXX
==> log-2017-10-04 <==
2017-10-04 04:26:02,544 INFO XXXXXXXXX
2017-10-04 04:26:02,546 INFO XXXXXXXXXXXXX
2017-10-04 04:26:02,549 INFO XXXXXXXXXXX
2017-10-04 04:53:02,787 WARN class.class.class: [FetcherXXXXXX], Error in fetch XXXXXXXXXXXXXXXXXXXXXX
java.io.IOException: Connection to X was disconnected before the response was read
at XXXXXXXXXXXXXXXXXXXX
at XXXXXXXXXXXXXXXXXXXX
at XXXXXXXXXXXXXXXXXXXXX
at XXXXXXXXXXXXXXXX
at XXXXXXXXXXXXXXXX
==> log-2017-10-05 <==
2017-10-05 04:26:02,549 INFO XXXXXXXXXXX
