我有一个文本文件,其中有以下格式的数据。如何仅打印日期距当前日期少于10天的行?
sample.txt
system system_data8 Thu Jul 29 22:36:38 2021
system system_data9 Wed Jan 24 14:43:52 2018
system system_data3 Tue Jan 23 20:25:17 2018
system system_data2 Fri Mar 09 20:37:05 2018
system system_data5 Fri Mar 09 22:02:31 2018
预期输出
system system_data8 Thu Jul 29 22:36:38 2021
我正在尝试下面的东西,但它不工作。
awk -F ' ' '{printf("%s,%s,",$1,$2);"date +%F -d "$3" "$4" "$5" "$6" "$7;}' sample.txt
GNUawk
具有时间函数:
$ gawk -v days=10 'BEGIN {max = days*86400; now = systime()}
NF>3 {
mn = (index("JanFebMarAprMayJunJulAugSepOctNovDec",$(NF-3)) + 2)/3
dt = $NF " " mn " " $(NF-2) " " gensub(/:/," ","g",$(NF-1))
diff = now - mktime(dt)
if (-max < diff && diff < max)
print
}' file
system system_data8 Thu Jul 29 22:36:38 2021
使用GNU awk实现时间函数:
$ cat tst.awk
BEGIN {
tgtDays = 10
tgtSecs = tgtDays * 24 * 60 * 60
endTime = strftime("%Y %m %d 12 00 00")
endSecs = mktime(endTime,1)
}
{
mthNr = (index("JanFebMarAprMayJunJulAugSepOctNovDec",$4)+2)/3
begTime = sprintf("%04d %02d %02d 12 00 00", $7, mthNr, $5)
begSecs = mktime(begTime,1)
}
(endSecs - begSecs) < tgtSecs
$ awk -f tst.awk sample.txt
system system_data8 Thu Jul 29 22:36:38 2021
请注意,在上面我们将输入数据中的时间和当前时间都替换为正午,因为在确定两个日期之间有多少天时,首先将时间戳转换为从epoch开始的秒数,然后除以一天中的秒数,您必须每天使用相同的时间,否则您的"天数";计算可以/将被每天的时间所抛弃。
例如,看看下面的例子,它试图确定两个相隔10天的日期是否小于10天:
$ cat diffDatesDemo.awk
BEGIN {
tgtDays = 10
tgtSecs = tgtDays * 24 * 60 * 60
begTime = "2021/08/01 09:00:00"
endTime = "2021/08/11 08:00:00"
begDate = gensub(/([ :][0-9]{2}){3}$/,"",1,begTime)
endDate = gensub(/([ :][0-9]{2}){3}$/,"",1,endTime)
print "Is", begTime, "less than", tgtDays, "days before", endTime "?"
####
print "nWrong: Compare 2 timestamps including date plus time of day:"
begSecs = mktime(gensub("[/:]"," ","g",begTime),1)
endSecs = mktime(gensub("[/:]"," ","g",endTime),1)
print begDate, "->", endDate, "is", ((endSecs - begSecs) < tgtSecs ? "<" : ">="), tgtDays, "days"
####
####
print "nRight: Compare 2 dates at the same time each day:"
begSecs = mktime(gensub("[/:]"," ","g",begDate)" 12 00 00",1)
endSecs = mktime(gensub("[/:]"," ","g",endDate)" 12 00 00",1)
print begDate, "->", endDate, "is", ((endSecs - begSecs) < tgtSecs ? "<" : ">="), tgtDays, "days"
####
}
$ awk -f diffDatesDemo.awk
Is 2021/08/01 09:00:00 less than 10 days before 2021/08/11 08:00:00?
Wrong: Compare 2 timestamps including date plus time of day:
2021/08/01 -> 2021/08/11 is < 10 days
Right: Compare 2 dates at the same time each day:
2021/08/01 -> 2021/08/11 is >= 10 days
我还在上面的mktime()
中使用了UTC标志,以确保任何本地DST更改都不会影响计算的天数。
如果您的date
实用程序是最近的,如果您的shell是bash
,则只能使用bash
和date
:
now=$(date +%s)
while read sys sys_d dt; do
sec=$(date -d "$dt" +%s)
if (( now-sec <= 10*24*3600 )); then
echo "$sys $sys_d $dt"
fi
done < sample.txt
注意,比较是在将当前日期/时间转换为UNIX时间戳(
date +%s
)之后进行的,即从1970-01-01开始的秒数。文件中的日期/时间也一样。这并没有考虑到日期/时间的不规则性,比如夏令时或闰秒。所以,取决于你什么时候运行这个,取决于你的文件的内容,以及你对10天的定义,结果可能是你想要的,也可能不是你想要的。