My task is to process a text file with Bash so that only the relevant details are retrieved. Here is a sample of the file's contents:
Jul 21 09:29:10 serverbkp dhcpd: DHCPDISCOVER from aa:bb:cc:dd:ee:ff via 1.2.3.188: peer holds all free leases
Jul 21 09:29:10 serverbkp dhcpd: DHCPDISCOVER from aa:bb:cc:dd:ee:ff via 1.2.3.189: peer holds all free leases
Jul 21 09:29:10 serverbkp dhcpd: DHCPDISCOVER from aa:bb:cc:dd:ee:gg via eth0: network 1.2.64.0/24: no free leases
Jul 21 09:29:10 serverbkp dhcpd: DHCPDISCOVER from aa:bb:cc:dd:ee:gg via eth0: network 1.2.65.0/24: no free leases
I read the file line by line and test whether each line contains the string "peer holds all" or "no free leases". Depending on which string the line contains, I process it further by extracting part of the line and pushing it into an array.
while IFS= read -r line;
do
    if [[ $line == *"peer holds all"* ]]; then
        readarray -t peer_holds_array < <(echo "${line}" | awk '{print $10}' | sed -e 's/:$//g')
    elif [[ $line == *"no free leases"* ]]; then
        readarray -t no_free_leases_array < <(echo "${line}" | awk '{print $12}' | sed -e 's/:$//g')
    fi
done < <(grep -i "peer holds all\|no free leases" daemon.log)
peer_holds_uniq=($(printf "%s\n" "${peer_holds_array[@]}" | sort -u))
no_free_lease_uniq=($(printf "%s\n" "${no_free_lease_array[@]}" | sort -u))
printf "Peer Holds Leases - Via:\n"
printf "${peer_holds_uniq[@]}\n"
printf "No Free Leases:\n"
printf "${no_free_lease_uniq[@]}\n"
Expected result:
Peer Holds Leases - Via:
1.2.3.188
1.2.3.189
No Free Leases:
1.2.64.0/24
1.2.65.0/24
Actual result:
Peer Holds Leases - Via:
1.2.3.188
No Free Leases:
1.2.64.0/24
A working implementation might look like this:
#!/usr/bin/env bash
case $BASH_VERSION in ''|[1-3]*) echo "ERROR: Bash 4.0 or newer is needed" >&2; exit 1;; esac
generate_input() { # so this can be run by people without your real input file
cat <<'EOF'
Jul 21 09:29:10 serverbkp dhcpd: DHCPDISCOVER from aa:bb:cc:dd:ee:ff via 1.2.3.188: peer holds all free leases
Jul 21 09:29:10 serverbkp dhcpd: DHCPDISCOVER from aa:bb:cc:dd:ee:ff via 1.2.3.189: peer holds all free leases
Jul 21 09:29:10 serverbkp dhcpd: DHCPDISCOVER from aa:bb:cc:dd:ee:gg via eth0: network 1.2.64.0/24: no free leases
Jul 21 09:29:10 serverbkp dhcpd: DHCPDISCOVER from aa:bb:cc:dd:ee:gg via eth0: network 1.2.65.0/24: no free leases
EOF
}
set -x # enable debug logging
peer_holds_re=' via ([[:digit:].]+): peer holds all' # define regular expressions
no_free_leases_re='network ([[:digit:]/.]+): no free leases'
declare -A peer_holds_array=( ) no_free_lease_array=( ) # initialize associative arrays
while IFS= read -r line; do
    if [[ $line =~ $peer_holds_re ]]; then # testing [[ $string =~ $re ]]
        peer_holds_array[${BASH_REMATCH[1]}]=1 # ...sets ${BASH_REMATCH[@]} array
    elif [[ $line =~ $no_free_leases_re ]]; then
        no_free_lease_array[${BASH_REMATCH[1]}]=1
    fi
done < <(generate_input | grep -Ei "peer holds all|no free leases")
printf "Peer Holds Leases - Via:\n"
printf '%s\n' "${!peer_holds_array[@]}"
printf "No Free Leases:\n"
printf '%s\n' "${!no_free_lease_array[@]}"
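- Using bash's built-in regular expression support ([[ $string =~ $regex ]]) means we don't have to worry about how many fields a line splits into; it is also hundreds of times faster than starting an echo | awk | sed pipeline for every line of input.
- We switched to storing the data as the keys of associative arrays, because keys are inherently unique. Here, the actual data is the key, and the value associated with each key is just a placeholder constant (in this case, 1).
- readarray overwrites the entire target array, so it can't be used to add items incrementally. array+=( "first item to append" "second item to append" ) appends to a regular array; or, as here, we set keys in an associative array with array["item to set"]=1.
- printf takes a format string, and it repeats that string for each group of arguments needed to fill the placeholders in it. Thus, printf '%s\n' 'First line' 'Second line' substitutes First line for the %s in %s\n, then repeats the format for Second line.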
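To make the readarray point concrete, here is a minimal sketch that contrasts the two approaches. The variable names (values, overwritten, appended) and the sample addresses are made up for illustration; they are not taken from the question's script.

```bash
#!/usr/bin/env bash
# Hypothetical sample values standing in for fields extracted from log lines.
values=( 1.2.3.188 1.2.3.189 )

# readarray (a.k.a. mapfile) replaces the whole target array on every call,
# so after this loop only the value from the last iteration remains.
overwritten=( )
for v in "${values[@]}"; do
    readarray -t overwritten < <(printf '%s\n' "$v")
done

# += appends to the existing array, so every value is kept.
appended=( )
for v in "${values[@]}"; do
    appended+=( "$v" )
done

printf 'readarray kept %d item(s): %s\n' "${#overwritten[@]}" "${overwritten[*]}"
printf '+= kept %d item(s): %s\n' "${#appended[@]}" "${appended[*]}"
```

With these sample values, the readarray loop should end up holding only 1.2.3.189, while the += loop holds both addresses.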
You can see it running at https://ideone.com/GmZYrV
For a version that uses regular arrays, see the edit history of this answer.
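The version in the edit history is not reproduced here. Purely as a sketch of what a regular-array variant could look like, the following appends with += and deduplicates afterwards with sort -u. The sample_log helper and the array names are assumptions made for this example, and the first log line is repeated on purpose to show the deduplication.

```bash
#!/usr/bin/env bash
# Sketch only: NOT the version from the answer's edit history.
sample_log() { # hypothetical stand-in for daemon.log; first line is duplicated
cat <<'EOF'
Jul 21 09:29:10 serverbkp dhcpd: DHCPDISCOVER from aa:bb:cc:dd:ee:ff via 1.2.3.188: peer holds all free leases
Jul 21 09:29:10 serverbkp dhcpd: DHCPDISCOVER from aa:bb:cc:dd:ee:ff via 1.2.3.188: peer holds all free leases
Jul 21 09:29:10 serverbkp dhcpd: DHCPDISCOVER from aa:bb:cc:dd:ee:ff via 1.2.3.189: peer holds all free leases
Jul 21 09:29:10 serverbkp dhcpd: DHCPDISCOVER from aa:bb:cc:dd:ee:gg via eth0: network 1.2.64.0/24: no free leases
Jul 21 09:29:10 serverbkp dhcpd: DHCPDISCOVER from aa:bb:cc:dd:ee:gg via eth0: network 1.2.65.0/24: no free leases
EOF
}

peer_holds_re=' via ([[:digit:].]+): peer holds all'
no_free_leases_re='network ([[:digit:]/.]+): no free leases'
peer_holds=( )
no_free_leases=( )

while IFS= read -r line; do
    if [[ $line =~ $peer_holds_re ]]; then
        peer_holds+=( "${BASH_REMATCH[1]}" )      # append; duplicates allowed for now
    elif [[ $line =~ $no_free_leases_re ]]; then
        no_free_leases+=( "${BASH_REMATCH[1]}" )
    fi
done < <(sample_log)

# Deduplicate once, after the loop. Caveat: if an array were empty, printf
# would still emit one blank line, leaving a single empty element.
readarray -t peer_holds_uniq < <(printf '%s\n' "${peer_holds[@]}" | sort -u)
readarray -t no_free_leases_uniq < <(printf '%s\n' "${no_free_leases[@]}" | sort -u)

printf 'Peer Holds Leases - Via:\n'
printf '%s\n' "${peer_holds_uniq[@]}"
printf 'No Free Leases:\n'
printf '%s\n' "${no_free_leases_uniq[@]}"
```

Here readarray is safe because it runs once per array, on the complete list, rather than once per input line.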
FWIW, here's how I'd do it, using GNU awk for gensub() and sorted_in:
$ cat tst.awk
{ addr = gensub(/.* ([^:]+):.*$/,"\\1",1) }
/peer holds all/ { peers[addr] }
/no free leases/ { frees[addr] }
END {
    PROCINFO["sorted_in"] = "@ind_str_asc"
    print "Peer Holds Leases - Via:"
    for (addr in peers) {
        print addr
    }
    print "No Free Leases:"
    for (addr in frees) {
        print addr
    }
}
$ awk -f tst.awk file
Peer Holds Leases - Via:
1.2.3.188
1.2.3.189
No Free Leases:
1.2.64.0/24
1.2.65.0/24
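The gawk script gets its ordering from PROCINFO["sorted_in"]. Bash makes no promise about the order in which "${!array[@]}" lists the keys of an associative array, so if a pure-Bash solution needs the same deterministic output, one option is to pipe the keys through sort. This is only a sketch; the seen and sorted_keys names and the sample addresses are invented for the example.

```bash
#!/usr/bin/env bash
# Hypothetical sample data, inserted out of order and with a duplicate.
declare -A seen=( )
for addr in 1.2.3.189 1.2.3.188 1.2.3.189; do
    seen[$addr]=1          # keys are unique, so the duplicate collapses
done

# Key order from "${!seen[@]}" is unspecified; sort makes it deterministic.
# LC_ALL=C gives plain byte-wise string comparison.
readarray -t sorted_keys < <(printf '%s\n' "${!seen[@]}" | LC_ALL=C sort)

printf '%s\n' "${sorted_keys[@]}"
```

Like @ind_str_asc, this compares keys as strings rather than as IP addresses, so 1.2.3.9 would sort after 1.2.3.188.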