I have a file with 5 columns, in the following format:
$ cat test.txt
id;section;name;val1;val2
11;10;John;50;15
12;20;Sam;40;20
13;30;Jeny;30;30
14;10;Ted;60;10
15;10;Mary;30;5
16;20;Tim;15;15
17;30;Pen;20;100
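I want to process the data in the file according to the section number passed in (column 2). For that section, I want to display the id, Name and Total (column 4 + column 5). At the end, I want to print the details of the row with the highest total.
I have come up with the following awk command: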
section=10 ; awk -F";" -v var="$section" 'BEGIN { print "id Name Total" } { if ($2 == var) { sum = $4 + $5 ;print $1 " "$3 " " sum ;if (sum>newsum) {newsum=sum;name=$3;id=$1}}} END { print "Max sum for section "var" is "newsum " for Name: " name " and ID: " id }' test.txt;
It displays the data as follows:
id Name Total
11 John 65
14 Ted 70
15 Mary 35
Max sum for section 10 is 70 for Name: Ted and ID: 14
But how should I handle the case where several records share the same highest Total?
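I guess it depends entirely on how you want to handle it. Comparing with > lets the first record that reaches the maximum win, comparing with >= lets the last one win, and using arrays lets you keep all of them.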
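As a minimal sketch of the difference between > and >=, the two functions below run the same awk logic on a small made-up sample in which Ted and Ann both total 70. The row for Ann and the function names tie_data, first_winner and last_winner are assumptions for illustration; they are not part of the original test.txt or command.

```shell
# Hypothetical sample with a tie: Ted and Ann both total 70 in section 10.
tie_data() {
  printf '%s\n' \
    'id;section;name;val1;val2' \
    '11;10;John;50;15' \
    '14;10;Ted;60;10' \
    '18;10;Ann;35;35'
}

# ">" only replaces the winner on a strictly larger sum,
# so the first record that reaches the maximum is kept.
first_winner() {
  tie_data | awk -F';' -v var=10 '
    $2 == var { sum = $4 + $5; if (sum > max) { max = sum; name = $3 } }
    END { print name }'
}

# ">=" also replaces the winner on an equal sum,
# so the last record that reaches the maximum is kept.
last_winner() {
  tie_data | awk -F';' -v var=10 '
    $2 == var { sum = $4 + $5; if (sum >= max) { max = sum; name = $3 } }
    END { print name }'
}
```

With this sample, first_winner prints Ted and last_winner prints Ann. Both rely on max starting out as 0, which is fine as long as the totals are positive.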
Suppose you want to display everyone who shares the same highest total:
% cat script.awk
BEGIN {
    FS = ";";
    print "id Name Total";
}
$2 != var { next }   # Skip lines that belong to other sections
{
    sum = $4 + $5;
    print $1 " " $3 " " sum;
}
sum > max {          # A new maximum: reset the winner lists (names and ids)
    max = sum;
    delete names;
    delete ids;
    l = 0;
}
sum >= max {         # Sum equals the current maximum: add this record to the winners
    l++;
    names[l] = $3;
    ids[l] = $1;
}
END {
    printf "Max sum for section %s is %d for\n", var, max;
    # Iterate through all "winners" and print them
    for (i = 1; i <= l; i++) {
        printf "Name: %s, ID: %s\n", names[i], ids[i];
    }
}
Hopefully this gives you an idea of how to use arrays.
Running it:
section=10;
awk -F";" -v var="$section" -f script.awk test.txt
# -f script.awk reads the awk program from the file instead of the command line
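The sample test.txt contains no tie, so here is a rough sketch of the array approach on input that does. It feeds awk the section-10 rows plus one made-up row (18;10;Ann;35;35) that ties with Ted; the function name tie_demo and the Ann row are assumptions for illustration only. The per-row listing is left out to keep the output short, and the arrays are not deleted: resetting the counter l to 0 is enough here, because the END loop only reads entries 1 through l.

```shell
# Hypothetical input: section-10 rows plus a made-up row (Ann) that ties with Ted.
tie_demo() {
  printf '%s\n' \
    'id;section;name;val1;val2' \
    '11;10;John;50;15' \
    '14;10;Ted;60;10' \
    '15;10;Mary;30;5' \
    '18;10;Ann;35;35' |
  awk -F';' -v var=10 '
    $2 != var { next }                 # skip other sections (and the header)
    { sum = $4 + $5 }
    sum > max  { max = sum; l = 0 }    # new maximum: restart the winner list
    sum >= max { names[++l] = $3; ids[l] = $1 }
    END {
      printf "Max sum for section %s is %d for\n", var, max
      for (i = 1; i <= l; i++)
        printf "Name: %s, ID: %s\n", names[i], ids[i]
    }'
}

tie_demo
# Max sum for section 10 is 70 for
# Name: Ted, ID: 14
# Name: Ann, ID: 18
```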