Filtering a CSV file with AWK



I have a CSV file with a lot of data in it. I want to filter it selectively, and I tried this:

#!/bin/bash
cat my_file.csv | while read line
do
     awk_variables=`echo "$line" | awk -F, '( $13 == "*OUT*" ) || ( $13 == "*IN*" ) {print $1, $5, $10, $12, $13}'`
     echo "$awk_variables" >> my_file.csv
done

Data sample:

1427783525,e,gf,StackExchange,HDCU3000623,d,scan,211,47969,60,1420739235,Sensor,Module,v1.06,1,*IN*
1426661348,e,gf,StackExchange,HDCU3000623,d,scan,197,48066,3,1420703835,Sensor,Module,v1.06,2,*OUT*
1426661355,e,gf,StackExchange,HDCU3000623,d,scan,197,54949,4,1420703822,*OUT*
1426661362,e,gf,StackExchange,HDCU3000623,d,scan,197,61971,5,1420703835,Sensor,Module,v1.06,4,*OUT*
1426661369,e,gf,StackExchange,HDCU3000623,d,scan,197,68947,6,1420705615,*OUT*
1426661376,e,gf,StackExchange,HDCU3000623,d,scan,197,75948,7,1420706218,*OUT*
1426661383,e,gf,StackExchange,HDCU3000623,d,scan,197,82948,8,1420707784,*OUT*
1426661390,e,gf,StackExchange,HDCU3000623,d,scan,197,89947,9,1420707801,*OUT*
1426661397,e,gf,StackExchange,HDCU3000623,d,scan,197,96969,10,1420708035,Sensor,Module,v1.06,9,*OUT*
1426740345,e,gf,StackExchange,HDCU3000623,d,scan,198,47971,11,1420708635,Sensor,Module,v1.06,1,*OUT*
1426740352,e,gf,StackExchange,HDCU3000623,d,scan,198,54964,12,1420708646,H11HDCU3000623,*OUT*
1426740359,e,gf,StackExchange,HDCU3000623,d,scan,198,61963,13,1420708647,H11HDCU3000623,*OUT*
1426740366,e,gf,StackExchange,HDCU3000623,d,scan,198,68963,14,1420708648,H11HDCU3000623,*OUT*
1426740379,e,gf,StackExchange,HDCU3000623,d,scan,198,82948,15,1420708700,*OUT*
1426740493,e,gf,StackExchange,HDCU3000623,d,status,199,44000,Run,#199.,Scans,to,date:,0
1426740497,e,gf,StackExchange,HDCU3000623,d,scan,199,47971,16,1420708635,Sensor,Module,v1.06,1,*OUT*
1426740504,e,gf,StackExchange,HDCU3000623,d,scan,199,54963,17,1420708649,H11HDCU3000623,*OUT*
1426740700,e,gf,StackExchange,HDCU3000623,d,scan,199,250950,18,1420708871,H07W770275,*OUT*
1426740710,m,gf,TMX6BP,075,d,SVlts,288,33604,27352,27352,948
1426740721,m,gf,TMX6BP,075,d,status,288,44000,183139,-33.836465,151.051189
1426740721,e,gf,StackExchange,HDCU3000623,d,scan,199,271941,19,1420708887,H07W770275,*OUT*
1426740728,e,gf,StackExchange,HDCU3000623,d,scan,199,278941,20,1420708888,H07W770275,*OUT*

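The problem is that when I open my_file.csv, the data is filtered the way I want, except that there are lots of blank lines. Each blank line comes from an input row that did not match the awk condition. How can I modify the code above so that the filtered data is written starting from the first line, with no blank lines?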

So my output is:

1426740504 HDCU3000623 17 H11HDCU3000623 *OUT*
1426740700 HDCU3000623 18 H07W770275 *OUT*

1426740721 HDCU3000623 19 H07W770275 *OUT*



1426740728 HDCU3000623 20 H07W770275 *OUT*

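echo prints a newline every time it is called, even when the variable it is given is empty, which is what happens for every row that awk filters out. Rather than running awk once per line, it is better to run it once over the whole file, like this: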

awk -F, '($13=="*OUT*")||($13=="*IN*"){print $1,$5,$10,$12,$13}' jacon_mqtt.csv > my_file.csv
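To see both the cause and the fix on a small scale, here is a minimal sketch. The file names sample.csv and filtered.txt are hypothetical, chosen only for this illustration; the three input rows are copied from the data sample in the question.

```shell
#!/bin/bash
# Hypothetical sample input: two rows with *OUT* in field 13, one row without.
cat > sample.csv <<'EOF'
1426740504,e,gf,StackExchange,HDCU3000623,d,scan,199,54963,17,1420708649,H11HDCU3000623,*OUT*
1426740710,m,gf,TMX6BP,075,d,SVlts,288,33604,27352,27352,948
1426740700,e,gf,StackExchange,HDCU3000623,d,scan,199,250950,18,1420708871,H07W770275,*OUT*
EOF

# Cause: echo with an empty string still emits one (empty) line.
empty=""
echo "$empty" | grep -c ''    # counts the lines echo produced: 1

# Fix: a single awk pass over the whole file. Rows that do not match
# the condition produce no output at all, so no blank lines appear.
awk -F, '$13 == "*OUT*" || $13 == "*IN*" { print $1, $5, $10, $12, $13 }' \
    sample.csv > filtered.txt

cat filtered.txt
```

With this input, filtered.txt should contain only the two matching rows. Note that the output is written to a different file from the input; writing to the file that is being read, as the loop in the question does, is best avoided.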