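I have two files, 1.xml and 2.xml. Both are sorted, and they have different lengths. I want to use awk to compare them and print both the matching and the non-matching lines.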
1.xml:
AGPS=1_<Class>_AGPS -> allowedAudit == false
AGPS=1_<Class>_AGPS -> allowedAudit == false
AGPS=1_<Class>_AGPS -> horizontalAccuracy == 100
AGPS=1_<Class>_AGPS -> horizontalAccuracy == 50
AGPS=1_<Class>_AGPS -> id == 1
AGPS=1_<Class>_AGPS -> id == 2
AGPS=1_<Class>_AGPS -> ionosphericModelAllowed == true
AGPS=1_<Class>_AGPS -> maxNumGPSSatellites == 8
AGPS=1_<Class>_AGPS -> maxNumGPSSatellites == 8
AGPS=1_<Class>_AGPS -> maxUeBasedAGPSProcedureTime == 24
2.xml:
AGPS=1_<Class>_AGPS -> allowedAudit == false
AGPS=1_<Class>_AGPS -> allowedAudit == true
AGPS=1_<Class>_AGPS -> horizontalAccuracy == 120
AGPS=1_<Class>_AGPS -> horizontalAccuracy == 50
AGPS=1_<Class>_AGPS -> id == 1
AGPS=1_<Class>_AGPS -> id == 3
AGPS=1_<Class>_AGPS -> ionosphericModelAllowed == true
AGPS=1_<Class>_AGPS -> maxNumGPSSatellites == 8
Code used:
awk -F"==" 'FNR==NR { array1[$1]=$2;array[$2]=$2; next } { print ($2 in array ? $0 : $0" "array1[$1]" ""NM"), array[$2] }' 2.xml 1.xml
Output:
AGPS=1_<Class>_AGPS -> allowedAudit == false false
AGPS=1_<Class>_AGPS -> allowedAudit == false false
AGPS=1_<Class>_AGPS -> horizontalAccuracy == 100 50 NM
AGPS=1_<Class>_AGPS -> horizontalAccuracy == 50 50
AGPS=1_<Class>_AGPS -> id == 1 1
AGPS=1_<Class>_AGPS -> id == 2 3 NM
AGPS=1_<Class>_AGPS -> ionosphericModelAllowed == true true
AGPS=1_<Class>_AGPS -> maxNumGPSSatellites == 8 8
AGPS=1_<Class>_AGPS -> maxNumGPSSatellites == 8 8
AGPS=1_<Class>_AGPS -> maxUeBasedAGPSProcedureTime == 24 NM
Expected output:
AGPS=1_<Class>_AGPS -> allowedAudit == false false
AGPS=1_<Class>_AGPS -> allowedAudit == false false
AGPS=1_<Class>_AGPS -> horizontalAccuracy == 100 120 NM
AGPS=1_<Class>_AGPS -> horizontalAccuracy == 50 50
AGPS=1_<Class>_AGPS -> id == 1 1
AGPS=1_<Class>_AGPS -> id == 2 3 NM
AGPS=1_<Class>_AGPS -> ionosphericModelAllowed == true true
AGPS=1_<Class>_AGPS -> maxNumGPSSatellites == 8 8
AGPS=1_<Class>_AGPS -> maxNumGPSSatellites == 8 NF
AGPS=1_<Class>_AGPS -> maxUeBasedAGPSProcedureTime == 24 NF
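Lines that have no counterpart in the other file need extra code so that they are flagged as Not Found (NF).
The logic also fails for some of the not-matching (NM) cases, although it works for others.
The actual files are large, and so far I have only had partial success.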
awk to the rescue!
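$ awk -F' == ' '
    # first file: store each value under (key, occurrence number)
    NR==FNR {a[$1,++c[$1]]=$2; next}
    # second file: pair with the same occurrence; differing values get NM
    {print $1 FS $2, v=a[$1,++d[$1]], (v!=$2)?"NM":""; delete a[$1,d[$1]]}
    # entries still left from the first file have no counterpart: NF
    END {for(k in a) {split(k,ks,SUBSEP); print ks[1] FS a[k],"NF"}}
  ' file1 file2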
AGPS=1_<Class>_AGPS -> allowedAudit == false false
AGPS=1_<Class>_AGPS -> allowedAudit == true false NM
AGPS=1_<Class>_AGPS -> horizontalAccuracy == 120 100 NM
AGPS=1_<Class>_AGPS -> horizontalAccuracy == 50 50
AGPS=1_<Class>_AGPS -> id == 1 1
AGPS=1_<Class>_AGPS -> id == 3 2 NM
AGPS=1_<Class>_AGPS -> ionosphericModelAllowed == true true
AGPS=1_<Class>_AGPS -> maxNumGPSSatellites == 8 8
AGPS=1_<Class>_AGPS -> maxNumGPSSatellites == 8 NF
AGPS=1_<Class>_AGPS -> maxUeBasedAGPSProcedureTime == 24 NF
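Duplicate keys are handled by making an occurrence counter part of the array key. Records are matched by keeping the counters for the two files in step, so the n-th occurrence of a key in one file is compared with the n-th occurrence in the other. At the end, the leftover records from the first file are printed as NF.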
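P.S. The expected output for record 2 is wrong: the second allowedAudit values are false and true, so that line should be marked NM.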
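The output above is driven by the second file. If you would rather have the orientation used in the question (each line of 1.xml followed by the value from 2.xml), one possible variation on the same counter-in-the-key idea is sketched below. The function name compare_by_occurrence and the array names are made up for illustration, and the sketch assumes every line has the form "key == value" and that the reference file is not empty. Values are compared as strings.

```shell
# compare_by_occurrence REF DRIVER   (hypothetical helper name)
# Prints every line of DRIVER followed by the value found at the same
# (key, occurrence) position in REF, plus NM or NF where appropriate.
compare_by_occurrence() {
    awk -F' == ' '
        # First file (REF): store each value under (key, occurrence number).
        NR == FNR { ref[$1, ++seen[$1]] = $2; next }

        # Second file (DRIVER): look up the same occurrence of the same key.
        {
            n = ++cnt[$1]
            if (!(($1, n) in ref))                print $0, "NF"
            else if ((ref[$1, n] "") == ($2 ""))  print $0, ref[$1, n]
            else                                  print $0, ref[$1, n], "NM"
        }
    ' "$1" "$2"
}

# Argument order matching the question: 2.xml is the reference, 1.xml drives.
# compare_by_occurrence 2.xml 1.xml
```

Unlike the script above, this sketch has no END block, so lines that exist only in the reference file are not reported. It also prints NF in input order rather than at the end.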