I have a file, file1:
NR2SKRD12BWP210H6P51CNODSVT-(.A1(n7),.A2(),.ZN(n8)); |2
MUX2D2BWP210H6P51CNODSVT-(.I0(n8),.I1(),.S(),.Z(n9)); |4
CKLHQD16BWP210H6P51CNODSVT-(.CPN(#),.E(1),.TE(n9),.Q(n10)); |5
LHCSNQD1BWP210H6P51CNODSVT-(.CDN(n10),.D(),.E(1),.SDN(),.Q(n11)); |6
OAI21D8BWP210H6P51CNODSVT-(.A1(n11),.A2(),.B(),.ZN(n12)); |9
DCCKND16BWP210H6P51CNODSVT-(.I(n12),.ZN(n13)); |10
INVSKFD14BWP210H6P51CNODSVT-(.I(n13),.ZN(n14)); |11
NR2SKRD12BWP210H6P51CNODSVT-(.A1(n7),.A2(n1),.ZN(n8)); |2
MUX2D2BWP210H6P51CNODSVT-(.I0(n8),.I1(n2),.S(),.Z(n9)); |4
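I need to find the lines in file1 whose first field matches the first field of another line. Fields are separated by -.
If a match is found, the first matching line should be deleted.
The output I want is: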
CKLHQD16BWP210H6P51CNODSVT-(.CPN(#),.E(1),.TE(n9),.Q(n10)); |5
LHCSNQD1BWP210H6P51CNODSVT-(.CDN(n10),.D(),.E(1),.SDN(),.Q(n11)); |6
OAI21D8BWP210H6P51CNODSVT-(.A1(n11),.A2(),.B(),.ZN(n12)); |9
DCCKND16BWP210H6P51CNODSVT-(.I(n12),.ZN(n13)); |10
INVSKFD14BWP210H6P51CNODSVT-(.I(n13),.ZN(n14)); |11
NR2SKRD12BWP210H6P51CNODSVT-(.A1(n7),.A2(n1),.ZN(n8)); |2
MUX2D2BWP210H6P51CNODSVT-(.I0(n8),.I1(n2),.S(),.Z(n9)); |4
Here NR2SKRD12BWP210H6P51CNODSVT
and MUX2D2BWP210H6P51CNODSVT
each appear as $1 on two lines, so the first matching line of each should be deleted.
I tried this code:
awk -F'-' 'FNR==NR{a[$1];next} !(($1) in a)' file1
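But this code is meant for finding matching lines between two files and deleting them. How can I find and delete matches within a single file? Only the first matching line should be deleted; the second, third, fourth, and later duplicates should be kept.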
Could you please try the following? It was written and tested in GNU awk
with the samples shown.
awk '
BEGIN{ FS="-" }
FNR==NR{
arr[$1]++
next
}
arr[$1]>1 && ++arrAgain[$1]==1{ next }
1
' Input_file Input_file
Explanation: a detailed, line-by-line explanation of the above.
awk ' ##Starting awk program from here.
BEGIN{ FS="-" } ##Setting field separator as dash here.
FNR==NR{ ##Checking FNR==NR condition which will be TRUE when 1st time Input_file is being read.
arr[$1]++ ##Creating array arr with 1st field index and keep increasing its value with 1 on each of its occurrence.
next ##next will skip all further statements from here.
}
arr[$1]>1 && ++arrAgain[$1]==1{ ##Checking if arr value with 1st field index is greater than 1 and its first time occurring in arrAgain then skip that line.
next ##next will skip all further statements from here.
}
1 ##1 will print current line.
' Input_file Input_file ##Passing Input_file twice so that it is read in two passes.
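To see how the two-pass program treats a key that occurs more than twice, here is a minimal sketch. The six-line sample (keys A, B, C) and the names sample and result are made up for illustration and are not from the question; the awk program is the one shown above.

```shell
# Hypothetical sample: key A occurs three times, B twice, C once.
sample=$(mktemp)
printf '%s\n' 'A-1' 'B-2' 'A-3' 'C-4' 'A-5' 'B-6' > "$sample"

# Same two-pass program as above; the file is named twice on purpose.
result=$(awk '
BEGIN { FS = "-" }
FNR == NR { arr[$1]++; next }                # pass 1: count each key
arr[$1] > 1 && ++arrAgain[$1] == 1 { next }  # pass 2: skip the first line of a repeated key
1                                            # print every other line
' "$sample" "$sample")

printf '%s\n' "$result"
rm -f "$sample"
```

On this sample only A-1 and B-2 are dropped; A-3, C-4, A-5 and B-6 are printed, so later repeats of a key are kept.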
This might work for you (GNU sed):
sed -E 'H;x;s/(\n[^-]*-)[^\n]*(.*\1)/\2/;x;$!d;x;s/.//' file
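Append a copy of the current line to the hold space.
If the current line's key already occurs earlier in the hold space, delete that earlier line.
At the end of the file, swap to the hold space, delete the leading newline introduced by the appending, and print the result.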
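The hold-space steps can be checked on a small input. This is a sketch that assumes GNU sed (for \n inside a bracket expression and backreferences with -E) and writes the newline escapes \n and the backreferences \1 and \2 out in full; the sample data and the name sed_result are hypothetical.

```shell
# Hypothetical sample: key A occurs three times, B twice, C once.
sample=$(mktemp)
printf '%s\n' 'A-1' 'B-2' 'A-3' 'C-4' 'A-5' 'B-6' > "$sample"

# H;x   append the current line to the hold space, then look at the whole buffer
# s///  if a key (newline, text, dash) reappears later, drop the earlier line
# x;$!d swap back and, unless this is the last line, start the next cycle
# x;s/.// at end of file, take the buffer and strip its leading newline
sed_result=$(sed -E 'H;x;s/(\n[^-]*-)[^\n]*(.*\1)/\2/;x;$!d;x;s/.//' "$sample")

printf '%s\n' "$sed_result"
rm -f "$sample"
```

Because the substitution runs each time a key repeats, a key that occurs three or more times keeps only its last line: this sample prints C-4, A-5 and B-6. For the data in the question, where each key occurs at most twice, that is the same as deleting the first match.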
Another awk:
$ awk -F- 'NR==FNR{a[$1]++; next} !(--a[$1])' file{,}
CKLHQD16BWP210H6P51CNODSVT-(.CPN(#),.E(1),.TE(n9),.Q(n10)); |5
LHCSNQD1BWP210H6P51CNODSVT-(.CDN(n10),.D(),.E(1),.SDN(),.Q(n11)); |6
OAI21D8BWP210H6P51CNODSVT-(.A1(n11),.A2(),.B(),.ZN(n12)); |9
DCCKND16BWP210H6P51CNODSVT-(.I(n12),.ZN(n13)); |10
INVSKFD14BWP210H6P51CNODSVT-(.I(n13),.ZN(n14)); |11
NR2SKRD12BWP210H6P51CNODSVT-(.A1(n7),.A2(n1),.ZN(n8)); |2
MUX2D2BWP210H6P51CNODSVT-(.I0(n8),.I1(n2),.S(),.Z(n9)); |4
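This scans the file twice: the first pass counts the occurrences of each key, and the second pass prints only the last occurrence of each key.
To remove only the first duplicate: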
awk -F- 'NR==FNR {++a[$1]; next} a[$1]==1; {a[$1]=1}' file file
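This reads the same file twice. The first pass counts each $1, and those counts decide what happens on the second pass: a line is printed only if its count is 1, and after each line the count for its key is reset to 1, so only the first line of a repeated key is dropped.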
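The two one-liners give the same result when a key occurs at most twice, but they differ for keys that occur three or more times. A short comparison, assuming a made-up six-line sample and the hypothetical variable names keep_last and drop_first:

```shell
# Hypothetical sample: key A occurs three times, B twice, C once.
sample=$(mktemp)
printf '%s\n' 'A-1' 'B-2' 'A-3' 'C-4' 'A-5' 'B-6' > "$sample"

# Variant 1: print a line only when the count for its key drops to zero,
# which happens at the last occurrence of that key.
keep_last=$(awk -F- 'NR==FNR {a[$1]++; next} !(--a[$1])' "$sample" "$sample")

# Variant 2: print a line only if the count for its key is 1, then reset the
# count to 1, so only the first line of a repeated key is skipped.
drop_first=$(awk -F- 'NR==FNR {++a[$1]; next} a[$1]==1; {a[$1]=1}' "$sample" "$sample")

printf 'keep last only:\n%s\n' "$keep_last"
printf 'drop first only:\n%s\n' "$drop_first"
rm -f "$sample"
```

On this sample the first variant prints C-4, A-5, B-6, while the second prints A-3, C-4, A-5, B-6. The second matches the requirement to keep the second, third, and later repeats.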