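I realize there are a lot of awk NR==FNR questions and answers. I've had some success with them, but I'm missing one piece. I need to update specific fields in file2.csv from file1.csv based on a match on column 1 (guid). That part works. My only problem is that I also need to include the records from file1 that don't exist in file2. How can I do that?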
file1.csv
guid,dyu,source,speech,french,hint1,morph1,morph2
z[fOXUyQT/,ɲyànafin,bk,adj,,,,
y1Ay6~2:)X,cɛ ka ɲi,bk,exp,joli,,,
993e663e-a3e4-4b7a-ba4a-1f4e982e4bf1,yɔ́rɔ,bk,n,endroit,,yɔrɔ,
f920d572-8185-41a1-8876-38940ea58604,bádenya,,,"fraternité, amitié, intime",,badenya,
"AM@~Gc,^|!",dencɛ,bk,,garçon / fils,,,
file2.csv
guid,dyu,source,speech,french,hint1,morph1,morph2
z[fOXUyQT/,ɲyànafin,yd,,nostalgie,,ɲyanafin,
y1Ay6~2:)X,cɛ ka ɲi,yq,,joli,,,
Desired output
guid,dyu,source,speech,french,hint1,morph1,morph2
z[fOXUyQT/,ɲyànafin,bk,adj,nostalgie,,ɲyanafin,
y1Ay6~2:)X,cɛ ka ɲi,bk,exp,joli,,,
993e663e-a3e4-4b7a-ba4a-1f4e982e4bf1,yɔ́rɔ,bk,n,endroit,,yɔrɔ,
So far (this updates the fields but doesn't add the missing records):
awk -F"," 'BEGIN{OFS=","; FPAT = "([^,]*)|(\"[^\"]+\")"}
{
if (NR==FNR) {
guid[$1]=$1;
a3[$1]=$3;
a4[$1]=$4;
next
}
{
if ($1 in guid)
{
$3 = a3[$1];
$4 = a4[$1];
}
print
}
}' file1.csv file2.csv
Any suggestions would be much appreciated.
Assuming neither file has any duplicate $1 values, as in the example in the question:
$ cat tst.awk
BEGIN { FS=OFS="," }
NR==FNR {
file1rec[$1] = $0
next
}
$1 in file1rec {
split(file1rec[$1],file1flds)
$3 = file1flds[3]
$4 = file1flds[4]
delete file1rec[$1]
}
{ print }
END {
for (key in file1rec) {
print file1rec[key]
}
}
$ awk -f tst.awk file1.csv file2.csv
guid,dyu,source,speech,french,hint1,morph1,morph2
z[fOXUyQT/,ɲyànafin,bk,adj,nostalgie,,ɲyanafin,
y1Ay6~2:)X,cɛ ka ɲi,bk,exp,joli,,,
993e663e-a3e4-4b7a-ba4a-1f4e982e4bf1,yɔ́rɔ,bk,n,endroit,,yɔrɔ,
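One caveat with the END loop: the order in which for (key in file1rec) visits keys is unspecified in awk, so the unmatched file1 records may come out in any order. If file1's order matters, one possible sketch is to remember the keys in an extra array as they are read. The file name tst_ordered.awk and the names order and n are only illustrative and are not part of the original script.

```shell
# Write a variant of the script that prints unmatched file1 records in file1 order.
cat > tst_ordered.awk <<'EOF'
BEGIN { FS=OFS="," }
NR==FNR {
    file1rec[$1] = $0
    order[++n] = $1          # hypothetical helper: keys in the order file1 lists them
    next
}
$1 in file1rec {
    split(file1rec[$1], file1flds)
    $3 = file1flds[3]
    $4 = file1flds[4]
    delete file1rec[$1]      # mark this file1 record as already handled
}
{ print }
END {
    # Walk the keys in file1 order and skip the ones deleted above.
    for (i = 1; i <= n; i++) {
        if (order[i] in file1rec) {
            print file1rec[order[i]]
        }
    }
}
EOF
```

It runs the same way as before: awk -f tst_ordered.awk file1.csv file2.csv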
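tst.awk was written before quoted fields containing commas were added to the example. To accommodate those, switch from FS to FPAT, as the script in the question does.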
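A rough sketch of what the FPAT version might look like, assuming GNU awk (FPAT is a gawk extension). Because split() with two arguments splits on FS rather than FPAT, this version saves the two needed fields while reading file1 instead of re-splitting the stored record. The file name tst_fpat.awk and the array names rec, src and spc are illustrative, not from the original answer.

```shell
# Write a gawk-only variant that treats "quoted, fields" as single fields.
cat > tst_fpat.awk <<'EOF'
BEGIN {
    # A field is either a run of non-commas (possibly empty) or a quoted string.
    FPAT = "([^,]*)|(\"[^\"]+\")"
    OFS = ","
}
NR==FNR {
    rec[$1] = $0             # whole file1 record, for the unmatched ones
    src[$1] = $3             # file1 "source" field
    spc[$1] = $4             # file1 "speech" field
    next
}
$1 in rec {
    $3 = src[$1]
    $4 = spc[$1]
    delete rec[$1]           # matched, so do not print it again in END
}
{ print }
END {
    # Remaining file1 records had no match in file2 (order is unspecified).
    for (key in rec) {
        print rec[key]
    }
}
EOF
```

It would be run as: gawk -f tst_fpat.awk file1.csv file2.csv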