I have a problem like this:
File 1 (a very large file):
fid rsid activity
1 rs111 we drink
2 rs112 we drink
3 rs113 we eat
4 rs114 we are happy
5 rs115 we eat
...
File 2 (activity categories):
we drink 1
we eat 2
we are happy 3
others 4
...
I want to replace the activity name with its activity code (or generate another column), to get something like
fid rsid code activity
1 rs111 1 we drink
2 rs112 1 we drink
3 rs113 2 we eat
4 rs114 3 we are happy
5 rs115 2 we eat
...
How can I do this with Unix commands (awk, for example)?
Many thanks! Eric
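In general, one would use join for this kind of problem, but the "field" to join on is awkward here because the activity name contains spaces. With awk, you would first read the "activities" file to build a dictionary, then read the "big file", inserting the activity code as you go: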
awk 'NR == FNR { val = $NF; $NF = ""; sub(/ *$/, "", $0); a[$0] = val; next }  # first file: map activity name to code
     { fid = $1; rsid = $2; $1 = ""; $2 = ""; sub(/^ */, "", $0);              # second file: blank fid and rsid, leaving the name
       print fid, rsid, a[$0], $0 }' activities bigfile
Insert a header as needed.
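For what it's worth, here is a rough sketch of one way to handle the header and unmatched rows in the same pass. It rests on a few assumptions that are not in the question: the first line of the big file is the header, any activity missing from the dictionary should get the code listed for "others", and the file names activities, bigfile and coded.txt are just placeholders. The row "6 rs116 we sleep" is a made-up example added only to exercise the fallback.

```shell
# Sample inputs based on the question (file names are placeholders).
cat > activities <<'EOF'
we drink 1
we eat 2
we are happy 3
others 4
EOF

cat > bigfile <<'EOF'
fid rsid activity
1 rs111 we drink
2 rs112 we drink
3 rs113 we eat
4 rs114 we are happy
5 rs115 we eat
6 rs116 we sleep
EOF

# Pass 1 (activities): build the name -> code dictionary.
# Pass 2 (bigfile): rewrite the header, then look up each activity,
# falling back to the "others" code when the name is unknown (assumption).
awk 'NR == FNR { val = $NF; $NF = ""; sub(/ *$/, "", $0); a[$0] = val; next }
     FNR == 1  { print "fid", "rsid", "code", "activity"; next }
     { fid = $1; rsid = $2; $1 = ""; $2 = ""; sub(/^ */, "", $0)
       code = ($0 in a) ? a[$0] : a["others"]
       print fid, rsid, code, $0 }' activities bigfile > coded.txt

cat coded.txt
```

If you would rather leave unknown activities with an empty code, drop the fallback and print a[$0] directly, as in the one-liner above. Note that rebuilding $0 collapses runs of spaces inside the activity name, which happens the same way in both files, so the lookup still matches.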