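I want to match two files on the values of two columns in each file. If both "BP" and "P" match on the same row, I want to print the matching rows to a third file, in the format shown under the expected output below.
File 1: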
CHR BP BETA SE P PHENOTYPE FDR CATEGORY SNP
10 110408937 3.386e+00 1.333e+00 1.112e-02 1 1 Medication rs113627704
10 110408937 4.409e+00 1.623e+00 6.602e-03 2 1 Cardiovascular rs113627704
10 110408937 2.382e+00 1.124e+00 3.414e-02 3 1 Medication rs113627704
File 2:
CHR F SNP BP P TOTAL
10 1 rs113627704 110408937 1.112e-02 456
4 1 rs43567 2345677 0.045457 567
3 1 rs567899 479899 0.3456 223
Expected output:
CHR BP BETA SE P PHENOTYPE FDR CATEGORY SNP
10 110408937 3.386e+00 1.333e+00 1.112e-02 1 1 Medication rs113627704
I tried the following two commands:
awk 'FNR==NR{a[$4,$5]=$0;next}{if(b=a[$2,$5]){print b}}' file1 file2 > file3
Here I get the error "bash: awk: command not found". I use awk all the time and it has always worked.
awk 'FNR==NR {a[$4,$5]=$0; next} ($4,$5) in a {print a[$2,$5], $0}' file1 file2 > file3
This one produces an empty file.
This should work:
$ awk 'NR==FNR{a[$4,$5]=$0;next}(($2,$5) in a)' file2 file1
Output:
CHR BP BETA SE P PHENOTYPE FDR CATEGORY SNP
10 110408937 3.386e+00 1.333e+00 1.112e-02 1 1 Medication rs113627704
Explanation:
$ awk '
NR==FNR {             # first file (file2): build the lookup table
    a[$4,$5]=$0       # key on the 4th (BP) and 5th (P) fields
    next              # skip to the next record
}                     # from here on, only file1 records are processed
(($2,$5) in a)        # print the file1 line if its 2nd (BP) and 5th (P) fields are in the table
' file2 file1         # mind the file order
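To reproduce this, here is a minimal sketch that recreates the two sample files from the question in a scratch directory and runs the same command, writing the result to file3. The scratch directory and the here-documents are assumptions made for illustration only; the awk program itself is unchanged.

```shell
# Work in a scratch directory so no existing files are overwritten (illustrative).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Sample data copied from the question.
cat > file1 <<'EOF'
CHR BP BETA SE P PHENOTYPE FDR CATEGORY SNP
10 110408937 3.386e+00 1.333e+00 1.112e-02 1 1 Medication rs113627704
10 110408937 4.409e+00 1.623e+00 6.602e-03 2 1 Cardiovascular rs113627704
10 110408937 2.382e+00 1.124e+00 3.414e-02 3 1 Medication rs113627704
EOF

cat > file2 <<'EOF'
CHR F SNP BP P TOTAL
10 1 rs113627704 110408937 1.112e-02 456
4 1 rs43567 2345677 0.045457 567
3 1 rs567899 479899 0.3456 223
EOF

# file2 is read first to build the lookup table; matching file1 rows go to file3.
awk 'NR==FNR{a[$4,$5]=$0;next}(($2,$5) in a)' file2 file1 > file3
cat file3
```

Two things worth knowing: the header line is printed only because the column names BP and P happen to match as keys too, and the keys are compared as strings, so a P value written as 0.01112 in one file would not match 1.112e-02 in the other.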
The word awk in your command contains some invisible characters:
awk 'FNR==NR{a[$4,$5]=$0;next}{if(b=a[$2,$5]){print b}}' file1 file2 > file3
Using the string copied from your command:
$ type awk
-bash: type: awk: not found
Typing awk by hand:
$ type awk
awk is hashed (/usr/bin/awk)
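One way to confirm hidden characters like this is to dump the bytes of the suspicious word. The sketch below is only an illustration: which invisible character ended up in the original command is not known, so a zero-width space (U+200B) is assumed here, and the variable names are made up.

```shell
# A word with a zero-width space (U+200B, UTF-8 bytes e2 80 8b) in front of "awk".
# Assumption: the actual hidden character in the question may be a different one.
suspect=$(printf '\342\200\213awk')
clean='awk'

# od -c prints every byte, so hidden bytes show up as octal values.
printf '%s' "$suspect" | od -c
printf '%s' "$clean" | od -c

# The byte counts differ even though both strings look the same on screen.
suspect_len=$(printf '%s' "$suspect" | wc -c | tr -d ' ')
clean_len=$(printf '%s' "$clean" | wc -c | tr -d ' ')
echo "suspect: $suspect_len bytes, clean: $clean_len bytes"
```

If the byte dump of a pasted command shows anything besides the expected letters, retyping the command by hand, as above, is the simplest fix.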