AWK: printing a combination of string + bash variable + string



I am trying to rename the contigs in a fasta file with the isolate ID, and to number the contigs from 1 to n using awk.

Fasta file:

>NODE_1_length_172477_cov_46.1343
GCAGGGCGCAGTTTTTGGAGGCTTGGCAAACCCGTGAGGGAAATTTGGCAGGCAAAATTT
TGGCGGTCGTGCCGAAAAAAGCGGAGGCGATTTCAAATAAATTGTTTTTCACACATCATC
CCAAGCGGCAGACGGAGTTTGCAGTCGGACAAATCAGGCAAGGGCGCGCAGAGTAAGTCA

The isolate ID is a variable because I am doing this for multiple files. I have gotten it to print isolateIDnumber, but I need >isolateID_number

for file in /dir/*.fasta
do
name=$(basename "$file" .fasta)
awk '/^>/{print "'"$name"'" ++i; next}{print}' $file > rename.fasta
done;

This gives me:

15AR07771
GCAGGGCGCAGTTTTTGGAGGCTTGGCAAACCCGTGAGGGAAATTTGGCAGGCAAAATTT
TGGCGGTCGTGCCGAAAAAAGCGGAGGCGATTTCAAATAAATTGTTTTTCACACATCATC
CCAAGCGGCAGACGGAGTTTGCAGTCGGACAAATCAGGCAAGGGCGCGCAGAGTAAGTCA

Desired output:

>15AR0777_1
GCAGGGCGCAGTTTTTGGAGGCTTGGCAAACCCGTGAGGGAAATTTGGCAGGCAAAATTT
TGGCGGTCGTGCCGAAAAAAGCGGAGGCGATTTCAAATAAATTGTTTTTCACACATCATC
CCAAGCGGCAGACGGAGTTTGCAGTCGGACAAATCAGGCAAGGGCGCGCAGAGTAAGTCA

The question is: where do I put the strings so that it prints >15AR0777_1 instead of 15AR07771?

I have tried the following variations, but none of them worked:

awk '/^>/{print ">'"$name"'" "_" ++i; next}{print}' $file > rename.fasta
awk '/^>/{print ">'"$name"'" _++i; next}{print}' $file > rename.fasta

Thanks!

Use awk -v awk_var="$bash_var" to pass a shell variable into the awk script. From man awk:

-v var=val
--assign var=val
Assign the value val to the variable var, before execution of the program begins.  Such variable values are available to the
BEGIN rule of an AWK program.

That is:

for file in dir/*.fasta
do
    name=$(basename "$file" .fasta)    # isolate ID = file name without .fasta
    awk -v name="$name" '/^>/{print ">" name "_" ++i; next}{print}' "$file" > rename.fasta
done
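
The string concatenation in that print statement can be checked on its own, without any fasta file. The following is only a sketch: make_header is a hypothetical helper name, and the ID and number are passed in by hand rather than taken from a file.

```shell
# make_header is a hypothetical helper, used only to illustrate -v.
make_header() {
    # $1 = isolate ID, $2 = contig number
    # In awk, writing values side by side concatenates them.
    awk -v id="$1" -v n="$2" 'BEGIN { print ">" id "_" n }'
}

make_header 15AR0777 1   # prints >15AR0777_1
```

Because the shell variable arrives as an ordinary awk variable, the awk program can stay inside a single pair of single quotes, with no quote splicing.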

Here is an all-awk version:

awk '
FNR==1 {                            # new file, close old and make name for new
    close(f)                        # close the old output file
    n=FILENAME                      # get filename of the new file
    gsub(/^.*\/|\.fasta$/,"",n)     # remove path and .fasta
    f="rename_" n ".fasta"          # new output file
    i=0                             # restart contig numbering for each file
}
/^>/ {
    $0=">" n "_" ++i                # >name_number
}
{
    print > f                       # print to output file
}' dir/*.fasta                      # process .fasta files in dir

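If a file dir/15AR07771.fasta exists, the script will produce a file ./rename_15AR07771.fasta. (Your version writes all output files to rename.fasta and does not even append, so you may want to fix that.)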
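
If you would rather keep the shell loop, one way to avoid overwriting rename.fasta is to build the output name from the isolate ID. This is a sketch under assumptions: rename_contigs is a hypothetical function name, the output directory argument is my addition, and the rename_ prefix simply mirrors the awk version above.

```shell
# rename_contigs is a hypothetical helper; adjust names and paths to taste.
rename_contigs() {
    # $1 = directory with the input .fasta files, $2 = output directory
    for file in "$1"/*.fasta
    do
        [ -e "$file" ] || continue         # skip when the glob matched nothing
        name=$(basename "$file" .fasta)    # isolate ID from the file name
        # A separate awk run per file, so i starts from 0 each time.
        awk -v name="$name" '/^>/{print ">" name "_" ++i; next}{print}' \
            "$file" > "$2/rename_$name.fasta"
    done
}
```

Called as rename_contigs dir . it should write one rename_<ID>.fasta per input file into the current directory, with contigs numbered from 1 in each file.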
