I have a tab-delimited file, abc.txt. Its data looks like this:
Pytul_T015270 Protein of unknown function
Pytul_T015269 Protein of unknown function
Pytul_T015255 Protein of unknown function
Pytul_T015297 Protein of unknown function
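I am writing a parser that takes abc.txt and two other files as input and parses them by calling different subroutines from a package, utility.pm.
The subroutine that parses abc.txt is defined in my package utility.pm as follows: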
use strict;
sub readblast{
    my $fileName = shift;
    my %hash;
    my %geneNameHash;
    open PRED, $fileName or die "Can't open file $!\n";
    while (my $line=<PRED>) {
        chomp $line;
        #print $line,"\n";
        (my $gene,my $desc) = split /\t/, $line;
        $hash{$gene} = $desc;
    }
    close(PRED);
    return %hash;
}
My parser.pl script uses the hash as follows:
my %blast=&utility::readblast($ARGV[2]);
for my $mRNA(keys %{ $featureHash{$scaffold}{$gene}}){
my $desc = $blast{$mRNA};
}
Here %featureHash is another hash that I generated from another file, and $ARGV[2] holds the file abc.txt.
But the output of $desc is blank, and I get the error:
Use of uninitialized value $desc in concatenation (.) or string at parser.pl
my $desc = $blast{$mRNA};
What is wrong? Why is it not storing the second column of abc.txt?
The following guards against trailing empty lines and possible non-tab separators (by using split with a limit):
#!/usr/bin/env perl
package My::Utility;
use strict;
use warnings;
sub read_blast {
    my $fh = shift;
    my %hash;
    while (my $line = <$fh>) {
        chomp $line;
        last unless $line =~ /\S/;
        my ($key, $value) = split ' ', $line, 2;
        $hash{ $key } = $value;
    }
    return \%hash;
}
package main;
my $blast = My::Utility::read_blast(\*DATA);
while (my ($k, $v) = each %$blast) {
    print "'$k' => '$v'\n";
}
__DATA__
Pytul_T015270 Protein of unknown function
Pytul_T015269 Protein of unknown function
Pytul_T015255 Protein of unknown function
Pytul_T015297 Protein of unknown function
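
If the warning persists, it may help to check whether the keys in the hash really match the IDs being looked up. The sketch below is only an illustration under a few assumptions: read_blast_file is a hypothetical helper (not part of the original utility.pm), the sample file is generated on the fly, and Pytul_T099999 is a made-up ID that is deliberately absent. It reads the file by name with a three-argument open, strips both LF and CRLF line endings, and dumps the keys with Data::Dumper's Useqq option so that stray tabs or carriage returns show up as escapes.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Data::Dumper;
use File::Temp qw(tempfile);

# Hypothetical helper (not from the original post): read a two-column,
# tab-separated file by name and return a reference to a hash mapping
# each ID to its description.
sub read_blast_file {
    my ($path) = @_;
    open my $fh, '<', $path or die "Can't open '$path': $!\n";
    my %desc_for;
    while (my $line = <$fh>) {
        $line =~ s/\r?\n\z//;         # strip LF or CRLF line endings
        next unless $line =~ /\S/;    # skip blank lines
        my ($id, $desc) = split /\t/, $line, 2;
        $desc_for{$id} = $desc;       # undef if the line had no tab
    }
    close $fh;
    return \%desc_for;
}

# Write a small sample file so the sketch runs on its own.
my ($out, $path) = tempfile(UNLINK => 1);
print {$out} "Pytul_T015270\tProtein of unknown function\r\n";
print {$out} "Pytul_T015269\tProtein of unknown function\n";
print {$out} "\n";
close $out;

my $blast = read_blast_file($path);

# Guarded lookup: report a missing or empty entry instead of
# silently using undef in a string.
for my $mRNA ('Pytul_T015270', 'Pytul_T099999') {
    if (defined $blast->{$mRNA}) {
        print "$mRNA => $blast->{$mRNA}\n";
    }
    else {
        warn "No description for '$mRNA'\n";
    }
}

# Useqq prints invisible characters in the keys (tabs, carriage
# returns) as visible escapes such as \t and \r.
{
    local $Data::Dumper::Useqq = 1;
    print Dumper([ sort keys %$blast ]);
}
```

If the dumped keys contain the whole line, the file is probably not tab-separated and splitting on whitespace (split ' ', $line, 2) is the safer choice. If the keys look right but the lookup still fails, the IDs produced from the other input file are the next thing to inspect.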