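I'm fairly new to Perl and I've run into some difficulty with this project. The goal is to compare two CSV files: one contains $name, $model, $version and the other contains $name2, $disk, $storage. The RESULT file should contain the matching lines with the information combined, like so: $name, $model, $version, $disk, $storage.
I've managed to do this, but my problem is that the program breaks when one of the elements is missing. When it encounters a line in the file with a missing element, it stops at that line. How can I fix this? Any suggestions on how to make it skip that line and carry on?
Here is my code: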
open( TESTING, '>testing.csv' ); # Names will be printed to this during testing. only .net ending names should appear
open( MISSING, '>Missing.csv' ); # Lines with missing name fields will appear here.
#open (FILE,'C:\Users\hp-laptop\Desktop\file.txt');
#my (@array) =<FILE>;
my @hostname; #stores names
#close FILE;
#***** TESTING TO SEE IF ANY OF THE LISTED ITEMS BEGIN WITH A COMMA AND DO NOT HAVE A NAME.
#***** THESE OBJECTS ARE PLACED INTO THE MISSING ARRAY AND THEN PRINTED OUT IN A SEPARATE
#***** FILE.
#open (FILE,'C:\Users\hp-laptop\Desktop\file.txt');
#test
if ( open( FILE, "file.txt" ) ) {
}
else {
die " Cannot open file 1!\n:$!";
}
$count = 0;
$x = 0;
while (<FILE>) {
( $name, $model, $version ) = split(","); #parsing
#print $name;
chomp( $name, $model, $version );
if ( ( $name =~ /^\s*$/ )
&& ( $model =~ /^\s*$/ )
&& ( $version =~ /^\s*$/ ) ) #if all of the fields are blank ( just a blank space)
{
#do nothing at all
}
elsif ( $name =~ /^\s*$/ ) { #if name is a blank
$name =~ s/^\s*/missing/g;
print MISSING "$name,$model,$version\n";
#$hostname[$count]=$name;
#$count++;
}
elsif ( $model =~ /^\s*$/ ) { #if model is blank
$model =~ s/^\s*/missing/g;
print MISSING "$name,$model,$version\n";
}
elsif ( $version =~ /^\s*$/ ) { #if version is blank
$version =~ s/^\s*/missing/g;
print MISSING "$name,$model,$version\n";
}
# Searches for .net to appear in field "$name" if match, it places it into hostname array.
if ( $name =~ /.net/ ) {
$hostname[$count] = $name;
$count++;
}
#searches for a comma in the name field, puts that into an array and prints the line into the missing file.
#probably won't have to use this, as I've found a better method to test all of the fields ( $name,$model,$version)
#and put those into the missing file. Hopefully it works.
#foreach $line (@array)
#{
#if($line =~ /^,+/)
#{
#$line =~s/^,*/missing,/g;
#$missing[$x]=$line;
#$x++;
#}
#}
}
close FILE;
for my $hostname (@hostname) {
print TESTING $hostname . "\n";
}
#for my $missing(@missing)
#{
# print MISSING $missing;
#}
if ( open( FILE2, "file2.txt" ) ) { #Run this if the open succeeds
#open outfile and print starting header
open( RESULT, '>resultfile.csv' );
print RESULT ("name,Model,version,Disk, storage\n");
}
else {
die " Cannot open file 2!\n:$!";
}
$count = 0;
while ( $hostname[$count] ne "" ) {
while (<FILE>) {
( $name, $model, $version ) = split(","); #parsing
#print $name,"\n";
if ( $name eq $hostname[$count] ) # I think this is the problem area.
{
print $name, "\n", $hostname[$count], "\n";
#print RESULT"$name,$model,$version,";
#open (FILE2,'C:\Users\hp-laptop\Desktop\file2.txt');
#test
if ( open( FILE2, "file2.txt" ) ) {
}
else {
die " Cannot open file 2!\n:$!";
}
while (<FILE2>) {
chomp;
( $name2, $Dcount, $vname ) = split(","); #parsing
if ( $name eq $name2 ) {
chomp($version);
print RESULT "$name,$model,$version,$Dcount,$vname\n";
}
}
}
$count++;
}
#open (FILE,'C:\Users\hp-laptop\Desktop\file.txt');
#test
if ( open( FILE, "file.txt" ) ) {
}
else {
die " Cannot open file 1!\n:$!";
}
}
close FILE;
close RESULT;
close FILE2;
I think you want next, which lets you end the current iteration immediately and start the next one:
while (<FILE>) {
    chomp;    # otherwise an empty last field still holds the newline and counts as true
    my ( $name, $model, $version ) = split(",");
    next unless( $name && $model && $version );
    ...;
}
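The condition you use depends on the values you'll accept. In my example, I assumed that all of the values need to be true, which also rejects a field containing just 0. If they only need to be something other than the empty string, you can check the length instead: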
while (<FILE>) {
    chomp;    # otherwise an empty last field still has length 1
    my ( $name, $model, $version ) = split(",");
    next unless( length($name) && length($model) && length($version) );
    ...;
}
If you know how to validate each field, you might have subroutines for that:
while (<FILE>) {
    chomp;
    my ( $name, $model, $version ) = split(",");
    next unless( length($name) && is_valid_model($model) && length($version) );
    ...;
}
sub is_valid_model { ... }
Now you just need to decide how to integrate that into what you are already doing.
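
As a rough sketch of how such a validation subroutine could slot into the loop: the rule below (letters, digits, underscores, dots and dashes only) and the sample lines are purely hypothetical stand-ins, so substitute whatever your real data requires.

```perl
use strict;
use warnings;

# Hypothetical rule: a model may contain only word characters, dots and dashes.
sub is_valid_model {
    my ($model) = @_;
    return defined $model && $model =~ /\A[\w.-]+\z/;
}

# Made-up sample lines standing in for the contents of the first file.
my @lines = ( "a.net,DL380,1.0\n", ",DL380,1.0\n", "b.net,,2.0\n", "c.net,bad model,3\n" );

my @kept;
for my $line (@lines) {
    chomp $line;
    # A limit of -1 keeps trailing empty fields instead of dropping them.
    my ( $name, $model, $version ) = split /,/, $line, -1;
    next unless length($name) && is_valid_model($model) && length($version);
    push @kept, $name;
}
print "@kept\n";    # prints "a.net"
```

Only the first line survives: the second has no name, the third has no model, and the fourth fails the hypothetical model rule.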
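You should start by adding use strict and use warnings to the top of your program, and declaring all of your variables with my at their first point of use. That will reveal a lot of simple errors that are otherwise hard to spot.
You should also use the three-parameter form of open with lexical filehandles. The Perl idiom for checking that a file opened successfully is to add or die to the open call; an if statement with an empty block for the success path wastes space and is hard to read. An open call should look like this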
open my $fh, '>', 'myfile' or die "Unable to open file: $!";
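Finally, it is much safer to use a Perl module when you are handling CSV files, as there are a lot of pitfalls in using a simple split /,/. The Text::CSV module has done all the work for you and is available on CPAN.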
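
To make two of those pitfalls concrete, here is a small illustration that uses only core Perl; the sample records are invented for the demonstration.

```perl
use strict;
use warnings;

# A quoted field that contains a comma is split in the wrong place.
my @quoted = split /,/, 'host1.net,"ProLiant, Gen8",2.1';
print scalar(@quoted), "\n";    # prints 4, although the record has 3 fields

# Trailing empty fields are silently dropped unless a negative limit is given.
my @dropped = split /,/, 'host2.net,,';
my @kept    = split /,/, 'host2.net,,', -1;
print scalar(@dropped), " ", scalar(@kept), "\n";    # prints "1 3"
```

A CSV parser handles the quoting case for you, and it returns every field of a record, including empty ones.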
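Your problem is that, having read to the end of the first file, you don't rewind or reopen it before reading from the same handle again in the second nested loop. That means no more data will be read from that file, and the program behaves as if the file were empty.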
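
The effect can be seen in a minimal sketch. It assumes an in-memory filehandle as a stand-in for file.txt, with made-up contents, and uses seek to rewind the handle.

```perl
use strict;
use warnings;

# An in-memory filehandle stands in for file.txt (sample data is made up).
my $content = "a.net,m1,v1\nb.net,m2,v2\n";
open my $fh, '<', \$content or die "Cannot open in-memory file: $!";

my $first_pass = () = <$fh>;    # reads both lines
my $at_eof     = () = <$fh>;    # handle is at end of file: nothing left to read

seek $fh, 0, 0 or die "Cannot rewind: $!";    # move back to the start
my $after_seek = () = <$fh>;    # the lines are readable again

print "$first_pass $at_eof $after_seek\n";    # prints "2 0 2"
```

Rewinding makes a loop like yours work, but it still rereads the file for every name, which is what the next point addresses.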
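It is a bad strategy to read through the same file hundreds of times just to pair up corresponding records. If the files are a reasonable size, you should build a data structure in memory to hold the information. A Perl hash is ideal, as it lets you look up the data that corresponds to a given name instantly.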
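
The idea in miniature, as a sketch on invented records held in arrays rather than files: plain split is used only because the sample data is known to be simple, and the names %by_name and @joined are illustrative.

```perl
use strict;
use warnings;

# Made-up sample records standing in for the two files.
my @file1 = ( 'a.net,DL380,1.0', 'b.net,DL360,2.0' );
my @file2 = ( 'b.net,4,san1', 'c.net,2,san2' );

# Index the first file by name: one pass, then direct lookups.
my %by_name;
for my $line (@file1) {
    my ( $name, $model, $version ) = split /,/, $line, -1;
    $by_name{$name} = [ $name, $model, $version ];
}

# Walk the second file once and join on the name.
my @joined;
for my $line (@file2) {
    my ( $name, $disk, $storage ) = split /,/, $line, -1;
    my $record = $by_name{$name} or next;    # no match in the first file
    push @joined, join ',', @$record, $disk, $storage;
}
print "$_\n" for @joined;    # prints "b.net,DL360,2.0,4,san1"
```

Each file is read exactly once, so the work grows with the total number of lines rather than with the product of the two file sizes.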
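I have written a revised version of your code to demonstrate these points. It is awkward for me to test the code as I have no sample data, but please let us know if you still have problems.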
use strict;
use warnings;
use Text::CSV;

# One object parses the input files; a second one, with eol set, writes output lines
my $csv = Text::CSV->new({ binary => 1 })
    or die 'Cannot use Text::CSV: ' . Text::CSV->error_diag;
my $out = Text::CSV->new({ binary => 1, eol => "\n" })
    or die 'Cannot use Text::CSV: ' . Text::CSV->error_diag;
my %data;

# Count the fields in a list that are defined and non-empty
sub filled { return scalar grep { defined $_ && length $_ } @_ }

# Read the name, model and version from the first file. Write any records
# that don't have the full three fields to the "MISSING" file
#
open my $f1, '<', 'file.txt' or die qq(Cannot open file 1: $!);
open my $missing, '>', 'Missing.csv'
    or die qq(Unable to open "MISSING" file for output: $!);
# Lines with missing fields will appear here.
while ( my $line = $csv->getline($f1) ) {
    my $name = $line->[0];
    if ( filled(@$line) < 3 ) {
        $out->print($missing, $line);
    }
    else {
        $data{$name} = $line if $name =~ /\.net$/i;
    }
}
close $missing;

# Put a list of .net names found into the testing file
#
open my $testing, '>', 'testing.csv'
    or die qq(Unable to open "TESTING" file for output: $!);
# Names will be printed to this during testing. Only ".net" ending names should appear
print $testing "$_\n" for sort keys %data;
close $testing;

# Read the name, disk and storage from the second file and check that the line
# contains all three fields. Remove the name field from the start and append
# to the data record with the matching name if it exists.
#
open my $f2, '<', 'file2.txt' or die qq(Cannot open file 2: $!);
while ( my $line = $csv->getline($f2) ) {
    next unless filled(@$line) >= 3;
    my $name = shift @$line;
    next unless $name =~ /\.net$/i;
    my $record = $data{$name};
    push @$record, @$line if $record;
}

# Print the completed hash. Send each record to the result output if it
# has the required five fields
#
open my $result, '>', 'resultfile.csv' or die qq(Cannot open results file: $!);
$out->print($result, [ qw( name Model version Disk storage ) ]);
for my $name (sort keys %data) {
    my $line = $data{$name};
    if ( filled(@$line) >= 5 ) {
        $out->print($result, $line);
    }
}
close $result;