分析::FixedLength修剪问题



在我之前的问题中,我问过如何避免Parse::FixedLength修剪所有零。@bolav建议的代码适用于我使用的样本数据,但不知何故,它似乎不适用于我的新数据。

它似乎应该工作,但不知何故,它正在为这些数据修剪所有零。我可能犯了一个非常明显的错误,但我不明白是什么。感谢你的帮助。

    #!/usr/bin/perl
    use strict;
    use warnings; 
    use Parse::FixedLength;
    use Data::Dumper;
    my $parser = Parse::FixedLength->new([
              field1 => '12R0:1:12',
              field2 => '2:13:14',
              field3 => '5R0:15:19',
              field4 => '10R0:20:29',
              field5 => '2R0:30:31',
              field6 => '3R0:32:34'
              ], {trim => '1'});
    $parser->{TPAD}[0] = qr/^0+(?=d)/;   # Modification suggested by @bolav
    while (<DATA>) {
        warn "No record terminator found!n" unless chomp;
        warn "Short Record!n" unless $parser->length == length;
        my $data = $parser->parse($_);
        print Dumper $data;
    }
    __DATA__
    119401122910XX42152931177771001000
    119401122910XX42152931177771001010

最后一个字段应该是010,但它输出blank10更新:我不希望field6的输出是000010——我只需删除trim选项就可以得到。regex本应解决这个问题,但由于某些原因,它没有做到这一点。

$VAR1 = bless( {
                 'field1' => '119401122910',
                 'field6' => '',
                 'field4' => '9311777710',
                 'field2' => 'XX',
                 'field3' => '42152',
                 'field5' => '1'
               }, 'Parse::FixedLength::HashAsObj::Href1' );
$VAR1 = bless( {
                 'field1' => '119401122910',
                 'field6' => '10',
                 'field4' => '9311777710',
                 'field2' => 'XX',
                 'field3' => '42152',
                 'field5' => '1'
               }, 'Parse::FixedLength::HashAsObj::Href1' );

更新

好吧,我读过你最初的问题,更清楚你想要什么。你真的应该让每个问题都独立起来——Stack Overflow不是一个论坛

建议的修改不适用于此配置的原因是$parser->{TPAD}是要从每个对齐字段的前面删除的正则表达式的数组。在您的情况下,它是除field2之外的所有字段。您只修改了数组的第一个元素,因此只修复了field1

这里有一个更通用的修改,它更改$parser->{TPAD}数组的每个元素,使其始终至少保留字段的最后一个字符,无论是什么。请注意,如果您的填充字符是5R等格式的空格,那么它将把一个全空格字段修剪为一个空格,而不是清空

use strict;
use warnings;
use Parse::FixedLength;
use Data::Dump;
my $parser = Parse::FixedLength->new(
    [   field1 => '12R0:1:12',
        field2 => '2:13:14',
        field3 => '5R0:15:19',
        field4 => '10R0:20:29',
        field5 => '2R0:30:31',
        field6 => '3R0:32:34'
    ],
    { trim => 1 }
);
$_ = qr/$_(?=.)/ for @{ $parser->{TPAD} };
while (<DATA>) {
    my $data = $parser->parse($_);
    dd $data;
}
__DATA__
119401122910XX42152931177771001000
119401122910XX42152931177771001010

输出

bless({
  field1 => 119401122910,
  field2 => "XX",
  field3 => 42152,
  field4 => 9311777710,
  field5 => 1,
  field6 => 0,
}, "Parse::FixedLength::HashAsObj::Href1")
bless({
  field1 => 119401122910,
  field2 => "XX",
  field3 => 42152,
  field4 => 9311777710,
  field5 => 1,
  field6 => 10,
}, "Parse::FixedLength::HashAsObj::Href1")



我注意到,如果您简单地删除trim => 1选项,您的代码就会产生您想要的结果。然而,我想你有理由想要这样做,所以这里有一个解决方案

由于Parse::FixedLength允许使用pack模板元素,因此您可以显式指定一个A字段来实际传输数据。它与模块用于其他字段的模板相同,但它禁用了该字段的trim选项

此代码按照的要求执行

    use strict;
    use warnings;
    use Parse::FixedLength;
    use Data::Dump;
    my $parser = Parse::FixedLength->new([
              field1 => '12R0:1:12',
              field2 => '2:13:14',
              field3 => '5R0:15:19',
              field4 => '10R0:20:29',
              field5 => 'A2:30:31',
              field6 => 'A3:32:34'
              ], {trim => '1'});
    while ( <DATA> ) {
        my $data = $parser->parse($_);
        dd $data;
    }
    __DATA__
    119401122910XX42152931177771001000
    119401122910XX42152931177771001010

输出

    bless({
      field1 => 119401122910,
      field2 => "XX",
      field3 => 42152,
      field4 => 9311777710,
      field5 => "01",
      field6 => "000",
    }, "Parse::FixedLength::HashAsObj::Href1")
    bless({
      field1 => 119401122910,
      field2 => "XX",
      field3 => 42152,
      field4 => 9311777710,
      field5 => "01",
      field6 => "010",
    }, "Parse::FixedLength::HashAsObj::Href1")

您只需将修剪到骨骼的值替换为0:

my $data = $parser->parse($_);
for my $val (values %$data) {
    $val =~ s/^$/0/  #If the val is blank, replace with a 0
}

下面是一个完整的例子:

use strict;
use warnings; 
use 5.020;
use Parse::FixedLength;
use Data::Dumper;
my $parser = Parse::FixedLength->new([
          field1 => '12R0:1:12',
          field2 => '2:13:14',
          field3 => '5R0:15:19',
          field4 => '10R0:20:29',
          field5 => '2R0:30:31',
          field6 => '3R0:32:34'
          ], {trim => '1'});
#$parser->{TPAD}[0] = qr/^0+(?=d)/;   # Modification suggested by @bolav
while (<DATA>) {
    warn "No record terminator found!n" unless chomp;
    warn "Short Record!n" unless $parser->length == length;
    my $data = $parser->parse($_);
    for my $val (values %$data) {
        $val =~ s/^$/0/
    }
    #s/(w+)/uL$1/g for @$data{qw(first_name last_name)};
    print Dumper $data;
}
__DATA__
119401122910XX42152931177771001000
119401122910XX42152931177771001010

输出:

$VAR1 = bless( {
                 'field1' => '119401122910',
                 'field5' => '1',
                 'field2' => 'XX',
                 'field3' => '42152',
                 'field4' => '9311777710',
                 'field6' => '0'
               }, 'Parse::FixedLength::HashAsObj::Href1' );
$VAR1 = bless( {
                 'field1' => '119401122910',
                 'field5' => '1',
                 'field2' => 'XX',
                 'field3' => '42152',
                 'field4' => '9311777710',
                 'field6' => '10'
               }, 'Parse::FixedLength::HashAsObj::Href1' );

您可以直接使用unpack()

use strict;
use warnings; 
use 5.020;
use Data::Dumper;
=begin
print " ";
for my $i (1..3) {
    printf '%10s', $i;
}
print("n");
say "0123456789" x 4;
say "119401122910XX42152931177771001000";
--output:--
          1         2         3
0123456789012345678901234567890123456789
119401122910XX42152931177771001000
=cut
# @12 => start at index position 12 in the record (0 based indexing)
# A5  => read 5 characters
my $pattern = <<'END_OF_PATTERN';
@0      A12 
@12     A2
@14     A5
@19     A10
@29     A2
@31     A3
END_OF_PATTERN

while (my $line = <DATA>) {
    my @fields = unpack $pattern, $line;
    for my $field (@fields[-2, -1]) {
        $field =~ s/
                        ^       #Match start of string, followed by...
                        0*      #A literal 0, zero or more times (greedy), followed by...
                        (d+)   #A digit, one or more times, captured in group 1, followed by...
                        $       #The end of the string.
                  /$1/xms;   #Replace all the above with capture group 1.      
    }
    say Dumper @fields;
    say '-' x 10;
}

__DATA__
119401122910XX42152931177771001000
119401122910XX42152931177771001010

输出:

$VAR1 = '119401122910';
$VAR2 = 'XX';
$VAR3 = '42152';
$VAR4 = '9311777710';
$VAR5 = '1';
$VAR6 = '0';
----------
$VAR1 = '119401122910';
$VAR2 = 'XX';
$VAR3 = '42152';
$VAR4 = '9311777710';
$VAR5 = '1';
$VAR6 = '10';
----------

注意,你可以从任何你想要的索引开始,你可以重读记录的部分,等等。例如:

@0  A10    #Start at index 0, read 10 characters
@0  A5     #Go back to index 0, read 5 charters
@20 A2     #Jump to index 20, read 2 characters
@18 A12    #Go back to index 18, read 12 characters

最新更新