Comparing attributes with Perl XML::Twig, then extracting values and saving them in arrays



I have a huge XML file; only part of it is pasted here:

<List NAME="ANDREW" ENROLED="2" FEE="640" CONFORMATION="I"> 
 <DATA>
   <HOUSE>
    <PRIMARY GROUP_ID="37496" SECTION="A"/>
    <PRIMARY GROUP_ID="37496" SECTION="B"/>
   </HOUSE>
  </DATA>
 </List>
 <List NAME="SAM" ENROLED="4" FEE="640"  CONFORMATION="O">
  <DATA>
   <HOUSE>
    <PRIMARY GROUP_ID="36816" SECTION="A"/>
    <PRIMARY GROUP_ID="36816" SECTION="B"/>
   </HOUSE>
  </DATA>
 </List>
  <List NAME="MATHEW" ENROLED="3" FEE="467" CONFORMATION="I">
 <DATA>
   <HOUSE>
    <PRIMARY GROUP_ID="37436" SECTION="A"/>
    <PRIMARY GROUP_ID="37436" SECTION="B"/>
   </HOUSE>
  </DATA>
 </List>
 <List NAME="RAY" ENROLED="1" FEE="982"   CONFORMATION="O">
  <DATA>
   <HOUSE>
    <PRIMARY GROUP_ID="36892" SECTION="A"/>
    <PRIMARY GROUP_ID="36892" SECTION="B"/>
   </HOUSE>
  </DATA>
 </List>

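I am using XML::Twig.

I have to check whether "CONFORMATION" is "I", then get "FEE" and "GROUP_ID" and store them in separate arrays. Likewise, if "CONFORMATION" is "O", I have to get "FEE" and "GROUP_ID" and store them in a different set of arrays.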

use XML::Twig;
my $filename = 'report2.txt';
open( $fh, '>', $filename );
my $twig = new XML::Twig(
    twig_roots => {
        "List"                    => \&add,
        "List/DATA/HOUSE/PRIMARY" => \&update
      }
);
$twig->parsefile("file.xml");
#$twig->print;
sub add  {
    my ( $twig, $add ) = @_;    # handlers params are always
    $cond = $add->att('CONFORMATION');
    $cond2 = $add->att('FEE');
    if ( $cond == 'I' ) {
        sub update {
            my ( $twig, $update ) = @_;
            $check = $update->att('GROUP_ID');
            print $fh " GROUP_ID :$check ";
        }
    } elsif ( $cond == 'O' ) {
        sub update {
            my ( $twig, $update ) = @_;
            $check = $update->att('GROUP_ID');
            print $fh " GROUP_ID :$check ";
        }
        print $fh "CONFORMATION=$cond \n GROUP_ID : $cond2";
    }
}
close $fh;
print "done\n";

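For now I am just trying to print them to a log so that I can move on from there, but it is messed up.

Please help, I am a beginner in Perl. My code is as shown above; it prints everything, but not in order.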

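OK, first off: move the update sub outside the add sub; nesting them like that is just too messy. Named subs in Perl are defined once at compile time, so declaring update inside each branch of the if does not make it conditional; the second definition simply replaces the first.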
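One more thing worth checking in the question's code is the test `$cond == 'I'`. In Perl, `==` compares numbers, and a non-numeric string such as 'I' or 'O' is treated as 0, so the first branch matches for both values; string comparison uses `eq`. The sketch below only illustrates that difference; the `classify` helper and its return labels are hypothetical names made up for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: report which branch a CONFORMATION value selects
# when the comparison is done with the string operator eq.
sub classify {
    my ($conformation) = @_;
    return 'branch I' if $conformation eq 'I';
    return 'branch O' if $conformation eq 'O';
    return 'no branch';
}

{
    # Numeric == converts both non-numeric strings to 0, so this is true.
    no warnings 'numeric';    # silence the "isn't numeric" warning for the demo
    print "'O' == 'I' is ", ( 'O' == 'I' ? 'true' : 'false' ), "\n";
}

# String eq compares the characters themselves, so this is false.
print "'O' eq 'I' is ", ( 'O' eq 'I' ? 'true' : 'false' ), "\n";
print "classify('O') gives ", classify('O'), "\n";
```

With `use warnings` enabled and no `no warnings 'numeric'` block, Perl reports the numeric comparison of such strings at run time, which is a quick way to spot this kind of mistake.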

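XML::Twig works by having "handlers" that "fire" to process pieces of the XML as they are parsed. This is a very lightweight way of handling big files, because a common problem with XML is that it is very memory hungry.

You are overcomplicating this.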

#!/usr/bin/perl
use strict;
use warnings;
use XML::Twig;
my $xml_file = 'file.xml';    # file name taken from the question
# Called once for every complete <List> element.
sub process_list {
    my ( $twig, $list ) = @_;
    my $conformation = $list -> att( 'CONFORMATION' );
    my $fee = $list -> att ( 'FEE' );
    foreach my $primary ( $list -> first_child ( 'DATA' ) -> first_child ('HOUSE') -> children() )
    {
        my $group_id = $primary -> att ( 'GROUP_ID' );
        print "$conformation, $fee, $group_id\n";
         ### here you have the information you need to do the rest of your processing. 
    }
}
my $parser = XML::Twig -> new ( 'twig_handlers' => { 'List' => \&process_list } );
$parser -> parsefile ( $xml_file );

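The handler fires each time the parser sees a 'List' element, and you can then extract the child elements and attributes you need. children gives you a list of elements to loop over.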
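To get from printing to the separate arrays the question asks for, one option is to keep a set of arrays per CONFORMATION value and push into the matching set from inside the handler. The following is a minimal sketch, not the only way to do it: the `%collected` structure is an assumption for illustration, the `<ROOT>` wrapper is hypothetical (the excerpt does not show the real root element), and the XML is parsed from an inline string so the example is self-contained. For the real file, `parsefile` would be used instead of `parse`.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Twig;

# One set of arrays per CONFORMATION value (illustrative structure).
my %collected = (
    I => { fee => [], group_id => [] },
    O => { fee => [], group_id => [] },
);

# Called once for every complete <List> element.
sub process_list {
    my ( $twig, $list ) = @_;
    my $bucket = $collected{ $list->att('CONFORMATION') // '' }
        or return;    # ignore values other than I and O
    push @{ $bucket->{fee} }, $list->att('FEE');
    push @{ $bucket->{group_id} },
        map { $_->att('GROUP_ID') } $list->descendants('PRIMARY');
    $twig->purge;     # release what has been parsed so far to keep memory low
}

# Hypothetical <ROOT> wrapper around two entries from the question's data.
my $xml = <<'XML';
<ROOT>
 <List NAME="ANDREW" ENROLED="2" FEE="640" CONFORMATION="I">
  <DATA><HOUSE>
   <PRIMARY GROUP_ID="37496" SECTION="A"/>
   <PRIMARY GROUP_ID="37496" SECTION="B"/>
  </HOUSE></DATA>
 </List>
 <List NAME="SAM" ENROLED="4" FEE="640" CONFORMATION="O">
  <DATA><HOUSE>
   <PRIMARY GROUP_ID="36816" SECTION="A"/>
   <PRIMARY GROUP_ID="36816" SECTION="B"/>
  </HOUSE></DATA>
 </List>
</ROOT>
XML

XML::Twig->new( twig_handlers => { List => \&process_list } )->parse($xml);

for my $conformation ( sort keys %collected ) {
    print "$conformation fees: @{ $collected{$conformation}{fee} }\n";
    print "$conformation group ids: @{ $collected{$conformation}{group_id} }\n";
}
```

Because each 'List' element is handled as a whole, the fee and its group ids are collected together, which avoids the ordering problem of running separate handlers for 'List' and 'PRIMARY'.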
