I have a huge XML file; only part of it is pasted here:
<List NAME="ANDREW" ENROLED="2" FEE="640" CONFORMATION="I">
<DATA>
<HOUSE>
<PRIMARY GROUP_ID="37496" SECTION="A"/>
<PRIMARY GROUP_ID="37496" SECTION="B"/>
</HOUSE>
</DATA>
</List>
<List NAME="SAM" ENROLED="4" FEE="640" CONFORMATION="O">
<DATA>
<HOUSE>
<PRIMARY GROUP_ID="36816" SECTION="A"/>
<PRIMARY GROUP_ID="36816" SECTION="B"/>
</HOUSE>
</DATA>
</List>
<List NAME="MATHEW" ENROLED="3" FEE="467" CONFORMATION="I">
<DATA>
<HOUSE>
<PRIMARY GROUP_ID="37436" SECTION="A"/>
<PRIMARY GROUP_ID="37436" SECTION="B"/>
</HOUSE>
</DATA>
</List>
<List NAME="RAY" ENROLED="1" FEE="982" CONFORMATION="O">
<DATA>
<HOUSE>
<PRIMARY GROUP_ID="36892" SECTION="A"/>
<PRIMARY GROUP_ID="36892" SECTION="B"/>
</HOUSE>
</DATA>
</List>
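I am using XML::Twig.
I have to check whether "CONFORMATION" is "I", then get "FEE" and "GROUP_ID" and store them in separate arrays. Likewise, if "CONFORMATION" is "O", I need to get "FEE" and "GROUP_ID" and store them in a different set of arrays.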
use XML::Twig;
my $filename = 'report2.txt';
open( $fh, '>', $filename );
my $twig = new XML::Twig(
    twig_roots => {
        "List"                    => \&add,
        "List/DATA/HOUSE/PRIMARY" => \&update
    }
);
$twig->parsefile("file.xml");
#$twig->print;
sub add {
    my ( $twig, $add ) = @_;    # handlers params are always
    $cond  = $add->att('CONFORMATION');
    $cond2 = $add->att('FEE');
    if ( $cond == 'I' ) {
        sub update {
            my ( $twig, $update ) = @_;
            $check = $update->att('GROUP_ID');
            print $fh " GROUP_ID :$check ";
        }
    } elsif ( $cond == 'O' ) {
        sub update {
            my ( $twig, $update ) = @_;
            $check = $update->att('GROUP_ID');
            print $fh " GROUP_ID :$check ";
        }
        print $fh "CONFORMATION=$cond \n GROUP_ID : $cond2";
    }
}
close $fh;
print "done\n";
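For now I am just trying to print the values to a log so that I can move on from there, but the output is a mess.
Please help, I am a beginner in Perl. My code as it stands prints everything, but not in order.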
OK, first of all: move the update sub out of the add sub. Nesting named subs like that is just nasty.
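To see why the nesting does not do what it looks like: a named sub in Perl is compiled once, wherever its definition appears, so an if/elsif wrapped around two definitions of the same name never chooses between them. The later definition simply replaces the earlier one. A minimal sketch in plain Perl (no XML::Twig needed; the names outer and inner are made up for illustration):

```perl
#!/usr/bin/perl
use strict;
use warnings;
no warnings 'redefine';    # hide the "Subroutine inner redefined" warning for this demo

sub outer {
    my ($flag) = @_;
    if ($flag) {
        sub inner { return 'first' }     # compiled at compile time, not when the branch runs
    }
    else {
        sub inner { return 'second' }    # replaces the definition above
    }
    return inner();
}

# Both calls reach the same sub, whatever $flag is.
print outer(1), "\n";    # prints "second"
print outer(0), "\n";    # prints "second"
```

With warnings left fully enabled, Perl reports the redefinition, which is one more reason to keep use strict and use warnings at the top of the script.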
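XML::Twig works by letting you register "handlers" that "fire" as the parser finishes each matching piece of the XML. That is a lightweight way to process large files, because a common problem with XML is how much memory it takes up once parsed.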
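As a rough illustration of the memory point, a handler can take what it needs from an element and then call purge on the twig, which discards what has been parsed so far. This is only a sketch: the ROOT wrapper element and the inline XML string are assumptions made so the example is self-contained, and the real file would be read with parsefile instead.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Twig;

# Assumed wrapper element <ROOT>; the question shows only a fragment of the file.
my $xml = <<'XML';
<ROOT>
<List NAME="ANDREW" FEE="640" CONFORMATION="I"/>
<List NAME="SAM" FEE="640" CONFORMATION="O"/>
</ROOT>
XML

my @names;

my $twig = XML::Twig->new(
    twig_handlers => {
        # Fires once for each completed <List> element.
        List => sub {
            my ( $t, $list ) = @_;
            push @names, $list->att('NAME');
            $t->purge;    # release the elements parsed so far
        },
    },
);
$twig->parse($xml);

print join( ',', @names ), "\n";    # prints "ANDREW,SAM"
```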
You are making this more complicated than it needs to be.
#!/usr/bin/perl
use strict;
use warnings;
use XML::Twig;
my $xml_file = 'file.xml';    # file name taken from the question
sub process_list {
    my ( $twig, $list ) = @_;
    my $conformation = $list -> att( 'CONFORMATION' );
    my $fee = $list -> att ( 'FEE' );
    # Walk List/DATA/HOUSE and visit each child (the PRIMARY elements).
    foreach my $primary ( $list -> first_child ( 'DATA' ) -> first_child ('HOUSE') -> children() )
    {
        my $group_id = $primary -> att ( 'GROUP_ID' );
        print "$conformation, $fee, $group_id\n";
        ### here you have the information you need to do the rest of your processing.
    }
}
# The handler must be a code reference (\&process_list), not a call (&process_list).
my $parser = XML::Twig -> new ( 'twig_handlers' => { 'List' => \&process_list } );
$parser -> parsefile ( $xml_file );
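The handler fires every time the parser sees a 'List' element, and inside it you can pull out the child elements and attributes you need. children gives you a list of elements to loop over.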
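To get from printing to the separate arrays the question asks for, the same handler can push into one pair of arrays per CONFORMATION value. The following is a sketch under a few assumptions: the array names and the collect_list sub are invented for illustration, the XML is inlined and wrapped in an assumed ROOT element so the example runs on its own, and every List is assumed to contain DATA/HOUSE. Note the use of eq: the == in the question compares numerically, and both 'I' and 'O' evaluate to 0 as numbers, so that test is true for either value.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Twig;

# Two records from the question, wrapped in an assumed <ROOT> element.
my $xml = <<'XML';
<ROOT>
<List NAME="ANDREW" ENROLED="2" FEE="640" CONFORMATION="I">
<DATA><HOUSE>
<PRIMARY GROUP_ID="37496" SECTION="A"/>
<PRIMARY GROUP_ID="37496" SECTION="B"/>
</HOUSE></DATA>
</List>
<List NAME="SAM" ENROLED="4" FEE="640" CONFORMATION="O">
<DATA><HOUSE>
<PRIMARY GROUP_ID="36816" SECTION="A"/>
<PRIMARY GROUP_ID="36816" SECTION="B"/>
</HOUSE></DATA>
</List>
</ROOT>
XML

# Hypothetical array names: one pair per CONFORMATION value.
my ( @fee_i, @group_i, @fee_o, @group_o );

sub collect_list {
    my ( $twig, $list ) = @_;
    my $conformation = $list->att('CONFORMATION');
    my $fee          = $list->att('FEE');

    # Assumes DATA and HOUSE are present; first_child returns undef otherwise.
    my @group_ids = map { $_->att('GROUP_ID') }
        $list->first_child('DATA')->first_child('HOUSE')->children('PRIMARY');

    if ( $conformation eq 'I' ) {    # eq compares strings; == would compare numbers
        push @fee_i,   $fee;
        push @group_i, @group_ids;
    }
    elsif ( $conformation eq 'O' ) {
        push @fee_o,   $fee;
        push @group_o, @group_ids;
    }
}

my $parser = XML::Twig->new( twig_handlers => { List => \&collect_list } );
$parser->parse($xml);

print "I: fees=@fee_i groups=@group_i\n";    # I: fees=640 groups=37496 37496
print "O: fees=@fee_o groups=@group_o\n";    # O: fees=640 groups=36816 36816
```

Because everything for one List is handled inside a single handler call, the values stay in document order, which avoids the out-of-order output that comes from having separate List and PRIMARY handlers write to the same file.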