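I want to find text enclosed in tildes (~) in an XML file and prefix it with a string, for example replacing ~it~ with ~T1it~, and then save the result to another file. I know how to get the text with XPath and how to do the replacement, but I don't know how to put the replaced text back in the right place and write it out.
Here is my input XML: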
<?xml version="1.0"?>
<chapter>
<section>
<para id="p001">this is<math>~rom~This is roman~normal~</math>para</para>
<para id="p002">this is<math>~rom~This is roman~normal~</math>para</para>
<para id="p003">this is<math>~rom~This is roman~normal~</math>para</para>
</section>
<abstract>
<para id="p004">This is <math>~rom~This is roman~normal~</math>para</para>
<para id="p005">this is<math>~rom~This is roman~normal~</math>para</para>
<para id="p006">this is<math>~rom~This is roman~normal~</math>para</para>
</abstract>
</chapter>
Here is my Perl script:
use strict;
use warnings;
use XML::LibXML;
#use XML::LibXML::Text;
use Cwd 'abs_path';
my $x_name=abs_path($ARGV[0]);
my $doc = XML::LibXML->load_xml(location => $x_name, no_blanks => 1);
my $xpath_expression='/chapter/section/para/math';
my @nodes = $doc->findnodes( $xpath_expression );
foreach my $node (@nodes) {
    my $content = $node->textContent;
    $content =~ s#~rom~#~T1rom~#sg;
    print $content, "\n";
}
Here is the output I want:
<?xml version="1.0"?>
<chapter>
<section>
<para id="p001">this is<math>~T1rom~This is roman~normal~</math>para</para>
<para id="p002">this is<math>~T1rom~This is roman~normal~</math>para</para>
<para id="p003">this is<math>~T1rom~This is roman~normal~</math>para</para>
</section>
<abstract>
<para id="p004">This is <math>~rom~This is roman~normal~</math>para</para>
<para id="p005">this is<math>~rom~This is roman~normal~</math>para</para>
<para id="p006">this is<math>~rom~This is roman~normal~</math>para</para>
</abstract>
</chapter>
One possibility: use the setData method of XML::LibXML::Text:
#!/usr/bin/perl
use warnings;
use strict;
use XML::LibXML;
my $x_name = $ARGV[0];
my $doc = XML::LibXML->load_xml(location => $x_name, no_blanks => 1);
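# Select the text nodes inside <math>, not the <math> elements themselves.
my $xpath_expression = '/chapter/section/para/math/text()';
my @nodes = $doc->findnodes( $xpath_expression );
for my $node (@nodes) {
    # data returns the unescaped text; toString would return the serialized
    # (entity-escaped) form, which setData would then escape a second time.
    my $content = $node->data;
    $content =~ s#~rom~#~T1rom~#g;
    $node->setData($content);
}
# Write the modified document to a new file next to the input.
$doc->toFile($x_name . '.new', 1);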
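The script in the question changes nothing in the document because textContent returns a copy of the string, so the substitution only affects that copy. Selecting the text nodes with text() and writing back through setData modifies the DOM itself. Here is a minimal, self-contained sketch of that idea. It assumes XML::LibXML is installed, and the shortened inline sample stands in for the real input file:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::LibXML;

# Inline sample for illustration only; the real input would come from a file.
my $xml = <<'XML';
<?xml version="1.0"?>
<chapter>
<section>
<para id="p001">this is<math>~rom~This is roman~normal~</math>para</para>
</section>
<abstract>
<para id="p004">This is <math>~rom~This is roman~normal~</math>para</para>
</abstract>
</chapter>
XML

my $doc = XML::LibXML->load_xml(string => $xml);

# The XPath ends in text(), so each match is an XML::LibXML::Text node.
# Only <math> elements under /chapter/section are selected.
for my $text ($doc->findnodes('/chapter/section/para/math/text()')) {
    my $content = $text->data;        # current character data of the node
    $content =~ s/~rom~/~T1rom~/g;    # prefix the marker with T1
    $text->setData($content);         # write the new text back into the DOM
}

# Serialize the modified document to STDOUT.
print $doc->toString;
```

In the printed document, the `<math>` under `<section>` starts with `~T1rom~`, while the one under `<abstract>` is left as it was, because the XPath never selects it.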