Perl - regex does not distinguish case



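I am trying to find some special characters, marked as "\v{G}" or "\v{g}", in sentences; replace them with "Ǧ" or "ǧ"; and save the corrected sentences in a new HTML file.

My regex (.*)\\v\{(\w)\}(.*) finds the character to replace, but I cannot replace the character according to its case. The resulting file contains: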

This is a sentence ǧ with a upper case G.
This is a sentence ǧ with a lower case g. 

instead of:

This is a sentence Ǧ with a upper case G.
This is a sentence ǧ with a lower case g.

MWE

The HTML input file contains:

This is a sentence \v{G} with a upper case G.
This is a sentence \v{g} with a lower case g.

The Perl file contains:

use strict;
use warnings;
use feature 'say';
# Define variables
my ($inputfile, $outputfile, $inputone, $inputtwo, $part1, $specialcharacter, $part2);
# Initialize variables
$inputfile = "TestFile.html";
$outputfile = 'Results.html';
# Open output file
open(my $ofh, '>:encoding(UTF-8)', "$outputfile");
# Open input file
open(my $ifh, '<:encoding(UTF-8)', "$inputfile");
# Read input file
while(<$ifh>) {
    # Analyse _temp.html file to identify special characters
    ($part1, $specialcharacter, $part2) = ($_ =~ /(.*)\\v\{(\w)\}(.*)/);
    if ($specialcharacter == "g") {
        $specialcharacter = "&#487";
    }elsif ($specialcharacter == "G") {
        $specialcharacter = "&#486";# PROBLEM
    }
    say $ofh "\t\t<p>$part1$specialcharacter$part2";
}
# Close input and output files
close $ifh;
close $ofh;

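As mentioned in the comments, == is the wrong operator. You should use eq to compare non-numeric scalars.

Another approach is to create a kind of dictionary, a lookup table, and look up your special characters in it.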
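
A minimal sketch of why the original test always took the first branch: under ==, both operands are converted to numbers, and non-numeric strings such as "g" and "G" both become 0. The variable names here are illustrative only, and the `no warnings 'numeric'` line is there just to keep the demonstration quiet.

```perl
use strict;
use warnings;

my $char = "G";
my ($numeric_match, $string_match_lower, $string_match_upper);

{
    # Silence the "isn't numeric" warning for this demonstration only.
    no warnings 'numeric';
    # Both operands numify to 0, so this is true for any pair of letters.
    $numeric_match = ($char == "g") ? 1 : 0;
}

# eq compares the strings themselves, so case is respected.
$string_match_lower = ($char eq "g") ? 1 : 0;
$string_match_upper = ($char eq "G") ? 1 : 0;

print "==  'G' vs 'g': $numeric_match\n";
print "eq  'G' vs 'g': $string_match_lower\n";
print "eq  'G' vs 'G': $string_match_upper\n";
```

With `use warnings` left fully enabled, Perl reports the non-numeric argument, which is a useful hint that the wrong operator is in use.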

# A map between the special characters and the HTML code you want in its place.
# Fill it with more if you've got them.
my %SpecialMap = (
    'g' => '&#487;',
    'G' => '&#486;',
);
# Read input file
while(<$ifh>) {
    # Loop for as long as \v{character} is found in $_
    while(/\\v\{(\w)\}/) {
        # Look up the character in the dictionary.
        # Fallback if it's not in the map: use the character as-is instead.
        my $ch = $SpecialMap{$1} || $1;
        # Rebuild $_ from the text before the match, the replacement, and the text after it
        $_ = $` . $ch . $';
    }
    # Print the result
    print $ofh $_;
}

For the input

Both \v{g} and \v{G} in here.
This is a sentence \v{g} with a lower case g.
This is a sentence \v{H} with a upper case H which is not in the map.
This contains nothing special.

it will produce the following output:

Both &#487; and &#486; in here.
This is a sentence &#487; with a lower case g.
This is a sentence H with a upper case H which is not in the map.
This contains nothing special.

Inspired by Polar Bear's comment, you can use s///ge to execute a mapping function and get the same result:

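my %SpecialMap = (
    'g' => '&#487;',
    'G' => '&#486;',
);
sub mapfunc {
    # Take the captured character as an argument instead of relying on the global $1
    my ($ch) = @_;
    return $SpecialMap{$ch} || $ch;
}
# Read input file
while(<$ifh>) {
    # /g substitutes all matches on the line
    # /e evaluates mapfunc($1) for each match and uses the result as the replacement
    s/\\v\{(\w)\}/mapfunc($1)/ge;
    print $ofh $_;
}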
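
To try the substitution without creating any files, the sketch below applies the same idea to an in-memory list of lines. It assumes the markers are written as \v{X}; the helper name replace_markers and the sample lines are hypothetical, chosen only for illustration.

```perl
use strict;
use warnings;

my %SpecialMap = (
    'g' => '&#487;',
    'G' => '&#486;',
);

# Hypothetical helper: return a copy of the line with every \v{X} marker replaced.
sub replace_markers {
    my ($line) = @_;
    # Hash keys are case-sensitive, so 'g' and 'G' map to different entities.
    # Characters missing from the map are kept as-is.
    $line =~ s/\\v\{(\w)\}/exists $SpecialMap{$1} ? $SpecialMap{$1} : $1/ge;
    return $line;
}

# Single-quoted strings keep the backslash in \v literally.
my @lines = (
    'Both \v{g} and \v{G} in here.',
    'This is a sentence \v{H} with a upper case H which is not in the map.',
    'This contains nothing special.',
);

print replace_markers($_), "\n" for @lines;
```

Using exists for the fallback, rather than ||, also keeps a mapped value that happens to be false (such as "0"), although that does not matter for the two entries shown here.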
