How to get the number of equal characters at the beginning of two strings

I have two strings, e.g. $must and $is. They should be identical from the start. If they are not, I want to know where the first difference is.

Example:

my $must = "abc;def;ghi";
my $is   = "abc;deX;ghi";

The "X" at position 6 does not match, so 6 is the result I need.

So I need something like this:

my $count = count_equal_chars($is, $must);

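where the result is "6", because characters 0 to 5 are equal. (It would not matter if it were "5", since adding a ++ is no problem.)

My code so far:

Edit: added a workaround.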

#!/usr/bin/perl
use strict;
use warnings;
use utf8;
my $head_spec = "company;customer;article;price"; # specified headline
my $count = 0;                                    # row counter
while (<DATA>) {
    s/[\r\n]//g;      # Data comes originally from Excel ...
    if (!$count) {
        # Headline:
        ## -> error message without position:
        ##print "error in headline\n" unless ($head_spec eq $_);
        ## -> write out error message with position:
        next if ($head_spec eq $_);
        # Initialize char arrays and counter
        my @spec = split //, $head_spec; # Specified headline as character array
        my @is   = split //, $_;         # Headline as read, as character array
        my $err_pos = 0;                 # counter - current position
        # Find out the position:
        for (@spec) {
            # stop at the first mismatch or when the line read is exhausted
            last unless $err_pos < @is && $is[$err_pos] eq $_;
            $err_pos++;
        }
        # Write out error message
        print "error in headline at position: $err_pos\n";
    }
    else {
        # Values
        print "process line $count, values: $_\n";
    }
}
continue { $count++; }
__DATA__
company;custXomer;article;price
Ser;0815;4711;3.99
Ser;0816;4712;4.85

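Background:
There is a .csv file with a very long header (>1000 characters). The header is specified. If it contains an error, the file is faulty and must be edited by the user. It is therefore useful to tell the user where the error is, so that they do not have to compare the whole line.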

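With the XOR operator, ^, you can find the error position. Wherever the two strings have the same character at the same position, string XOR yields "\0".

$-[0] is the start offset of the last successful match (here the match of ($must ^ $is) =~ /[^\0]/).

You can find the position like this: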

#!/usr/bin/perl
use strict;
use warnings;
my $must = "company;customer;article;price";
my $is   = "company;custXomer;article;price";

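# XOR gives "\0" for equal characters, so look for the first non-"\0"
if ( ($must ^ $is) =~ /[^\0]/ ) {
    print "Error position is ", $-[0], "\n";  # position of first non-matching char
}
else {
    print "No difference\n";  # identical strings: the regex does not match
}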

Prints: Error position is 12
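
The XOR idea can also be wrapped into the count_equal_chars function asked for in the question. This is only a minimal sketch: it assumes both strings contain only characters up to code point 0xFF (string XOR is not meant for wider characters) and that identical strings should return their full length. Neither assumption comes from the original post.

```perl
use strict;
use warnings;

# Hypothetical helper, named after the function requested in the question.
# Returns the number of leading characters that $is and $must share.
sub count_equal_chars {
    my ($is, $must) = @_;
    # Interpolating forces string XOR even if a value was used as a number.
    # Equal characters XOR to "\0"; the shorter string is padded with "\0".
    my $diff = "$is" ^ "$must";
    # The leading run of "\0" corresponds to the leading run of equal characters.
    my ($same) = $diff =~ /^(\0*)/;
    return length $same;
}

print count_equal_chars("abc;deX;ghi", "abc;def;ghi"), "\n";
```

If one string is a prefix of the other, the sketch returns the length of the shorter one, provided the longer string has no literal NUL character at that position.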

We can compare character by character and also take the string lengths into account for edge cases:

use strict;
use warnings;
use List::Util qw<min max>;
my $must = "company;customer;article;price";
my $is   = "company;custXomer;article;price";
# to-be reported position
my $pos = 0;
# get minimum and maximum of the lengths
my @lengths = map length, ($must, $is);
my $min_length = min @lengths;
my $max_length = max @lengths;
# increment till an inequality occurs or a string is consumed fully
++$pos until substr($must, $pos, 1) ne substr($is, $pos, 1) || $pos == $min_length;
# report the result
print $pos == $min_length ? ($pos < $max_length ? "missing cols" : "no diff") : $pos;

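If the position ends up equal to the minimum length, there are two possibilities: either the strings are completely equal, or one of them is longer, so we check against the maximum length. Otherwise, the position is reported as is.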

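The following code

splits the strings into two character arrays, @must and @is
compares the lengths of the arrays and warns if they differ
then compares the arrays up to the first mismatch
prints an error if a mismatch was found

use strict;
use warnings;
use feature 'say';

my $must = "abc;def;ghi";
my $is   = "abc;deX;ghi";

my @must = split //, $must;
my @is   = split //, $is;

warn "Warning: lengths differ\n"
    unless @must == @is;

# Index of the first mismatch; stays undef if none is found
my $pos;
for my $i ( 0 .. $#must ) {
    # A missing character in a shorter @is also counts as a mismatch
    next if defined $is[$i] && $must[$i] eq $is[$i];
    $pos = $i;
    last;
}

say "Error: The strings differ at position $pos"
    if defined $pos;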

Output

Error: The strings differ at position 6
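
For a header of more than 1000 characters, a bare position may still be hard to locate by eye. One possible way to present it is to print a short excerpt of both strings around the position, with a caret under the first differing character. The helper name show_mismatch and the default context width of 10 characters are assumptions made for this sketch, not part of any answer above.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a three-line excerpt around position $pos.
# $width is the number of context characters on each side (assumed default: 10).
sub show_mismatch {
    my ($must, $is, $pos, $width) = @_;
    $width //= 10;
    my $start = $pos > $width ? $pos - $width : 0;
    my $len   = $pos - $start + $width + 1;
    # Pad with spaces so substr stays in range if a string is too short
    my $pad = ' ' x ($start + $len);
    my $exp = substr($must . $pad, $start, $len);
    my $got = substr($is   . $pad, $start, $len);
    # Both labels are 10 characters wide, so the caret lines up with $pos
    return "expected: $exp\n"
         . "found:    $got\n"
         . ' ' x (10 + $pos - $start) . "^\n";
}

print show_mismatch("company;customer;article;price",
                    "company;custXomer;article;price", 12);
```

The position passed in can come from any of the approaches shown above.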
