Parameterized sort of a list of hashes



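My goal is to write a subroutine that takes

  1. an array of hashes
  2. a list containing the sort order

To be clear: the keys could be anything. My example is only for illustration.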


Given an array that lists the keys in the desired sort order

my @aSortOrder = ( 'DELTA1_2', 'SET1', 'SET2' );

my idea is to build a string

$a->{DELTA1_2} <=> $b->{DELTA1_2} or $a->{SET1} <=> $b->{SET1} or $a->{SET2} <=> $b->{SET2}

and then execute it with eval.

Here is my code

use Data::Dumper;

my $paRecords = [
    { 'SET1' => 48265, 'DELTA1_2' => -1,  'SET2' => 48264 },
    { 'SET1' => 8328,  'DELTA1_2' => -29, 'SET2' => 8299 },
    { 'SET1' => 20,    'DELTA1_2' => 0,   'SET2' => 0 },
    { 'SET1' => 10,    'DELTA1_2' => 0,   'SET2' => 0 }
];
my @aSortOrder = ( 'DELTA1_2', 'SET1', 'SET2' );

# Build the comparison expression as a string, one term per key.
my $pStr = '';
foreach ( @aSortOrder ) {
    $pStr = $pStr . ' or $a->{' . $_ . '} <=> $b->{' . $_ . '}';
}
$pStr =~ s/^\s*or\s*//;    # strip the leading " or "
my @aSorted = sort { eval "$pStr"; } @$paRecords;
print Dumper \@aSorted;

Output

$VAR1 = [
{
'SET1' => 8328,
'SET2' => 8299,
'DELTA1_2' => -29
},
{
'SET1' => 48265,
'SET2' => 48264,
'DELTA1_2' => -1
},
{
'SET2' => 0,
'DELTA1_2' => 0,
'SET1' => 10
},
{
'SET2' => 0,
'DELTA1_2' => 0,
'SET1' => 20
}
];

I suspect this is far from an ideal way to solve the problem, so any pointers on how to do it better would be a great help.

Just create a sub that does the comparison.

sub custom_cmp {
    my $keys = shift;    # reference to the array of sort keys
    for my $key (@$keys) {
        my $cmp = $_[0]{$key} <=> $_[1]{$key};
        return $cmp if $cmp;    # the first non-zero result decides
    }
    return 0;
}

my @aSorted = sort { custom_cmp(\@aSortOrder, $a, $b) } @$paRecords;

The above makes two sub calls per comparison. If we generate the comparison function, we can reduce that to one.

sub make_custom_cmp {
    my @keys = @_;
    return sub($$) {
        for my $key (@keys) {
            my $cmp = $_[0]{$key} <=> $_[1]{$key};
            return $cmp if $cmp;
        }
        return 0;
    };
}

my $cmp = make_custom_cmp(@aSortOrder);
my @aSorted = sort $cmp @$paRecords;

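We can go one step further and flatten the loop through code generation. This is what a "proper" eval-based solution would look like. However, this level of optimization is rarely needed.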

sub make_custom_cmp {
    my @keys = @_;
    my @cmps;
    for my $i (0..$#keys) {
        # Sigils are escaped so that the generated code looks up
        # $keys[N] at run time through the closure over @keys.
        push @cmps, "\$_[0]{\$keys[$i]} <=> \$_[1]{\$keys[$i]}";
    }
    return eval("sub(\$\$) { ".( join(" || ", @cmps) )."}");
}

my $cmp = make_custom_cmp(@aSortOrder);
my @aSorted = sort $cmp @$paRecords;

In fact, the following is probably the fastest solution:

my @aSorted =
map $paRecords->[ unpack('N', substr($_, -4))-0x7FFFFFFF ],
sort
map pack('N*', map $_+0x7FFFFFFF, @{ $paRecords->[$_] }{@aSortOrder}, $_),
0..$#$paRecords;
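
The one-liner above is dense, so here is the same idea unpacked into separate steps. This is only a sketch: it assumes every value is an integer that, once the offset is added, fits in an unsigned 32-bit field, and the names $offset, @packed and @sorted_packed are illustrative rather than taken from the original.

```perl
use strict;
use warnings;

my $records = [
    { SET1 => 48265, DELTA1_2 => -1,  SET2 => 48264 },
    { SET1 => 8328,  DELTA1_2 => -29, SET2 => 8299  },
    { SET1 => 20,    DELTA1_2 => 0,   SET2 => 0     },
    { SET1 => 10,    DELTA1_2 => 0,   SET2 => 0     },
];
my @sort_order = qw( DELTA1_2 SET1 SET2 );

# Assumed offset that shifts negative values into the unsigned range.
my $offset = 0x7FFFFFFF;

# 1. Encode each record as a byte string: its key values in sort order,
#    followed by its own index, each as a big-endian 32-bit integer.
my @packed = map {
    my $i = $_;
    pack( 'N*', map( $_ + $offset, @{ $records->[$i] }{@sort_order}, $i ) );
} 0 .. $#$records;

# 2. Sort the strings with the default string comparison. Big-endian
#    fields of equal width compare in the same order as the numbers.
my @sorted_packed = sort @packed;

# 3. Decode the trailing index and fetch the original record.
my @sorted = map {
    my $index = unpack( 'N', substr( $_, -4 ) ) - $offset;
    $records->[$index];
} @sorted_packed;

print join( ',', @{$_}{@sort_order} ), "\n" for @sorted;
```

Because the comparison is done by the built-in string sort, no Perl-level comparison code runs per pair, which is where the speed would come from. The price is the restriction on the value range noted above.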

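The block passed to sort may contain any amount of code. It only needs to evaluate to a negative number, zero, or a positive number, according to whether $a should be considered less than, equal to, or greater than $b.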
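
As a small illustration of that point, here is a sketch of a sort block containing more than one statement. The records and the keys name and size are made up for the example.

```perl
use strict;
use warnings;

# Hypothetical data, for illustration only.
my @records = (
    { name => 'b', size => 2 },
    { name => 'a', size => 2 },
    { name => 'c', size => 1 },
);

my @sorted = sort {
    # Any statements are allowed here; only the final value matters.
    my $by_size = $a->{size} <=> $b->{size};    # -1, 0 or 1
    $by_size != 0 ? $by_size : $a->{name} cmp $b->{name};
} @records;

print join( ' ', map( "$_->{name}:$_->{size}", @sorted ) ), "\n";
```

The block's last evaluated expression is its result, so no explicit return is needed.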

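I agree with your decision to bundle this into a subroutine, so I have written sort_hashes_by_keys, which expects a reference to the array of hashes to be sorted and a reference to an array of key strings. It returns the list of hashes sorted according to the list of keys.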

use strict;
use warnings 'all';
use Data::Dump 'dd';

my $records =  [
    { SET1 => 48265, DELTA1_2 => -1,  SET2 => 48264 },
    { SET1 => 8328,  DELTA1_2 => -29, SET2 => 8299  },
    { SET1 => 20,    DELTA1_2 => 0,   SET2 => 0     },
    { SET1 => 10,    DELTA1_2 => 0,   SET2 => 0     }
];
my @sort_order = qw/ DELTA1_2 SET1 SET2 /;

# Both arguments are passed by reference.
my @sorted = sort_hashes_by_keys( $records, \@sort_order );
dd \@sorted;

sub sort_hashes_by_keys {
    my ( $hashes, $order ) = @_;
    sort {
        my $cmp = 0;
        for my $key ( @$order ) {
            # Stop at the first key whose values differ.
            last if $cmp = $a->{$key} <=> $b->{$key};
        }
        $cmp;
    } @$hashes;
}

Output

[
{ DELTA1_2 => -29, SET1 => 8328, SET2 => 8299 },
{ DELTA1_2 => -1, SET1 => 48265, SET2 => 48264 },
{ DELTA1_2 => 0, SET1 => 10, SET2 => 0 },
{ DELTA1_2 => 0, SET1 => 20, SET2 => 0 },
]


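Note that I strongly advise against using both Hungarian notation and camel case when naming variables. Perl is not strictly typed, and it has sigils such as $, @ and % to indicate the type of each variable, so Hungarian notation is superfluous at best and adds distracting, irrelevant noise. Also, by convention, capital letters are reserved for module names and global variables, so local identifiers should be in "snake case", i.e. lower-case letters and underscores. Many non-native English speakers also find camel case hard to parse.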

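Well, you are quite right: using eval like that is a road to future pain.

The joy of "sort" is that you can define a sort subroutine in which $a and $b are implicitly defined, and you can use whatever logic you like to decide whether the comparison is positive, negative or "zero" (equal), just as <=> and cmp do.

The trick here is that "true" is anything non-zero, so you can test the result of <=> for truth to see whether the two values differ (4 <=> 4 is "false").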
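
A minimal sketch of that trick, using a cut-down copy of the question's data with just two keys: because <=> yields 0 for equal values, || falls through to the next comparison only when the earlier one is a tie.

```perl
use strict;
use warnings;

my @records = (
    { DELTA1_2 => 0,  SET1 => 20 },
    { DELTA1_2 => -1, SET1 => 48265 },
    { DELTA1_2 => 0,  SET1 => 10 },
);

# The second comparison is only evaluated when the first returns 0.
my @sorted = sort {
    $a->{DELTA1_2} <=> $b->{DELTA1_2}
        || $a->{SET1} <=> $b->{SET1}
} @records;

print join( ',', $_->{DELTA1_2}, $_->{SET1} ), "\n" for @sorted;
```

This hard-codes the keys, which is fine for a fixed order; a loop over a key list generalises the same short-circuit idea.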

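So, if you are only working with numbers (you would need to test for "alphanumeric" values and use cmp in some cases, but that does not seem to apply to your data):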

#!/usr/bin/env perl
use strict;
use warnings;

my $paRecords = [
    { 'SET1' => 48265, 'DELTA1_2' => -1,  'SET2' => 48264 },
    { 'SET1' => 8328,  'DELTA1_2' => -29, 'SET2' => 8299 },
    { 'SET1' => 20,    'DELTA1_2' => 0,   'SET2' => 0 },
    { 'SET1' => 10,    'DELTA1_2' => 0,   'SET2' => 0 }
];

#qw is 'quote-words' and just lets you space delimit terms.
#it's semantically the same as ( 'DELTA1_2', 'SET1', 'SET2' );
my @order = qw ( DELTA1_2 SET1 SET2 );

#note - needs to come after definition of `@order` but it can be re-written later as long as it's in scope.
#you can pass an order explicitly into the subroutine if you want though.
sub order_by {
    for my $key (@order) {
        #compare key
        my $result = $a->{$key} <=> $b->{$key};
        #return it and exit the loop if they aren't equal, otherwise
        #continue iterating sort terms.
        return $result if $result;
    }
    return 0;    #all keys were equal, therefore return zero.
}

print join (",", @order), "\n";
foreach my $record ( sort { order_by() } @$paRecords ) {
    #use hash slice to order output in 'sort order'.
    #optional, but hopefully clarifies what's going on.
    print join (",", @{$record}{@order}), "\n";
}

Given your data, this outputs:

DELTA1_2,SET1,SET2
-29,8328,8299
-1,48265,48264
0,10,0
0,20,0

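Note that I chose to use a hash slice for the output, because hashes are otherwise unordered, so your Dumper output would be inconsistent (the fields would appear in random order).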
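
For data that mixes numbers and strings, one possible approach is to pick the operator per pair of values. The sketch below assumes that Scalar::Util's looks_like_number is an acceptable test for "numeric"; smart_cmp and order_by_keys are hypothetical names, and the NAME key and its values are made up for the example.

```perl
use strict;
use warnings;
use Scalar::Util 'looks_like_number';

# Hypothetical helper: compare numerically when both values look like
# numbers, and as strings otherwise.
sub smart_cmp {
    my ( $x, $y ) = @_;
    return looks_like_number($x) && looks_like_number($y)
        ? $x <=> $y
        : $x cmp $y;
}

# Sort an array of hashes (by reference) on a list of keys (by reference).
sub order_by_keys {
    my ( $records, $keys ) = @_;
    return sort {
        my $result = 0;
        for my $key (@$keys) {
            # Stop at the first key whose values differ.
            last if $result = smart_cmp( $a->{$key}, $b->{$key} );
        }
        $result;
    } @$records;
}

my @records = (
    { NAME => 'beta',  SET1 => 10 },
    { NAME => 'alpha', SET1 => 9 },
    { NAME => 'alpha', SET1 => 10 },
);
my @sorted = order_by_keys( \@records, [qw( NAME SET1 )] );
print join( ',', @{$_}{qw( NAME SET1 )} ), "\n" for @sorted;
```

Note that a column containing both numeric and non-numeric values would be compared inconsistently from pair to pair, so this only makes sense when each key holds values of one kind.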

If you need the sort order to be more dynamic, you can pass it into the sort sub:

#!/usr/bin/env perl
use strict;
use warnings;

sub order_by {
    for my $key (@_) {
        #compare key
        my $result = $a->{$key} <=> $b->{$key};
        #return it and exit the loop if they aren't equal, otherwise
        #continue iterating sort terms.
        return $result if $result;
    }
    return 0;    #all keys were equal, therefore return zero.
}

my $paRecords = [
    { 'SET1' => 48265, 'DELTA1_2' => -1,  'SET2' => 48264 },
    { 'SET1' => 8328,  'DELTA1_2' => -29, 'SET2' => 8299 },
    { 'SET1' => 20,    'DELTA1_2' => 0,   'SET2' => 0 },
    { 'SET1' => 10,    'DELTA1_2' => 0,   'SET2' => 0 }
];

#qw is 'quote-words' and just lets you space delimit terms.
#it's semantically the same as ( 'DELTA1_2', 'SET1', 'SET2' );
my @order = qw ( DELTA1_2 SET1 SET2 );

print join( ",", @order ), "\n";
foreach my $record ( sort { order_by( @order ) } @$paRecords ) {
    #use hash slice to order output in 'sort order'.
    #optional, but hopefully clarifies what's going on.
    print join( ",", @{$record}{@order} ), "\n";
}
