Search for similar objects in an array



Given an array of arrays like this:

$array = array(
    0 => array(
        0 => 35,
        1 => 30,
        2 => 39
    ),
    1 => array(
        0 => 20,
        1 => 12,
        2 => 5
    ),
    ...
    n => array(
        0 => 10,
        1 => 15,
        2 => 7
    ),
);

I need to find the entry in the array that is closest to the given parameters:

function find($a, $b, $c) {
    // return the entry closest to the input
}

By "closest entry" I mean the entry whose values are nearest to the values given as input. For example, passing (19, 13, 3) should return $array[1].

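The way I currently do the calculation is to loop over the whole array, keeping a $distance variable that starts at -1 and a temporary $result variable. For each element I compute the distance

$dist = abs($subarray[0] - $a) + abs($subarray[1] - $b) + abs($subarray[2] - $c);

If $distance is still -1, or the computed distance is lower than $distance (which is declared outside the loop), I assign the new distance to $distance and save the corresponding array in $result. At the end of the loop, $result holds the value I need.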

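In addition, one of the values can be empty: for example, (19, 13, false) should still return $array[1], and the calculation should then ignore the missing parameter. In that case the distance is computed as

$dist = abs($subarray[0] - $a) + abs($subarray[1] - $b);

ignoring the values of $subarray[2] and $c.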
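The scan described above can be sketched roughly as follows. This is only a minimal sketch under assumptions, not the actual code in question: the name `findClosest` and its signature are hypothetical, a strict `false` marks a parameter to ignore, `null` is used instead of -1 as the "no distance yet" marker, and the three rows are sample data.

```php
<?php
// Hypothetical helper: returns the row of $rows closest to $target,
// measured as the sum of absolute differences. Target values that are
// exactly false are treated as missing and their column is skipped.
function findClosest(array $rows, array $target): ?array
{
    $distance = null; // best distance so far (null = nothing seen yet)
    $result = null;   // row that produced $distance

    foreach ($rows as $row) {
        $dist = 0;
        foreach ($target as $i => $value) {
            if ($value === false) {
                continue; // missing parameter: ignore this column
            }
            $dist += abs($row[$i] - $value);
        }
        if ($distance === null || $dist < $distance) {
            $distance = $dist;
            $result = $row;
        }
    }
    return $result;
}

// Sample data for illustration only.
$rows = array(
    array(35, 30, 39),
    array(20, 12, 5),
    array(10, 15, 7),
);

print_r(findClosest($rows, array(19, 13, 3)));     // the row 20, 12, 5
print_r(findClosest($rows, array(19, 13, false))); // also the row 20, 12, 5
```

This is still a full pass over the array, so it only illustrates the behaviour being asked about, not a faster alternative.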

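The problem is that, even though my code works, it takes too long to execute, because the array can easily grow to several hundred thousand elements. We are still talking about milliseconds, but for various reasons that is still unacceptable. Is there a more efficient way to do this search and save some time?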

A custom function. There may be a better way, but check this out:

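In short:

Loop over all items and compute, as a percentage, the difference between each number being checked ($mArray[0..2]) and the number you gave ($mNumbersToFind[0..2]). Add up the possibilities of all three numbers for each element, find the maximum, remember its position, and return that array.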

$array = array(
    array(
        0 => 13,
        1 => 15,
        2 => 4
    ),
    array(
        0 => 20,
        1 => 12,
        2 => 5
    ),
    array(
        0 => 13,
        1 => 3,
        2 => 15
    ),
);

$mNumbersToFind = array(13, 3, 3);
$mFoundArray = find($mNumbersToFind, $array);
echo "mFinalArray : <pre>";
print_r($mFoundArray);

function find($mNumbersToFind, $mArray){
    $mPossibilityMax = count($mNumbersToFind);  // number of values compared per element
    $mBiggestPossibilityElementPosition = 0;    // index of the best element so far
    $mBiggestPossibilityUntilNow = 0;           // best averaged score so far
    $mTempArray = array();                      // per-element details, used for debugging only

    foreach($mArray as $index => $current){
        $maxPossibility = 0;
        foreach($current as $subindex => $subcurrent){
            // Difference between the stored value and the searched value.
            $mTempArray[$index][$subindex]['value'] = $subcurrent - $mNumbersToFind[$subindex];
            // Express the match as a percentage (assumes $subcurrent is not 0).
            $percentChange = (1 - $mTempArray[$index][$subindex]['value'] / $subcurrent) * 100;
            $mTempArray[$index][$subindex]['possibility'] = $percentChange;
            // Accumulate the average over all compared values.
            $maxPossibility += $percentChange / $mPossibilityMax;
        }

        $mTempArray[$index]['final_possibility'] = $maxPossibility;
        if($maxPossibility > $mBiggestPossibilityUntilNow){
            $mBiggestPossibilityUntilNow = $maxPossibility;
            $mBiggestPossibilityElementPosition = $index;
        }
    }
    echo "mTempArray : <pre>"; // Remove this - it's just for debug
    print_r($mTempArray); // Remove this - it's just for debug
    return $mArray[$mBiggestPossibilityElementPosition];
}

Debug output ($mTempArray):

mTempArray :
Array
(
[0] => Array
(
[0] => Array
(
[value] => 0
[possibility] => 100
)
[1] => Array
(
[value] => 12
[possibility] => 20
)
[2] => Array
(
[value] => 1
[possibility] => 75
)
[final_possibility] => 65
)
[1] => Array
(
[0] => Array
(
[value] => 7
[possibility] => 65
)
[1] => Array
(
[value] => 9
[possibility] => 25
)
[2] => Array
(
[value] => 2
[possibility] => 60
)
[final_possibility] => 50
)
[2] => Array
(
[0] => Array
(
[value] => 0
[possibility] => 100
)
[1] => Array
(
[value] => 0
[possibility] => 100
)
[2] => Array
(
[value] => 12
[possibility] => 20
)
[final_possibility] => 73.333333333333
)
)

Final output:

mFinalArray : 
Array
(
[0] => 13
[1] => 3
[2] => 15
)

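I basically used the concept of proximity (the smallest total distance for each array) and returned that entry. The code is written so that it can still be improved in many places.

PS: I didn't use advanced functions or anything else, since you are concerned about performance. This is the simplest routine I could write in a short time.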

$array = array(
    0 => array(
        0 => 35,
        1 => 30,
        2 => 39
    ),
    1 => array(
        0 => 20,
        1 => 12,
        2 => 5
    ),
);
$user = array(19, 13, 3);

function find($referencial, $input){
    if (is_array($referencial)){
        $totalRef = count($referencial);
        $maxProximity = null; // smallest distance found so far and its index
        for ($i = 0; $i < $totalRef; $i++) {
            if (is_array($referencial[$i])){
                $totalSubRef = count($referencial[$i]);
                $proximity = 0; // total distance of this entry from the input
                for ($j = 0; $j < $totalSubRef; $j++) {
                    $proximity += abs($referencial[$i][$j] - $input[$j]);
                }
                // Keep this entry if it is the first one or closer than the best so far.
                if ($maxProximity === null || $maxProximity['distance'] > $proximity) {
                    $maxProximity = array('distance' => $proximity, 'index' => $i);
                }
            }
        }
        return $maxProximity;
    } else {
        exit('Unexpected referencial. Must be an array.');
    }
}

$found = find($array, $user);
print_r($found);
// Array ( [distance] => 4 [index] => 1 )
print_r($array[$found['index']]);
// Array ( [0] => 20 [1] => 12 [2] => 5 )
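One possible refinement of the routine above, sketched under assumptions rather than offered as a measured improvement: stop summing an entry as soon as its partial distance can no longer beat the best one found so far, and filter out ignored inputs (`false`) once, before the loop. The name `findWithEarlyExit` is hypothetical, and whether the early exit pays off for three-value entries would have to be benchmarked on real data.

```php
<?php
// Hypothetical variant of the proximity search with an early exit.
// Returns array('distance' => ..., 'index' => ...); index is -1 if $rows is empty.
function findWithEarlyExit(array $rows, array $input): array
{
    // Keep only the columns that were actually supplied (false = ignore).
    $active = array();
    foreach ($input as $j => $value) {
        if ($value !== false) {
            $active[$j] = $value;
        }
    }

    $best = array('distance' => PHP_INT_MAX, 'index' => -1);
    foreach ($rows as $i => $row) {
        $sum = 0;
        foreach ($active as $j => $value) {
            $sum += abs($row[$j] - $value);
            if ($sum >= $best['distance']) {
                break; // this entry can no longer beat the current best
            }
        }
        if ($sum < $best['distance']) {
            $best = array('distance' => $sum, 'index' => $i);
        }
    }
    return $best;
}

// Sample data for illustration only.
$rows = array(
    array(35, 30, 39),
    array(20, 12, 5),
);
print_r(findWithEarlyExit($rows, array(19, 13, 3)));
// Array ( [distance] => 4 [index] => 1 )
```

The search is still linear in the number of entries. The early exit only trims work inside each entry, so the gain grows with the number of values per entry.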
