Filter only duplicate URLs from a PHP array



Here is the array:

Array ( 
   [EM Debt] => http://globalevolution.gws.fcnws.com/fs_Overview.html?isin=LU0616502026&culture=en-GB 
   [EM Local Debt] => Will be launched shortly 
   [EM Blended Debt] => Will be launched shortly 
   [Frontier Markets] => http://globalevolution.gws.fcnws.com/fs_Overview.html?isin=LU0501220262 
   [Absolute Return Debt and FX] => Will be launched shortly 
   [Em Debt] => http://globalevolution.gws.fcnws.com/fs_Overview.html?isin=LU0501220262 
) 

If I use array_unique(), it also filters the repeated "Will be launched shortly" values out of the array.

I only want to filter out duplicate URLs, not the text values.
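To make the problem concrete, here is a minimal reproduction, assuming the printed array above corresponds to a PHP array like `$urls` below (the variable names are illustrative only). It shows which keys survive a plain array_unique() call.

```php
<?php
// Sample data mirroring the array printed in the question.
$urls = [
    'EM Debt'                     => 'http://globalevolution.gws.fcnws.com/fs_Overview.html?isin=LU0616502026&culture=en-GB',
    'EM Local Debt'               => 'Will be launched shortly',
    'EM Blended Debt'             => 'Will be launched shortly',
    'Frontier Markets'            => 'http://globalevolution.gws.fcnws.com/fs_Overview.html?isin=LU0501220262',
    'Absolute Return Debt and FX' => 'Will be launched shortly',
    'Em Debt'                     => 'http://globalevolution.gws.fcnws.com/fs_Overview.html?isin=LU0501220262',
];

// array_unique() keeps the first occurrence of every value, URL or not,
// so the repeated placeholder text is removed along with the repeated URL.
$unique = array_unique($urls);
print_r(array_keys($unique));
```

Only 'EM Debt', 'EM Local Debt' and 'Frontier Markets' remain, whereas the desired result would also keep 'EM Blended Debt' and 'Absolute Return Debt and FX'.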

Update:

I need the array order to stay the same and only filter out the duplicates.

Well, you can use array_filter:

$filtered = array_filter($urls, function ($url) {
    static $used = [];
    if (filter_var($url, FILTER_VALIDATE_URL)) {
        return isset($used[$url]) ? false : $used[$url] = true;
    }
    return true;
});

Here is a demo.
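One thing to be aware of with the snippet above: the `static $used` array lives as long as the closure object does, so if the same closure were reused on a second array it would still remember URLs from the first. A possible variant, sketched here under that assumption, captures the lookup array by reference inside a function so every call starts fresh. The function name `filterDuplicateUrls` and the example.com data are hypothetical, chosen only for illustration.

```php
<?php
/**
 * Hypothetical helper: drops repeated URLs, keeps every non-URL value,
 * and preserves the original keys and order.
 */
function filterDuplicateUrls(array $items): array
{
    $seen = []; // URLs already kept, stored as keys for fast isset() lookups

    return array_filter($items, function ($value) use (&$seen) {
        if (filter_var($value, FILTER_VALIDATE_URL) === false) {
            return true;        // not a URL: always keep
        }
        if (isset($seen[$value])) {
            return false;       // URL seen before: drop
        }
        $seen[$value] = true;   // first sighting: remember and keep
        return true;
    });
}

// Illustrative data only.
$items = [
    'a' => 'http://example.com/?id=1',
    'b' => 'Will be launched shortly',
    'c' => 'Will be launched shortly',
    'd' => 'http://example.com/?id=1',
];
print_r(filterDuplicateUrls($items));
```

Keys 'a', 'b' and 'c' are kept; 'd' is dropped as a repeated URL.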

Here is your answer:

<?php
// taking just example here, replace `$array` with yours
$array = ['http://globalevolution.gws.fcnws.com/fs_Overview.html?isin=LU0616502026&culture=en-GB', 'abc', 'abc', 'http://globalevolution.gws.fcnws.com/fs_Overview.html?isin=LU0616502026&culture=en-GB'];
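$url_array = [];
$string_array = []; // initialise so array_merge() below works even when there are no plain strings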
foreach($array as $ele) {
    if(strpos($ele, 'http://') !== false) {
        $url_array[] = $ele;
    } else {
        $string_array[] = $ele;
    }
}
$url_array = array_unique($url_array);
print_r(array_merge($string_array, $url_array));
?>

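You can get the result in a single pass over the array. Along the way you need an extra array that records which URLs have already been saved in the result.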

$saved_urls = [];
$result = [];
foreach($array as $k => $v)
{
    if('http://' == substr(trim($v), 0, 7) || 'https://' == substr(trim($v), 0, 8))
    {
        if(!isset($saved_urls[$v]))    // check whether this URL has already been saved
        {
            $result[$k] = $v;
            $saved_urls[$v] = 1;
        }
    }else
        $result[$k] = $v;
}

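If you want to modify the input array rather than generate a new filtered array, you can use strpos() to identify URLs, a lookup array to identify duplicate URLs, and unset() to modify the array.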

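  • strpos($v,'http')===0 not only requires http to be in the string, it requires it to be the first four characters of the string. To be clear, this also accommodates https. When simply checking for the existence or position of a substring, strstr() and substr() will always be less efficient than strpos(). (The second note on the PHP manual page for strstr() states the benefit of using strpos() when only checking for existence.)
  • Using iterated in_array() calls to check the $lookup array is less efficient than storing the duplicate URLs as keys in the lookup array. isset() will outperform in_array() every time. (reference link)
  • The OP's sample input does not indicate any monkey-wrenching values, that is, non-URLs that start with http or URLs that do not start with http. For that reason, strpos() is a suitable and lightweight function call. If troublesome URLs are possible, then Sevavietl's URL validation is a more reliable function call. (PHP manual link)
  • From my online performance tests, my answer is the fastest posted method that provides the desired output array.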
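As a small illustration of the first bullet, the sketch below contrasts the prefix check (`=== 0`) with the "found anywhere" check (`!== false`) used in some of the other answers. The sample strings are made up for the comparison and do not come from the question.

```php
<?php
// Made-up sample values for comparing the two checks.
$samples = [
    'http://example.com',
    'https://example.com',
    'See http://example.com',
    'Will be launched shortly',
];

$prefix   = []; // true when the string starts with "http"
$anywhere = []; // true when "http" occurs at any position

foreach ($samples as $s) {
    $prefix[]   = strpos($s, 'http') === 0;
    $anywhere[] = strpos($s, 'http') !== false;
}

var_export($prefix);
echo "\n";
var_export($anywhere);
```

The third sample contains a URL but does not start with one, so only the `!== false` check flags it.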

Code: (Demo)

$array=[
    'EM Debt'=>'http://globalevolution.gws.fcnws.com/fs_Overview.html?isin=LU0616502026&culture=en-GB',
    'EM Local Debt'=>'Will be launched shortly',
    'EM Blended Debt'=>'Will be launched shortly',
    'Frontier Markets'=>'http://globalevolution.gws.fcnws.com/fs_Overview.html?isin=LU0501220262',
    'Absolute Return Debt and FX'=>'Will be launched shortly',
    'Em Debt'=>'http://globalevolution.gws.fcnws.com/fs_Overview.html?isin=LU0501220262'
];
foreach($array as $k=>$v){
    if(isset($lookup[$v])){          // $v is a duplicate
        unset($array[$k]);           // remove it from $array
    }elseif(strpos($v,'http')===0){  // $v is a url (because starts with http or https)
        $lookup[$v]='';              // store $v in $lookup as a key to an empty string
    }
}
var_export($array);

Output:

array (
  'EM Debt' => 'http://globalevolution.gws.fcnws.com/fs_Overview.html?isin=LU0616502026&culture=en-GB',
  'EM Local Debt' => 'Will be launched shortly',
  'EM Blended Debt' => 'Will be launched shortly',
  'Frontier Markets' => 'http://globalevolution.gws.fcnws.com/fs_Overview.html?isin=LU0501220262',
  'Absolute Return Debt and FX' => 'Will be launched shortly',
)

Just for fun, a functional/unorthodox/convoluted approach could look like this (not recommended, purely a demonstration):

var_export(
    array_intersect_key(
        $array,                                    // use $array to preserve order
        array_merge(                               // combine filtered urls and unfiltered non-urls
            array_unique(                          // remove duplicates
                array_filter($array,function($v){  // generate array of urls
                    return strpos($v,'http')===0;
                })
            ),
            array_filter($array,function($v){  // generate array of non-urls
                return strpos($v,'http')!==0;
            })
        )
    )
);

Well, this is the answer I ended up with:

$urls = [
    'EM Debt' => 'http://globalevolution.gws.fcnws.com/fs_Overview.html?isin=LU0616502026&culture=en-GB',
    'EM Local Debt' => 'Will be launched shortly',
    'EM Blended Debt' => 'Will be launched shortly',
    'Frontier Markets' => 'http://globalevolution.gws.fcnws.com/fs_Overview.html?isin=LU0501220262',
    'Absolute Return Debt and FX' => 'Will be launched shortly',
    'Em Debt' => 'http://globalevolution.gws.fcnws.com/fs_Overview.html?isin=LU0501220262',
];
$url_array = [];
$string_array = []; // initialise both buckets before the loop
$keys = [];         // original key order, used to restore the order afterwards
foreach($urls as $title => $url) {
    if(strpos($url, 'http://') !== false) {
        $url_array[$title] = $url;
    } else {
        $string_array[$title] = $url;
    }
    $keys[] = $title;
}
$url_array = array_unique($url_array);
$urls = array_merge($url_array, $string_array);
$urls = array_sub_sort($urls, $keys);

Here is the code for the array sub-sort function.

function array_sub_sort(array $values, array $keys){
    $keys = array_flip($keys);
    return array_merge(array_intersect_key($keys, $values), array_intersect_key($values, $keys));
}
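To show what the helper does, here is a small usage sketch with made-up keys and values. The function is repeated only so the snippet runs on its own. It assumes string keys: array_merge() renumbers integer keys, so the trick would likely not carry over to numerically indexed arrays.

```php
<?php
// Same helper as above, repeated so this snippet is self-contained.
function array_sub_sort(array $values, array $keys){
    $keys = array_flip($keys);
    return array_merge(array_intersect_key($keys, $values), array_intersect_key($values, $keys));
}

// Made-up data: URLs were merged first, plain text last, so 'y' is out of place.
$merged   = ['x' => 'U1', 'z' => 'U2', 'y' => 'text'];
// Original key order; 'w' stands for an entry that was removed as a duplicate.
$original = ['x', 'y', 'z', 'w'];

$sorted = array_sub_sort($merged, $original);
var_export($sorted);
```

The result lists 'x', 'y', 'z' in their original order, and 'w' is simply skipped because it no longer exists in the merged array.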
