PHP - Finding a set of keywords in an article



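I have an array of keywords and an article. Taking performance and speed into account, what is the best way to find out whether those keywords appear in a set of articles?

Basically, a keyword consists of 3 or more words, but no more than 10. The code should check whether each keyword exists in the article and return only the keywords that were found.

Suppose this is the article: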

$articles = "Maybe it’s less true than it used to be that people are made of 
       place--that the same elements that form coal and clay and bogs and ice form 
       faces, voices and characters. I wrote my first collection of short stories, 
       The Bostons, in homage to this book, hoping, as did Joyce’s young Stephen 
       Dedalus, to encounter for the millionth time the reality of experience and to 
       forge in the smithy of my soul the uncreated conscience of some island-dwellers
       I knew.";

Keywords:

$keywords = "less true than, people are made, smithy of my soul, uncreated 
             conscience, this is a test string";

The output must be:

"less true than, people are made, smithy of my soul, uncreated conscience"

This is what I have written so far:

  $articles = mb_split(' +', $articles);
  foreach ($articles as $key => $word) {
      $articles[$key] = trim($word);
  }
  // Search for keywords
  $keywords = str_replace(' ', '', $keywords);
  $keywords = mb_split('[ ,]+', mb_strtolower($keywords, 'utf-8'));
  $result = implode(',', array_intersect($keywords, $articles));

However, it only works for single-word keywords. I don't know how to make it work for keywords made of several words.

strpos() is what you need. This works:

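$res = array();
foreach (explode(", ", $keywords) as $keyword) {
    // strpos() returns 0 for a match at the start of the string, so compare with false
    if (strpos($articles, $keyword) !== false) {
        $res[] = $keyword;
    }
}
// implode() takes the separator first; the reversed argument order was removed in PHP 8
$matched = implode(", ", $res);
var_dump($matched);
/** OUTPUT **/
// string 'less true than, people are made, smithy of my soul, uncreated conscience' (length=72)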
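
// Alternatively, match all keywords with a single regular expression
preg_match_all(
    '/' . implode('|', array_map('preg_quote', explode(', ', $keywords))) . '/',
    $articles,
    $found
);
// preg_match_all() returns a count; the matched strings are in $found[0]
$matches = array_unique($found[0]);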
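
One pitfall with a bare strpos() check is that it returns 0 when the keyword sits at the very start of the article, and 0 is treated as false in an if. The following is a minimal sketch of a stricter check. The function name findKeywords and the shortened sample text are illustrative assumptions, not part of the original answer.

```php
<?php
// Hypothetical helper (not from the original answers): returns the keywords
// that occur in $article, in the order they are listed in $keywords.
function findKeywords(string $article, array $keywords): array
{
    $found = [];
    foreach ($keywords as $keyword) {
        // strpos() returns int 0 for a match at position 0 and false for
        // no match, so the comparison has to be strict.
        if (strpos($article, $keyword) !== false) {
            $found[] = $keyword;
        }
    }
    return $found;
}

// Shortened sample text (an assumption for illustration).
$article  = "Maybe it's less true than it used to be that people are made of place.";
$keywords = ["Maybe it's", "less true than", "people are made", "this is a test string"];

echo implode(', ', findKeywords($article, $keywords)), PHP_EOL;
// prints: Maybe it's, less true than, people are made
```

With a loose `if (strpos(...))`, the first keyword would be dropped even though it is present. Note that strpos() is case-sensitive; stripos() is the case-insensitive variant.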

A regular expression can help you. As you can see here, this works. Could your problem be the line break inside the keyword string?

$articles = "Maybe it’s less true than it used to be that people are made of 
   place--that the same elements that form coal and clay and bogs and ice form 
   faces, voices and characters. I wrote my first collection of short stories, 
   The Bostons, in homage to this book, hoping, as did Joyce’s young Stephen 
   Dedalus, to encounter for the millionth time the reality of experience and to 
   forge in the smithy of my soul the uncreated conscience of some island-dwellers
   I knew.";
$keywords = "less true than, people are made, smithy of my soul, uncreated conscience, this is a test string";
$keywordsArray = explode(', ',$keywords);
$pattern = '/'.implode('|',$keywordsArray).'/';
preg_match_all($pattern,$articles,$matches);
var_dump($matches);
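
If the line break in the keyword string is indeed the problem, one option is to collapse whitespace on both sides before matching. The sketch below does that and also escapes the keywords with preg_quote(). The helper names normalizeSpaces and matchKeywords, the shortened article, and the choice to match case-insensitively are assumptions made for illustration; the code also assumes UTF-8 input and the mbstring extension.

```php
<?php
// Hypothetical helper: collapse every run of whitespace (including line
// breaks) into a single space so multi-line strings compare cleanly.
function normalizeSpaces(string $text): string
{
    return trim(preg_replace('/\s+/u', ' ', $text));
}

// Hypothetical helper: returns the keywords from the comma-separated
// $keywordList that occur in $article, lowercased, in order of appearance.
function matchKeywords(string $article, string $keywordList): array
{
    $article  = normalizeSpaces($article);
    $keywords = array_filter(
        array_map('normalizeSpaces', explode(',', $keywordList)),
        'strlen'
    );
    if ($keywords === []) {
        return [];
    }

    // preg_quote() escapes regex metacharacters in the keywords;
    // \b keeps "made" from matching inside a longer word such as "unmade".
    $alternatives = array_map(fn($k) => preg_quote($k, '/'), $keywords);
    $pattern = '/\b(?:' . implode('|', $alternatives) . ')\b/iu';

    preg_match_all($pattern, $article, $matches);

    return array_values(array_unique(array_map('mb_strtolower', $matches[0])));
}

// Shortened sample text with a line break inside a keyword on both sides.
$article = "Maybe it is less true than it used to be that people are made of
   place. I hoped to forge in the smithy of my soul the uncreated
   conscience of some island-dwellers I knew.";
$keywords = "less true than, people are made, smithy of my soul, uncreated
             conscience, this is a test string";

echo implode(', ', matchKeywords($article, $keywords)), PHP_EOL;
// prints: less true than, people are made, smithy of my soul, uncreated conscience
```

Building the pattern once and reusing it across many articles avoids rescanning each article per keyword. One caveat: preg_match_all() returns non-overlapping matches, so if two keywords overlap in the text, only the first one found is reported.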

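$articles = "Maybe it’s less true than it used to be that people are made of 
   place--that the same elements that form coal and clay and bogs and ice form 
   faces, voices and characters. I wrote my first collection of short stories, 
   The Bostons, in homage to this book, hoping, as did Joyce’s young Stephen 
   Dedalus, to encounter for the millionth time the reality of experience and to 
   forge in the smithy of my soul the uncreated conscience of some island-dwellers
   I knew.";

$keywords = "less true than, people are made, smithy of my soul, uncreated conscience, this is a test string";

// Split on ", " so the keywords carry no leading space
$keyword = explode(', ', $keywords);
$finalstring = '';

foreach ($keyword as $key => $value) {
    // Compare with false: strpos() returns 0 for a match at position 0
    if (strpos($articles, $value) !== false) {
        $finalstring .= $value . ',';
    }
}

echo $finalstring;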
