PHP Regex batch update



In short, here is my problem:

$text = 'Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industrys standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum.';
$text = preg_replace('#(?<!((alt|src)="))Lorem(?!(.*("|</a>)))#i', '<a href="Lorem" title="Lorem" style="color: inherit;">$0</a>', $text);
$text = preg_replace('#(?<!((alt|src)="))Ipsum(?!(.*("|</a>)))#i', '<a href="Ipsum" title="Ipsum" style="color: inherit;">$0</a>', $text);
echo $text;

"Lorem" is replaced, but "Ipsum" is not.

Output of the PHP above:

<a href="Lorem" title="Lorem" style="color: inherit;">Lorem</a> Ipsum is simply dummy text of the printing and typesetting industry. <a href="Lorem" title="Lorem" style="color: inherit;">Lorem</a> Ipsum has been the industrys standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing <a href="Lorem" title="Lorem" style="color: inherit;">Lorem</a> Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of <a href="Lorem" title="Lorem" style="color: inherit;">Lorem</a> <a href="Ipsum" title="Ipsum" style="color: inherit;">Ipsum</a>.

Why isn't "Ipsum" replaced?

Edit:

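If you comment out the first preg_replace line (the one for "Lorem"), the second preg_replace works fine. PHP Fiddle 1 (hit F9 to run)

Also, if you swap the positions of the two preg_replace calls, "Ipsum" gets replaced but "Lorem" does not.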

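So the second preg_replace only behaves as intended if neither word is already inside an anchor tag when it runs; otherwise the lookaround conditions come into play. The first call inserts " and </a> into $text, and the lookahead (?!(.*("|</a>))) then rejects every "Ipsum" that has either of them later on the same line. PHP Fiddle 3 (1)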
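To make the cause concrete, here is a minimal sketch on a shortened sample string (the sample text and the variable names $afterLorem, $ipsumMatches, $singlePass, and $count are mine, not from the question). It assumes the replacement keeps the matched word via $0. It counts how many "Ipsum" occurrences still pass the lookahead after the first call, then runs one combined pattern over the untouched text for comparison.

```php
<?php
// Shortened sample text; patterns and replacements follow the question.
$text = 'Lorem Ipsum is dummy text. Lorem Ipsum again.';

// First call: links every "Lorem", which inserts `"` and `</a>` into the string.
$afterLorem = preg_replace(
    '#(?<!((alt|src)="))Lorem(?!(.*("|</a>)))#i',
    '<a href="Lorem" title="Lorem" style="color: inherit;">$0</a>',
    $text
);

// Second pattern: count how many "Ipsum" occurrences still pass the lookahead.
$ipsumMatches = preg_match_all(
    '#(?<!((alt|src)="))Ipsum(?!(.*("|</a>)))#i',
    $afterLorem
);

// Single pass over the untouched text: there are no inserted quotes to trip over.
$singlePass = preg_replace(
    '#(?<!(alt|src)=")(Lorem|Ipsum)(?!(.*("|</a>)))#i',
    '<a href="$0" title="$0" style="color: inherit;">$0</a>',
    $text,
    -1,
    $count
);

echo "Ipsum matches after the first call: $ipsumMatches\n";
echo "Words linked in a single pass: $count\n";
```

In this sample only the final "Ipsum" survives the second pattern, because it is the only one with no quote or closing tag after it on the line, which mirrors the output shown in the question.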


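Update:

As the OP mentioned in the comments, the approach above breaks when the string $text already contains one of the target words inside a link, for example: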

 <a href="">test Lorem test</a>

In this case regex alone is, IMHO, not enough, and we need to do the following:

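  1. Check the string $text for any occurrence of an anchor tag <a>.
  2. Use an array $tempArr as temporary storage for the link elements.
  3. Replace each link element with placeholder text that carries a number as a unique ID, so that tempRep#0, tempRep#1, ... stand in for the link elements.
  4. Run the regex statement (2).
  5. Reverse step 3: replace tempRep#0, tempRep#1, ... with the entries of $tempArr, matching the number in each unique ID to the same array index (3).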

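The algorithm above could be implemented in JavaScript, since it needs some Document Object Model inspection, but as the OP said, JavaScript is not an option. We therefore load the string $text as HTML through the PHP Document Object Model and use the following PHP DOM members: getElementsByTagName(), getAttribute(), and textContent (alternatively, nodeValue).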
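As a quick illustration of those DOM calls in isolation, the following sketch reads the href, title, and text of each anchor in a short HTML fragment. The fragment and the $html and $found names are made up for the example, and the libxml error handling is my addition rather than part of the answer.

```php
<?php
// Short fragment with two anchors, in the style of the example text.
$html = 'See <a href="link1href" title="test1">test Ipsum Lorem test</a> '
      . 'and <a href="link2href" title="test2">test Lorem test</a>.';

// loadHTML() can emit warnings on imperfect markup; collecting them
// internally keeps the output clean for this sketch.
libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

$found = [];
foreach ($dom->getElementsByTagName('a') as $link) {
    $found[] = [
        'href'  => $link->getAttribute('href'),
        'title' => $link->getAttribute('title'),
        'text'  => $link->textContent, // nodeValue returns the same string for an element
    ];
}

print_r($found);
```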

So finally, we have the following:

PHP Fiddle 4 [final]

$text = 'Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industrys standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of <a href="link1href" title="test1">test Ipsum Lorem test</a> Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of <a href="link2href" title="test2">test Lorem test</a> Lorem Ipsum.';
$dom = new DOMDocument;
$dom->loadHTML($text);
$tempArr = array();
// Steps 1-2: collect every existing link element and rebuild its markup
$links = $dom->getElementsByTagName('a');
foreach ($links as $link) {
    $href = $link->getAttribute('href');
    $title = $link->getAttribute('title');
    $textCont = $link->textContent; // Alternatively, $link->nodeValue could be used too
    $linkElement = '<a href="' . $href . '" title="' . $title . '">' . $textCont . '</a>';
    $tempArr[] = $linkElement;
}
// Step 3: replace each link element with its placeholder tempRep#N
for($i=0; $i < count($tempArr); $i++){
    $text = str_replace($tempArr[$i], 'tempRep#' . $i, $text);
}
// Step 4: link the words that remain outside the placeholders; $0 is the matched word
$text = preg_replace('#(?<!(alt|src)=")(Lorem|Ipsum)(?!(("|</a>)))#i', '<a href="$0" title="$0" style="color: inherit;">$0</a>', $text);
// Step 5: put the original link elements back
for($i=0; $i < count($tempArr); $i++){
    $text = str_replace('tempRep#' . $i, $tempArr[$i], $text);
}
echo $text;
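One caveat with rebuilding each link from its href and title: the rebuilt markup only matches the original string when the anchor has exactly those two attributes, in that order. As a possible variation (not the answer's method), the sketch below stashes each anchor verbatim with preg_replace_callback instead of using the DOM. The function name linkWordsOutsideAnchors and the %%tempRepN%% placeholder format are hypothetical, the anchor pattern is deliberately simple and would not cope with nested or malformed markup, and the arrow function needs PHP 7.4 or later.

```php
<?php
// Hypothetical helper, not part of the original answer.
function linkWordsOutsideAnchors(string $text, array $words): string
{
    if ($words === []) {
        return $text; // nothing to link
    }

    $stash = [];

    // Steps 1-3: swap each existing <a>...</a> for a delimited unique ID,
    // keeping the anchor exactly as written.
    $text = preg_replace_callback(
        '#<a\b[^>]*>.*?</a>#is',
        function (array $m) use (&$stash): string {
            $stash[] = $m[0];
            return '%%tempRep' . (count($stash) - 1) . '%%';
        },
        $text
    );

    // Step 4: the combined pattern from the answer, built from the word list.
    $alternation = implode('|', array_map(fn($w) => preg_quote($w, '#'), $words));
    $text = preg_replace(
        '#(?<!(alt|src)=")(' . $alternation . ')(?!(("|</a>)))#i',
        '<a href="$0" title="$0" style="color: inherit;">$0</a>',
        $text
    );

    // Step 5: restore each stashed anchor by its numeric ID.
    return preg_replace_callback(
        '#%%tempRep(\d+)%%#',
        function (array $m) use ($stash): string {
            return $stash[(int) $m[1]];
        },
        $text
    );
}

echo linkWordsOutsideAnchors(
    'Lorem <a href="x" class="c">test Lorem test</a> Ipsum',
    ['Lorem', 'Ipsum']
), PHP_EOL;
```

The closing %% on each placeholder keeps one ID from being read as the start of another.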

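Notes:

  1. I found that only the lookahead condition in the second preg_replace call causes the error. In PHP Fiddle 5 I kept the lookbehind and removed just the lookahead, and, oddly enough, it still works fine.
  2. I have merged the two regex statements into one: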

    $text = preg_replace('#(?<!(alt|src)=")(Lorem|Ipsum)(?!(("|</a>)))#i', '<a href="$0" title="$0" style="color: inherit;">$0</a>', $text);
    
  3. This is why we use a unique ID for each replacement.
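One detail related to the unique IDs is worth watching: tempRep#1 is a prefix of tempRep#10, so restoring with one str_replace call per ID can clobber the longer placeholder once there are more than ten links. A minimal sketch of one way around this, assuming the same tempRep#N naming: build a map from placeholder to anchor and restore it with a single strtr() call, which tries the longest keys first. The eleven sample anchors and the names $map, $naive, and $restored are made up for illustration.

```php
<?php
// Hypothetical stash of 11 anchors, as step 2 would collect them.
$tempArr = [];
for ($i = 0; $i < 11; $i++) {
    $tempArr[] = '<a href="link' . $i . '">test ' . $i . '</a>';
}

// Text as it might look after step 3, containing placeholders 1 and 10.
$text = 'first tempRep#1 then tempRep#10 end';

// Restoring one ID at a time: tempRep#1 also matches inside tempRep#10.
$naive = $text;
for ($i = 0; $i < count($tempArr); $i++) {
    $naive = str_replace('tempRep#' . $i, $tempArr[$i], $naive);
}

// Map each unique ID to its anchor.
$map = [];
foreach ($tempArr as $i => $anchor) {
    $map['tempRep#' . $i] = $anchor;
}

// strtr() with an array tries the longest keys first and does not rescan
// replaced text, so tempRep#10 is not treated as tempRep#1 followed by "0".
$restored = strtr($text, $map);

echo $naive, PHP_EOL;
echo $restored, PHP_EOL;
```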
