Check the next node with XPath



I'm stuck on my code. I need to convert this simple HTML:

<p>
    Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod
</p>
<p>
    <img src="xxxxx" />
</p>
<div class="sourceimg">azerrty</div>
<p>
    <img src="xxxxx">
</p>
<p>
    Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod
</p>
<p>
    <img src="xxxxx">
</p>
<div class="sourceimg">qwerty</div>
<p>
    Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod
</p>

into this:

<p>
    Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod
</p>
<figure>
    <img src="xxxxx" />
    <figcaption>
        <cite>
            azerrty
        </cite>
    </figcaption>
</figure>
<p>
    Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod
</p>
<figure>
    <img src="xxxxx">
</figure>
<figure>
    <img src="xxxxx">
    <figcaption>
        <cite>
            qwerty
        </cite>
    </figcaption>
</figure>
<p>
    Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod
</p>

I managed to wrap the img in a figure tag, but I don't know how to check whether the next node (<div class="sourceimg"> xxx </div>) exists.

This is what I did:

<?php
ini_set('display_errors', 1);
ini_set('display_startup_errors', 1);
error_reporting(E_ALL);
$html = <<<EOF
<p>
    Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod
</p>
<p>
    <img src="xxxxx" />
</p>
<div class="sourceimg">azerrty</div>
<p>
    <img src="xxxxx">
</p>
<div class="sourceimg">azerrty</div>
<p>
    Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod
</p>
<p>
    <img src="xxxxx">
</p>
<div class="sourceimg">qwerty</div>
<p>
    Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod
</p>
EOF;
libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
$matches = $xpath->query('//p//img');
if($matches->length > 0){
    foreach($matches as $node){
        $figure_node = $dom->createElement('figure');
        $node->parentNode->replaceChild($figure_node, $node);
        $figure_node->appendChild($node);

    }
}
$contenu = $dom->saveHTML();
echo $contenu;
?>

And the output:

<p>
    <figure><img src="xxxxx">
    </figure>
</p>
<div class="sourceimg">azerrty</div>
<p>
    <figure><img src="xxxxx">
    </figure>
</p>
<div class="sourceimg">azerrty</div>
<p>
    <figure><img src="xxxxx">
    </figure>
</p>
<div class="sourceimg">qwerty</div>

[Updated code] I would do the following:

...
libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
// Select the paragraphs (not the images) so each <p> can be replaced as a whole.
$matches = $xpath->query('//p');
$matchesDivs = $xpath->query('//div');
if($matches->length > 0 && $matchesDivs->length > 0){
    $divSeen = [];
    $step = -1;
    foreach($matches as $node){
         // Skip paragraphs that contain no image.
         if($node->getElementsByTagName('img')->length == 0){
             continue;
         }
         $step++;
         $figure_node = $dom->createElement('figure');
         $figure_node->appendChild($node->getElementsByTagName('img')->item(0));
         $node->parentNode->replaceChild($figure_node, $node);
         // Images and divs are paired by position; skip if the divs have run out.
         $div = $matchesDivs->item($step);
         if($div === null){
             continue;
         }
         // Add a caption only the first time a given source text appears.
         if(!in_array($div->nodeValue, $divSeen)){
            $figCaption_node = $dom->createElement('figcaption');
            $cite_node = $dom->createElement('cite', $div->nodeValue);
            $figCaption_node->appendChild($cite_node);
            $figure_node->appendChild($figCaption_node);
            $divSeen[] = $div->nodeValue;
         }
         $div->parentNode->removeChild($div);
    }
}
$contenu = $dom->saveHTML();
echo $contenu;
?>

Check it here: eval.in

Output:

<p>
   Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod
</p>
<figure><img src="xxxxx"><figcaption><cite>azerrty</cite></figcaption></figure>
<figure><img src="xxxxx"></figure>
<p>
  Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod
</p>
<figure><img src="xxxxx"><figcaption><cite>qwerty</cite></figcaption></figure>
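
The code above pairs images and divs by position, which may drift when an image has no source div after it. A minimal sketch of an alternative, assuming the div always comes directly after the image's paragraph: ask XPath for the first following sibling element and keep it only if it is a div with class "sourceimg". The function name wrapImagesInFigures is made up for this example, and the input is assumed to be plain ASCII HTML.

```php
<?php
// Sketch: turn each <p> holding an <img> into a <figure>, and add a caption only
// when the element right after that paragraph is <div class="sourceimg">.
// wrapImagesInFigures() is a hypothetical helper name, not from the original code.
function wrapImagesInFigures(string $html): string
{
    libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();
    $xpath = new DOMXPath($dom);

    // Paragraphs that contain an image. The query result is a static list,
    // so replacing nodes inside the loop does not disturb the iteration.
    foreach ($xpath->query('//p[.//img]') as $p) {
        $img = $xpath->query('.//img', $p)->item(0);

        // First following sibling element, kept only if it is the source div.
        // item(0) returns null when nothing matches.
        $div = $xpath->query(
            'following-sibling::*[1][self::div][@class="sourceimg"]',
            $p
        )->item(0);

        $figure = $dom->createElement('figure');
        $figure->appendChild($img);

        if ($div !== null) {
            $caption = $dom->createElement('figcaption');
            $cite = $dom->createElement('cite');
            $cite->appendChild($dom->createTextNode(trim($div->textContent)));
            $caption->appendChild($cite);
            $figure->appendChild($caption);
            $div->parentNode->removeChild($div);
        }

        $p->parentNode->replaceChild($figure, $p);
    }

    // Serialize only the children of <body>, leaving out the doctype and wrappers.
    $out = '';
    foreach ($dom->getElementsByTagName('body')->item(0)->childNodes as $child) {
        $out .= $dom->saveHTML($child);
    }
    return $out;
}
```

Calling `echo wrapImagesInFigures($html);` with the $html string from the question should give every image a figure, with a figcaption only where a sourceimg div immediately follows. The `*[1]` step is what limits the check to the very next element; dropping it would match any later sourceimg div as well.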