DOMXPath: evaluating a string at a variable nesting depth



I've been using DOMXPath and I like it, but I need it to be a bit more intuitive. Some clients have added extra HTML to their markup, and that breaks our project.

Example 1:

<div class="Fooen">
<span class="FooTitle">Overdracht</span>
<span class="Foo  koopprijs">
<span class="FooName">Vraagprijs</span> 
<span class="FooValue">€ 299.000,-</span>
</span>
<span class="Foo  aanvaarding">
<span class="FooName">Aanvaarding</span> 
<span class="FooValue">In overleg</span>
</span>
</div>

We can get the span names and values like this:

$filtered = $domxpath->query("//div[@class='Fooen']/span");
foreach ($filtered as $myItem) {
    $temp_name = $domxpath->evaluate("string(descendant::span[@class='FooName'])", $myItem);
    // Strip all whitespace from the label and lowercase it.
    $name      = strtolower(preg_replace('/\s+/', '', $temp_name));
    $value     = $domxpath->evaluate("string(descendant::span[@class='FooValue'])", $myItem);
}

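However, sometimes a client has added markup of their own, so the nodes now sit deeper in the tree. I can't seem to find a way to reach them without mapping out the entire path.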

Example 2:

<div class="Fooen">
<div>
<div class="blok-sizer"></div>
<div id="" class="block">
<div class="top">
<div class="center column"></div>
</div>
<div class="middle">
<div class="center column">
<span class="FooTitle">Overdracht</span>
<span class="Foo first transactiestatus">
<span class="FooName">Status</span>
<span class="FooValue">Beschikbaar</span>
</span>
<span class="Foo  koopprijs">
<span class="FooName">Vraagprijs</span> 
<span class="FooValue">€ 975.000,-</span>
</span>
</div>
</div>
</div>
</div>
</div>

But now this no longer works:

$filtered = $domxpath->query("//div[@class='Fooen']/span");
foreach ($filtered as $myItem) {
    $temp_name = $domxpath->evaluate("string(descendant::span[@class='FooName'])", $myItem);
    // Strip all whitespace from the label and lowercase it.
    $name      = strtolower(preg_replace('/\s+/', '', $temp_name));
    $value     = $domxpath->evaluate("string(descendant::span[@class='FooValue'])", $myItem);
}

I have tried variations such as:

$domxpath->evaluate("string(descendant::*[@class='FooName'])", $myItem);            
$domxpath->evaluate("string(//*[@class='FooName'])", $myItem);            
$domxpath->evaluate("string(*[@class='FooName'])", $myItem); 
$domxpath->evaluate("string(.//span[@class='FooName'])", $myItem); 
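
For reference, here is a small sketch of how these variants differ when a context node is passed in. The markup is a made-up, trimmed-down version of the examples above, and the variable names are only illustrative. The point is that an expression starting with `//` is evaluated from the document root and ignores the context node, while `.//`, `descendant::` and a bare `*` are evaluated relative to it.

```php
<?php
// Hypothetical markup: two wrapper spans, each holding its own FooName.
$html = '<div class="Fooen"><div>'
      . '<span class="Foo a"><span class="FooName">Status</span></span>'
      . '<span class="Foo b"><span class="FooName">Vraagprijs</span></span>'
      . '</div></div>';

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// Use the second wrapper span as the context node.
$context = $xpath->query("//span[@class='Foo b']")->item(0);

// Absolute path: searched from the document root, so $context is ignored
// and the first FooName in the whole document is returned.
$absolute = $xpath->evaluate("string(//*[@class='FooName'])", $context);

// Relative paths: searched inside $context only.
$relative   = $xpath->evaluate("string(.//span[@class='FooName'])", $context);
$descendant = $xpath->evaluate("string(descendant::span[@class='FooName'])", $context);

// Child axis: only direct children of $context are considered.
$child = $xpath->evaluate("string(*[@class='FooName'])", $context);

echo $absolute, "\n";   // text of the first FooName in the document
echo $relative, "\n";   // text of the FooName inside the context span
```

So the inner lookups are already depth-independent as long as they are relative; whether the loop runs at all depends on the outer query that produces the context nodes.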

Is there a way to get the string result even when it is not in the same place every time, so that the code is more flexible?

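Edit: here is a ready-to-copy-and-paste example of what I'm currently working with. The first part works; the second is the one I want to work from top to bottom, flexibly rather than through a fixed path. If I knew how to set up a fiddle I would, sorry.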

<?php
function getDom($url = "")
{
$str            = $url;
$internalErrors = libxml_use_internal_errors(true);
$dom            = new DOMDocument('1.0', 'UTF-8');
$dom->loadHTML($str);
libxml_use_internal_errors($internalErrors);
return $dom;
}
$domcode = '<div class="Fooen">
<span class="FooTitle">Overdracht</span>
<span class="Foo  koopprijs">
<span class="FooName">Vraagprijs</span> 
<span class="FooValue">€ 299.000,-</span>
</span>
<span class="Foo  aanvaarding">
<span class="FooName">Aanvaarding</span> 
<span class="FooValue">In overleg</span>
</span>
</div>';
$dom                  = getDom($domcode);
$html                 = '';
$domxpath             = new DOMXPath($dom);
$newDom               = new DOMDocument;
$newDom->formatOutput = true;

$filtered = $domxpath->query("//div[@class='Fooen']/span");
foreach ($filtered as $myItem) {
$temp_name = $domxpath->evaluate("string(descendant::span[@class='FooName'])", $myItem);
echo strtolower(preg_replace('/\s+/', '', $temp_name));
echo " = ";
echo $domxpath->evaluate("string(descendant::span[@class='FooValue'])", $myItem);
echo "<br>";
}

echo "<br>";
$domcode = '
<div class="Fooen">
<div>
<div class="blok-sizer"></div>
<div id="" class="block">
<div class="top">
<div class="center column"></div>
</div>
<div class="middle">
<div class="center column">
<span class="FooTitle">Overdracht</span>
<span class="Foo first transactiestatus">
<span class="FooName">Status</span>
<span class="FooValue">Beschikbaar</span>
</span>
<span class="Foo  koopprijs">
<span class="FooName">Vraagprijs</span> 
<span class="FooValue">€ 975.000,-</span>
</span>
</div>
</div>
</div>
</div>
</div>';

$dom                  = getDom($domcode);
$html                 = '';
$domxpath             = new DOMXPath($dom);
$newDom               = new DOMDocument;
$newDom->formatOutput = true;
$filtered = $domxpath->query("//div[@class='center column']/span");
foreach ($filtered as $myItem) {
$temp_name = $domxpath->evaluate("string(descendant::span[@class='FooName'])", $myItem);
echo "<br>";
echo strtolower(preg_replace('/\s+/', '', $temp_name));
echo " = ";
echo $domxpath->evaluate("string(descendant::span[@class='FooValue'])", $myItem);
}

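It turns out I had been hammering at the wrong line of code all day. Evidently I needed to widen the filtering query. If there is room for a less greedy version, I'm all ears. Otherwise, I hope this helps someone else.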

$filtered = $domxpath->query("//div[@class='Fooen']/descendant::span");
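
One possible less greedy variant, offered as a sketch rather than a definitive fix: `descendant::span` matches every span under the div, including the `FooTitle`, `FooName` and `FooValue` spans themselves, so the loop also visits nodes for which both lookups come back empty. The query below instead selects only spans whose class list contains `Foo` as a whole token, at any depth. The sample markup (with the euro sign left out to sidestep encoding concerns) and the `$rows` array are assumptions made for illustration.

```php
<?php
// Hypothetical, trimmed-down version of the nested markup from example 2.
$html = '<div class="Fooen"><div><div class="middle"><div class="center column">'
      . '<span class="FooTitle">Overdracht</span>'
      . '<span class="Foo first transactiestatus">'
      . '<span class="FooName">Status</span> <span class="FooValue">Beschikbaar</span>'
      . '</span>'
      . '<span class="Foo  koopprijs">'
      . '<span class="FooName">Vraagprijs</span> <span class="FooValue">975.000,-</span>'
      . '</span>'
      . '</div></div></div></div>';

$dom = new DOMDocument();
$previous = libxml_use_internal_errors(true);
$dom->loadHTML($html);
libxml_use_internal_errors($previous);
$xpath = new DOMXPath($dom);

// Match "Foo" as a whole class token at any depth below the Fooen div.
// Padding the normalized class list with spaces keeps FooName, FooValue
// and FooTitle from matching.
$wrappers = $xpath->query(
    "//div[@class='Fooen']//span[contains(concat(' ', normalize-space(@class), ' '), ' Foo ')]"
);

$rows = [];
foreach ($wrappers as $wrapper) {
    // Relative lookups, so each name/value pair comes from its own wrapper.
    $name  = $xpath->evaluate("string(.//span[@class='FooName'])", $wrapper);
    $value = $xpath->evaluate("string(.//span[@class='FooValue'])", $wrapper);
    $rows[strtolower(preg_replace('/\s+/', '', $name))] = trim($value);
}

print_r($rows);
```

With this query the loop runs once per name/value wrapper, regardless of how many extra divs a client wraps around them.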
