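I have been using DOMXPath and I like it, but I need it to be a little more intuitive. Some clients add extra HTML to their markup, and that breaks our project.
Example 1: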
<div class="Fooen">
    <span class="FooTitle">Overdracht</span>
    <span class="Foo koopprijs">
        <span class="FooName">Vraagprijs</span>
        <span class="FooValue">€ 299.000,-</span>
    </span>
    <span class="Foo aanvaarding">
        <span class="FooName">Aanvaarding</span>
        <span class="FooValue">In overleg</span>
    </span>
</div>
We can get the span names and values like this:
$filtered = $domxpath->query("//div[@class='Fooen']/span");
foreach ($filtered as $myItem) {
    $temp_name = $domxpath->evaluate("string(descendant::span[@class='FooName'])", $myItem);
    // Strip whitespace so the name can be used as a key.
    $name = strtolower(preg_replace('/\s+/', '', $temp_name));
    $value = $domxpath->evaluate("string(descendant::span[@class='FooValue'])", $myItem);
}
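Sometimes, however, a client adds markup of their own, so the nodes now sit deeper in the tree. I can't seem to find a solution that doesn't involve mapping the path all the way down.
Example 2: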
<div class="Fooen">
    <div>
        <div class="blok-sizer"></div>
        <div id="" class="block">
            <div class="top">
                <div class="center column"></div>
            </div>
            <div class="middle">
                <div class="center column">
                    <span class="FooTitle">Overdracht</span>
                    <span class="Foo first transactiestatus">
                        <span class="FooName">Status</span>
                        <span class="FooValue">Beschikbaar</span>
                    </span>
                    <span class="Foo koopprijs">
                        <span class="FooName">Vraagprijs</span>
                        <span class="FooValue">€ 975.000,-</span>
                    </span>
                </div>
            </div>
        </div>
    </div>
</div>
But now this no longer works:
$filtered = $domxpath->query("//div[@class='Fooen']/span");
foreach ($filtered as $myItem) {
    $temp_name = $domxpath->evaluate("string(descendant::span[@class='FooName'])", $myItem);
    // Strip whitespace so the name can be used as a key.
    $name = strtolower(preg_replace('/\s+/', '', $temp_name));
    $value = $domxpath->evaluate("string(descendant::span[@class='FooValue'])", $myItem);
}
I have tried variations like these:
$domxpath->evaluate("string(descendant::*[@class='FooName'])", $myItem);
$domxpath->evaluate("string(//*[@class='FooName'])", $myItem);
$domxpath->evaluate("string(*[@class='FooName'])", $myItem);
$domxpath->evaluate("string(.//span[@class='FooName'])", $myItem);
Is there a way to get the string result even when it isn't in the same position every time, so the code is more flexible?
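One way to narrow a problem like this down is to count what each query returns before looking at the inner expressions. The sketch below uses a reduced, hypothetical version of the nested markup (the variable names are mine, not from the project) and compares the child step `/span` with the descendant step `//span`:

```php
<?php
// Hypothetical reduced markup: the Foo spans sit two levels below div.Fooen.
$html = '<div class="Fooen"><div><div class="center column">'
      . '<span class="Foo koopprijs">'
      . '<span class="FooName">Vraagprijs</span>'
      . '<span class="FooValue">975.000</span>'
      . '</span></div></div></div>';

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// Child step: only spans that are direct children of div.Fooen.
$direct = $xpath->query("//div[@class='Fooen']/span");
// Descendant step: spans at any depth below div.Fooen.
$nested = $xpath->query("//div[@class='Fooen']//span");

echo $direct->length, "\n"; // 0
echo $nested->length, "\n"; // 3
```

If the outer query returns no nodes, the `foreach` body never runs, so changing the expressions inside the loop cannot make a difference.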
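Edit: here is a ready-to-run copy/paste example of what I'm currently using. The first part works; the second is the one I want to work from top to bottom without a hard-coded path, so that it stays flexible. If I knew how to set up a fiddle I would, sorry.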
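<?php
// Parses an HTML string into a DOMDocument while suppressing libxml warnings.
function getDom($html = "")
{
    $internalErrors = libxml_use_internal_errors(true);
    $dom = new DOMDocument('1.0', 'UTF-8');
    // loadHTML() does not assume UTF-8 input; the XML declaration prefix is a
    // common workaround so characters such as "€" survive parsing.
    $dom->loadHTML('<?xml encoding="UTF-8">' . $html);
    libxml_use_internal_errors($internalErrors);
    return $dom;
}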
$domcode = '<div class="Fooen">
<span class="FooTitle">Overdracht</span>
<span class="Foo koopprijs">
<span class="FooName">Vraagprijs</span>
<span class="FooValue">€ 299.000,-</span>
</span>
<span class="Foo aanvaarding">
<span class="FooName">Aanvaarding</span>
<span class="FooValue">In overleg</span>
</span>
</div>';
$dom = getDom($domcode);
$html = '';
$domxpath = new DOMXPath($dom);
$newDom = new DOMDocument;
$newDom->formatOutput = true;
$filtered = $domxpath->query("//div[@class='Fooen']/span");
foreach ($filtered as $myItem) {
    $temp_name = $domxpath->evaluate("string(descendant::span[@class='FooName'])", $myItem);
    // Strip whitespace from the name before printing it.
    echo strtolower(preg_replace('/\s+/', '', $temp_name));
    echo " = ";
    echo $domxpath->evaluate("string(descendant::span[@class='FooValue'])", $myItem);
    echo "<br>";
}
echo "<br>";
$domcode = '
<div class="Fooen">
<div>
<div class="blok-sizer"></div>
<div id="" class="block">
<div class="top">
<div class="center column"></div>
</div>
<div class="middle">
<div class="center column">
<span class="FooTitle">Overdracht</span>
<span class="Foo first transactiestatus">
<span class="FooName">Status</span>
<span class="FooValue">Beschikbaar</span>
</span>
<span class="Foo koopprijs">
<span class="FooName">Vraagprijs</span>
<span class="FooValue">€ 975.000,-</span>
</span>
</div>
</div>
</div>
</div>
</div>';
$dom = getDom($domcode);
$html = '';
$domxpath = new DOMXPath($dom);
$newDom = new DOMDocument;
$newDom->formatOutput = true;
$filtered = $domxpath->query("//div[@class='center column']/span");
foreach ($filtered as $myItem) {
    $temp_name = $domxpath->evaluate("string(descendant::span[@class='FooName'])", $myItem);
    echo "<br>";
    // Strip whitespace from the name before printing it.
    echo strtolower(preg_replace('/\s+/', '', $temp_name));
    echo " = ";
    echo $domxpath->evaluate("string(descendant::span[@class='FooValue'])", $myItem);
}
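It turns out I had been hammering away at the wrong line of code all day. Apparently I needed to widen the search in the filter query. If there is room for less greedy code, I'm all ears. Otherwise, I hope this helps someone else.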
$filtered = $domxpath->query("//div[@class='Fooen']/descendant::span");
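A possibly less greedy variant, offered as a sketch rather than a definitive fix: `descendant::span` also matches the `FooTitle`, `FooName` and `FooValue` spans themselves, so the loop runs for nodes that are not name/value wrappers. Matching only spans that carry `Foo` as a whole class token avoids that. The helper name `extractFooPairs` and the shortened sample markup are my own assumptions for illustration:

```php
<?php
// Hypothetical helper: returns [name => value] for every wrapper span that
// carries "Foo" as a whole class token, at any depth below div.Fooen.
function extractFooPairs(string $html): array
{
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    // The XML declaration prefix is a common workaround so loadHTML() reads UTF-8.
    $dom->loadHTML('<?xml encoding="UTF-8">' . $html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    // The concat/normalize-space idiom matches "Foo" but not "FooName" or "FooValue".
    $wrappers = $xpath->query(
        "//div[@class='Fooen']//span[contains(concat(' ', normalize-space(@class), ' '), ' Foo ')]"
    );

    $pairs = [];
    foreach ($wrappers as $wrapper) {
        $name = $xpath->evaluate("string(.//span[@class='FooName'])", $wrapper);
        $value = $xpath->evaluate("string(.//span[@class='FooValue'])", $wrapper);
        $pairs[strtolower(preg_replace('/\s+/', '', $name))] = trim($value);
    }
    return $pairs;
}

// Shortened versions of the flat and nested markup from the examples.
$flat = '<div class="Fooen"><span class="FooTitle">Overdracht</span>'
      . '<span class="Foo koopprijs"><span class="FooName">Vraagprijs</span>'
      . '<span class="FooValue">€ 299.000,-</span></span></div>';
$nested = '<div class="Fooen"><div><div class="center column">'
        . '<span class="Foo first transactiestatus"><span class="FooName">Status</span>'
        . '<span class="FooValue">Beschikbaar</span></span>'
        . '</div></div></div>';

print_r(extractFooPairs($flat));
print_r(extractFooPairs($nested));
```

The same function handles both layouts because the outer query no longer depends on how many `div` elements the client wraps around the spans.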