PHP XPath evaluate returns duplicate data, only the first row



This is my PHP code:

<?php
error_reporting(E_ALL);
ini_set("display_errors",1);
ini_set('max_execution_time', 36000); // 36000 seconds = 10 hours
$url = 'http://www.sportstats.com/soccer/matches/20170815/';
libxml_use_internal_errors(true); 
$doc = new DOMDocument();
$doc->loadHTMLFile($url);
$xpath = new DOMXpath($doc);

$data = array(
'HomeTeam' => $xpath->evaluate('string(//td[@class="table-home"]/a)'),
'AwayTeam' => $xpath->evaluate('string(//td[contains(@class, "table-away")]/a)'),
'FtScore' => $xpath->evaluate('string(normalize-space(translate(//td[@class="result-neutral"]," " ,"")))'),
'HomeTeamid' => $xpath->evaluate('substring-before(substring-after(substring-after(//td[@class="table-home"]/a/@href, "/soccer/"),"-"),"/")'),
'AwayTeamid' => $xpath->evaluate('substring-before(substring-after(substring-after(//td[@class="table-away"]/a/@href, "/soccer/"),"-"),"/")')
);
foreach ($data as $key) {
echo $data['HomeTeamid'].",";
echo $data['HomeTeam'].",";
echo $data['FtScore'].",";
echo $data['AwayTeam'].",";
echo $data['AwayTeamid']."<br/>";
}
?>

But the script prints the same row over and over:

n3QdnjFB,Santos,0-0,Fluminense,EV9L3kU4
n3QdnjFB,Santos,0-0,Fluminense,EV9L3kU4
n3QdnjFB,Santos,0-0,Fluminense,EV9L3kU4
n3QdnjFB,Santos,0-0,Fluminense,EV9L3kU4
n3QdnjFB,Santos,0-0,Fluminense,EV9L3kU4

But I want it to look like this:

HTeamid,Santos,0-0,Fluminense,ATeamid
HTeamid,Cartagena,1-0,Llaneros,ATeamid
HTeamid,Cerro Porteno,1-1,Libertad Asuncion,ATeamid
HTeamid,Operario,2-1,Maranhao,ATeamid
HTeamid,Emelec,2-0,Fuerza,ATeamid
...
..
.

I looked at other questions on the site but did not find an answer. How can I get the data for all the other teams using echo (I don't want to use var_dump)? Thanks.

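There are two mistakes here. First, you start the location paths with //td, which makes them relative to the document instead of to a single match row. Second, the string functions (string(), substring-after(), ...) only look at the first node of a node list and return its text content. Together this means you always get the first game. The foreach then loops over the five keys of $data and echoes the same five values each time, which is why the line appears five times.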
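To see the first-node behaviour in isolation, here is a minimal sketch. The two-row table is made-up sample markup, not the real page; it only assumes the same class name as in the question.

```php
<?php
// Hypothetical two-row sample standing in for the real page.
$html = '<table>'
    . '<tr><td class="table-home"><a>Santos</a></td></tr>'
    . '<tr><td class="table-home"><a>Cartagena</a></td></tr>'
    . '</table>';

libxml_use_internal_errors(true);
$document = new DOMDocument();
$document->loadHTML($html);
$xpath = new DOMXPath($document);

// The path matches both links, but string() converts only the first one.
$first = $xpath->evaluate('string(//td[@class="table-home"]/a)');
$count = $xpath->evaluate('count(//td[@class="table-home"]/a)');

echo $first, ' (', $count, " matches)\n";
```

Both links match, yet only Santos is returned, which is the behaviour seen in the question.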

The typical structure for fetching list data is:

foreach ($xpath->evaluate($exprForItems) as $item) {
    $detail = $xpath->evaluate($exprForDetail, $item);
}

A more specific example:

// $html holds the source of the page
$document = new DOMDocument();
$document->loadHTML($html);
$xpath = new DOMXPath($document);
$expressions = new stdClass();
// this is the expression for items - it returns a node list
$expressions->games = '//div[@id = "LS_todayMatchesContent"]/table/tbody/tr';
// these are the detail expressions - they are relative to a row and return a string
$expressions->home = 'string(td[@class = "table-home"]/a)';
$expressions->homeId = 'substring-before(substring-after(substring-after(td[@class="table-home"]/a/@href, "/soccer/"),"-"),"/")';
$expressions->away = 'string(td[@class = "table-away"]/a)';
foreach ($xpath->evaluate($expressions->games) as $game) {
    // the second argument is the context node for the relative expression
    var_dump(
        [
            $xpath->evaluate($expressions->home, $game),
            $xpath->evaluate($expressions->homeId, $game),
            $xpath->evaluate($expressions->away, $game)
        ]
    );
}

Output:

array(3) {
[0]=>
string(6) "Santos"
[1]=>
string(8) "n3QdnjFB"
[2]=>
string(10) "Fluminense"
}
array(3) {
[0]=>
string(9) "Cartagena"
[1]=>
string(8) "6eofBSjQ"
[2]=>
string(8) "Llaneros"
}
//...

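So only the detail expressions use string functions, and they always need the item node as the context (the second argument). You have to be careful with the context.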
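Since the question asks for echo rather than var_dump, the same pattern can be sketched as follows. This is only an illustration under assumptions: the inline markup, the href layout and the id AWAYID02 are invented sample data, the row expression is simplified to fit that sample, and the $idOf helper is a hypothetical convenience, not something from the original code.

```php
<?php
// Hypothetical sample markup; the real page structure may differ.
$html = <<<'HTML'
<table>
  <tr>
    <td class="table-home"><a href="/soccer/santos-n3QdnjFB/">Santos</a></td>
    <td class="result-neutral">0 - 0</td>
    <td class="table-away"><a href="/soccer/fluminense-EV9L3kU4/">Fluminense</a></td>
  </tr>
  <tr>
    <td class="table-home"><a href="/soccer/cartagena-6eofBSjQ/">Cartagena</a></td>
    <td class="result-neutral">1 - 0</td>
    <td class="table-away"><a href="/soccer/llaneros-AWAYID02/">Llaneros</a></td>
  </tr>
</table>
HTML;

libxml_use_internal_errors(true);
$document = new DOMDocument();
$document->loadHTML($html);
$xpath = new DOMXPath($document);

// Hypothetical helper: builds the relative id expression for one cell.
$idOf = function (string $cell): string {
    return 'substring-before(substring-after(substring-after('
        . $cell . '/a/@href, "/soccer/"), "-"), "/")';
};

// Detail expressions have no leading //, so they apply to one row only.
$home = 'td[@class="table-home"]';
$away = 'td[contains(@class, "table-away")]';
$details = [
    'HomeTeamid' => $idOf($home),
    'HomeTeam'   => 'string(' . $home . '/a)',
    'FtScore'    => 'translate(normalize-space(td[@class="result-neutral"]), " ", "")',
    'AwayTeam'   => 'string(' . $away . '/a)',
    'AwayTeamid' => $idOf($away),
];

$lines = [];
foreach ($xpath->evaluate('//tr[td[@class="table-home"]]') as $row) {
    $fields = [];
    foreach ($details as $expression) {
        // The row is the context node for every detail expression.
        $fields[] = $xpath->evaluate($expression, $row);
    }
    $lines[] = implode(',', $fields);
}

echo implode("<br/>\n", $lines), "<br/>\n";
```

With the sample markup this prints one comma-separated line per row, starting with n3QdnjFB,Santos,0-0,Fluminense,EV9L3kU4. For the live page, $html would be replaced by the downloaded source (for example via loadHTMLFile as in the question) and the row expression adjusted to the real table.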

Try editing your XPath array like this:

'HomeTeam' => $xpath->query('//td[@class="table-home"]/a'),
'AwayTeam' => $xpath->query('//td[contains(@class, "table-away")]/a'),
'FtScore' => $xpath->query('//td[@class="result-neutral"]'),
...

I used query() instead of evaluate() and changed the paths so that they return node lists rather than strings.

Then you can echo the results like this:

foreach ($data as $dataKey => $dataValue) {
    foreach ($dataValue as $key => $element) {
        $nodes = $element->childNodes;
        foreach ($nodes as $node) {
            $tag = $node->nodeValue;
            echo $dataKey.' - '.$key.' - '.$tag.'<br>';  //$dataKey and $key are just informative
        }
    }
    echo '<br>';
}

For me, it lists:

HomeTeam - 0 - Santos
HomeTeam - 1 - Cartagena
HomeTeam - 2 - Cerro Porteno
HomeTeam - 3 - Operario
HomeTeam - 4 - Boca Juniors
HomeTeam - 5 - Emelec
....
AwayTeam - 0 - Fluminense
AwayTeam - 1 - Llaneros
AwayTeam - 2 - Libertad Asuncion
AwayTeam - 3 - Maranhao
AwayTeam - 4 - Gimnasia y Tiro
AwayTeam - 5 - Fuerza A.
....

Of course, if you want to print the data in a meaningful way, you need to collect it into an array first.
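One way to do that collecting is to regroup the node lists by index, sketched below. The inline markup is made-up sample data, and the approach assumes every row has all three cells so that the indexes of the separate lists line up; if a row lacks a cell, the columns drift apart, which is why a per-row context node is usually safer.

```php
<?php
// Hypothetical sample markup; the real page structure may differ.
$html = '<table>'
    . '<tr><td class="table-home"><a>Santos</a></td>'
    . '<td class="result-neutral">0-0</td>'
    . '<td class="table-away"><a>Fluminense</a></td></tr>'
    . '<tr><td class="table-home"><a>Cartagena</a></td>'
    . '<td class="result-neutral">1-0</td>'
    . '<td class="table-away"><a>Llaneros</a></td></tr>'
    . '</table>';

libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

$data = [
    'HomeTeam' => $xpath->query('//td[@class="table-home"]/a'),
    'AwayTeam' => $xpath->query('//td[contains(@class, "table-away")]/a'),
    'FtScore'  => $xpath->query('//td[@class="result-neutral"]'),
];

// Turn the per-column node lists into one array per match.
$rows = [];
foreach ($data as $column => $nodes) {
    foreach ($nodes as $index => $node) {
        $rows[$index][$column] = trim($node->textContent);
    }
}

foreach ($rows as $row) {
    echo $row['HomeTeam'], ',', $row['FtScore'], ',', $row['AwayTeam'], "<br/>\n";
}
```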

Hope this is the answer you were looking for :) Have a nice day!
