Get meta tags from a URL with DOM Crawler

I have installed symfony/dom-crawler in my project. For testing, I am trying to read some meta tags from the URL of a random site.

$url = 'https://www.lala.rs/fun/this-news';
$crawler = new Crawler($url);
$data = $crawler->filterXpath("//meta[@name='description']")->extract(array('content'));

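It always returns [] as the result.

I tried the basic meta description, but maybe I am not understanding it correctly. I checked the Symfony documentation but could not find the right way to do this.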

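You need to pass the HTML content to new Crawler($html), not the URL. The Crawler does not download anything itself: a string passed to the constructor is parsed as markup, so a URL string contains no meta tags and the filter matches nothing.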

The example below queries viewport instead of description, because this page has no description meta tag. With that change it works fine:

<meta name="viewport" content="width=device-width, height=device-height, initial-scale=1.0, minimum-scale=1.0">

$url = 'https://stackoverflow.com/questions/66494027/get-meta-tags-from-url-with-dom-crawler';
$html = file_get_contents($url);
$crawler = new Crawler($html);
$data = $crawler->filterXpath("//meta[@name='viewport']")->extract(['content']);

Array
(
[0] => width=device-width, height=device-height, initial-scale=1.0, minimum-scale=1.0
)
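
The same idea can be sketched without a network request, which makes it easier to see that the Crawler parses markup rather than fetching it. This is only a minimal sketch: the helper name metaContent and the inline HTML string are hypothetical, and it assumes symfony/dom-crawler was installed with Composer so that vendor/autoload.php exists next to the script.

```php
<?php
// Assumption: symfony/dom-crawler was installed via Composer in this directory.
require __DIR__ . '/vendor/autoload.php';

use Symfony\Component\DomCrawler\Crawler;

/**
 * Hypothetical helper: returns the content attribute of every
 * <meta name="$name"> tag found in the given HTML string.
 * $name is interpolated into the XPath as-is, so it is assumed
 * to contain no quote characters.
 */
function metaContent(string $html, string $name): array
{
    // The Crawler receives markup, not a URL.
    $crawler = new Crawler($html);

    // An empty array means no matching tag exists in the document.
    return $crawler->filterXPath("//meta[@name='{$name}']")->extract(['content']);
}

// Made-up sample document with a viewport tag but no description tag.
$html = '<html><head><meta name="viewport" content="width=device-width"></head><body></body></html>';

print_r(metaContent($html, 'viewport'));    // the tag is present
print_r(metaContent($html, 'description')); // the tag is missing
```

When the HTML comes from file_get_contents($url), keep in mind that it returns false on failure, so it is probably worth checking the result before handing it to the Crawler.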
