Extract all gallery images from a specific URL using PHP



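I'm trying to extract all the images from a specific URL and store them in a specific folder. I did some research, but so far all I can get is a list of some of the images on the page.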

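<?php
// $url_image = $_GET['url'];
$url_image = 'https://www.thebridesofoklahoma.com/wedding-inspiration/elegant-yet-modern/';
$homepage = file_get_contents($url_image);
// The double quotes inside the pattern are escaped so they do not end the PHP string early.
preg_match_all("{<img\s*(.*?)src=('.*?'|\".*?\"|[^\s]+)(.*?)\s*/?>}ims", $homepage, $matches, PREG_SET_ORDER);
// print_r($matches);
foreach ($matches as $val) {
    // $val[2] is the src value including its surrounding quotes.
    $pos = strpos($val[2], "/");
    $link = substr($val[2], 1, -1); // strip the surrounding quotes
    if ($pos === 1) {
        // The value starts with "/", so prepend the site origin.
        echo "https://www.thebridesofoklahoma.com" . $link;
    } else {
        echo $link;
    }
    echo "<br>";
}
?>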

Can someone help me get a list of the URLs of only the images used in the gallery? Please see this page: https://www.thebridesofoklahoma.com/wedding-inspiration/elegant-yet-modern/

Don't use regex to parse HTML. Use DOMDocument instead. The images you want are wrapped in a div, so they are easy to find with XPath.

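// Collect parser warnings internally instead of printing them for imperfect real-world HTML.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTMLFile('https://www.thebridesofoklahoma.com/wedding-inspiration/elegant-yet-modern/');
$xpath = new DOMXPath($doc);
// Select every <img> that is a direct child of a <div>.
foreach ($xpath->query('//div/img') as $image) {
    echo $image->getAttribute('src'), PHP_EOL;
}

Output: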
https://images.thebridesofoklahoma.com/wp-content/uploads/2017/08/21161012/Logo-final-01-1024x339.jpg
https://images.thebridesofoklahoma.com/wp-content/uploads/profiles/1564/08234701/bl_bartending_horiz_reversed.jpg
https://images.thebridesofoklahoma.com/wp-content/uploads/2021/01/26203513/Screen-Shot-2021-01-26-at-8.32.49-PM-850x1024.png
https://images.thebridesofoklahoma.com/wp-content/uploads/profiles/65/10015531/static.squarespace.com_.jpg
https://images.thebridesofoklahoma.com/wp-content/uploads/2019/09/17131459/IMG_2157-1024x965.jpg
https://images.thebridesofoklahoma.com/wp-content/uploads/2021/01/04143016/kgc-photography-logo-copy-1024x479.png
https://images.thebridesofoklahoma.com/wp-content/uploads/2021/03/25104302/FB4A0746-420x290.jpg
https://images.thebridesofoklahoma.com/wp-content/uploads/2021/03/24120412/harvard-99-420x290.jpg
https://images.thebridesofoklahoma.com/wp-content/uploads/2021/03/22133337/Aaron_Snow_Photography_Hall_Wedding.AES_0256-420x290.jpg
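
Note that `//div/img` matches any image sitting directly inside any div, which is why the list above also contains logos. To get only the gallery images, the query can be scoped to the gallery's container. The following is a minimal sketch on inline markup; the `gallery` class name and the sample HTML are assumptions for illustration, so inspect the real page to find the actual container class or id.

```php
<?php
// Hypothetical markup standing in for the real page; the "gallery" class is an assumption.
$html = <<<'HTML'
<html><body>
<div class="header"><img src="/logo.jpg"></div>
<div class="gallery grid">
  <figure><img src="https://images.example.com/a.jpg"></figure>
  <figure><img src="/uploads/b.jpg"></figure>
</div>
<div class="sidebar"><img src="/ad.png"></div>
</body></html>
HTML;

libxml_use_internal_errors(true); // collect parser warnings instead of printing them
$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// Match <img> elements at any depth inside a div whose class list contains "gallery".
// The concat/normalize-space idiom avoids matching classes such as "gallery-nav".
$query = '//div[contains(concat(" ", normalize-space(@class), " "), " gallery ")]//img';

$galleryImages = [];
foreach ($xpath->query($query) as $image) {
    $galleryImages[] = $image->getAttribute('src');
}

print_r($galleryImages);
```

With this sample markup, only the two images inside the gallery container are collected; the header and sidebar images are skipped. Using `//img` after the container (instead of `/img`) also picks up images nested in wrappers such as `figure` or `a`.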

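The question also asks to store the images in a folder. A rough sketch of that step, assuming PHP 7.4 or later and that `allow_url_fopen` is enabled: `absolutize()` and `saveImages()` are hypothetical helper names, the sample `src` values are made up, and the URL handling covers only the common cases rather than full relative-URL resolution.

```php
<?php
// Turn a src value into an absolute URL. Handles absolute, protocol-relative
// and root-relative values only; this is not a complete URL resolver.
function absolutize(string $src, string $origin): string
{
    if (preg_match('#^https?://#i', $src)) {
        return $src;                 // already absolute
    }
    if (strpos($src, '//') === 0) {
        return 'https:' . $src;      // protocol-relative
    }
    // Anything else is treated as a path relative to the site origin.
    return rtrim($origin, '/') . '/' . ltrim($src, '/');
}

// Download each URL into $dir and return the list of file paths written.
function saveImages(array $urls, string $dir): array
{
    if (!is_dir($dir) && !mkdir($dir, 0755, true)) {
        throw new RuntimeException("Cannot create directory: $dir");
    }
    $saved = [];
    foreach ($urls as $url) {
        $data = @file_get_contents($url);
        if ($data === false) {
            continue;                // skip images that fail to download
        }
        // Use the last path segment as the file name; fall back to a hash.
        $name = basename((string) parse_url($url, PHP_URL_PATH));
        if ($name === '') {
            $name = md5($url);
        }
        $target = $dir . DIRECTORY_SEPARATOR . $name;
        if (file_put_contents($target, $data) !== false) {
            $saved[] = $target;
        }
    }
    return $saved;
}

$origin  = 'https://www.thebridesofoklahoma.com';
// Hypothetical src values, as they might come from getAttribute('src').
$sources = ['/wp-content/uploads/example.jpg', '//images.example.com/b.jpg'];
$urls    = array_map(fn($src) => absolutize($src, $origin), $sources);
print_r($urls);

// Uncomment to actually download into ./images next to this script:
// saveImages($urls, __DIR__ . '/images');
```

Files that share the same base name overwrite each other in this sketch; prefixing the name with a counter or a hash of the URL would avoid that.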