I solved the problem of downloading the source code of a Google search results page here. This is the code:
<!DOCTYPE html>
<html>
<body>
<!-- this program saves source code of a website to an external file -->
<!-- the string there for the fake user agent can be found here: http://useragentstring.com/index.php -->
<?php
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'https://www.google.com/search?q=blue+car');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (X11; Fedora; Linux x86_64; rv:88.0) Gecko/20100101 Firefox/88.0');
$html = curl_exec($ch);
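if (empty($html)) {
    // curl_exec() returned false or an empty body: show the cURL error message
    echo "<pre>cURL request failed:\n" . curl_error($ch) . "</pre>";
} else {
    // write the downloaded page source to file.txt
    $myfile = fopen("file.txt", "w") or die("Unable to open file!");
    fwrite($myfile, $html);
    fclose($myfile);
}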
?>
</body>
</html>
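Now I want 100 results instead of only 10. Changing my Google search settings has no effect on the code above. The number of results must be stored somewhere else, because it is not part of the query string when I search on Google...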
Use the &num parameter to specify the number of results to return (&num=xx).
So, in your case, change
curl_setopt($ch, CURLOPT_URL, 'https://www.google.com/search?q=blue+car');
to
curl_setopt($ch, CURLOPT_URL, 'https://www.google.com/search?q=blue+car&num=100');
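If the search terms or the result count vary, it may be easier to build the URL with http_build_query than to concatenate strings by hand, since it URL-encodes each value. The following is only a minimal sketch: build_search_url is a hypothetical helper name, not something from the original code, and it assumes Google honors the num parameter as described above.

```php
<?php
// Hypothetical helper (not part of the original post): builds the search URL
// from a query string and a desired result count.
function build_search_url(string $query, int $num = 100): string
{
    // http_build_query() URL-encodes the values; with the default RFC 1738
    // encoding a space becomes '+'. The '&' separator is passed explicitly
    // so the result does not depend on the arg_separator.output ini setting.
    $params = http_build_query(['q' => $query, 'num' => $num], '', '&');
    return 'https://www.google.com/search?' . $params;
}

// Prints: https://www.google.com/search?q=blue+car&num=100
echo build_search_url('blue car', 100), "\n";
```

The returned string can then be passed to curl_setopt($ch, CURLOPT_URL, ...) in place of the hard-coded URL.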