Symfony Dom-crawler未在laravel 9中找到



所以我有这个url $url = "localhost:8000/vehicles"我想通过cron作业获取但页面返回的是HTML所以我想用symfony dom crawler获取所有车辆而不是正则表达式

在我的文件的顶部我添加了

use SymfonyComponentDomCrawlerCrawler;

创建一个新的实例,我试了:

$crawler = new Crawler($data);

and I tried

$crawler = Crawler::create($data);

,但这给了我一个错误,也尝试添加

SymfonyComponentDomCrawlerCrawler::class,

到服务提供者,但是当我执行命令:

composer dump-autoload它给了我以下错误

In Crawler.php line 66:
SymfonyComponentDomCrawlerCrawler::__construct(): Argument #1 ($node) must be of type DOMNodeList|DOMNode|array|string|null, IlluminateFoundationApplication given, called in C:xampphtdocsDrostMachinehandelDrostMachinehandelvendorlaravelfr   
ameworksrcIlluminateFoundationProviderRepository.php on line 208

Script @php artisan package:discover --ansi handling the post-autoload-dump event returned with error code 1

我不知道如何解决这个问题。

获取url的函数如下:

public function handle()
{
$url = SettingsController::fetchSetting("fetch:vehicles");
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_TIMEOUT, 10);
$data = curl_exec($ch);
$vehicles = $this->scrapeVehicles($data, $url);
Log::debug($vehicles);
curl_close($ch);
}

private function scrapeVehicles(string $data, string $url): array
{
$crawler = Crawler::create($data);
$vehicles = $crawler->filter(".vehicleTile");
return $vehicles;
}

$data的内容:

https://pastebin.com/GJ300KEv

由于没有经过测试,我不确定。

确保你安装了正确的软件包

composer require symfony/dom-crawler

初始化,使用全路径。(因为它不是Laravel的方式(包))

$crawler = SymfonyComponentDomCrawlerCrawler::create($data);

composer require——dev" symfony/dom-crawler"; "^6.3.x-dev">

示例爬虫方法名称空间的应用程序 Http 控制器;

use GuzzleHttpExceptionGuzzleException;
use SymfonyComponentDomCrawlerCrawler;
class CrawlerController extends Controller
{
private $url;
public function __construct()
{
$this->url = "https://www.everything5pounds.com/en/Shoes/c/shoes/results?q=&page=6";
}
public function index()
{
$client = new GuzzleHttpClient();
try {
$response = $client->request('GET', $this->url);
if ($response->getStatusCode() == 200) {
$res = json_decode($response->getBody());
$results = $res->results;
return $results;
/*$results = (array)json_decode($res);

$products = array();
foreach ($results as $result) {
$product = [
"name" => $result["name"],
];
array_push($products, $product);
}
return $products;*/
//return $result;
//return $this->parseContent($result);

} else {
return $response->getReasonPhrase();
}
} catch (GuzzleException $e) {
return $e->getMessage();
}
}

解析内容并存储

public function parseContent($result)
{
$crawler = new Crawler($result);
$elements = $crawler->filter('.productGridItem')->each(function (Crawler $node, $i) {
return $node;
});
$products = array();
foreach ($elements as $item) {
$image = $item->filter('.thumb .productMainLink img')->attr('src');
$title = $item->filter('.productGridItem .details')->text();
$price = $item->filter('.productGridItem .priceContainer')->text();
$product = [
"image" => 'https:' . $image,
"title" => $title,
"price" => $price,
];
array_push($products, $product);
}
return $products;
}

相关内容

  • 没有找到相关文章

最新更新