我试图用rvest
发送搜索请求,但我总是得到相同的错误。我已经尝试了几种方法包括这个解决方案:https://gist.github.com/ibombonato/11507d776d1042f80ca59cd31509afd3
我的代码如下:
library(rvest)
url <- 'https://www.saferproducts.gov/PublicSearch'
cocorahs <- html_session(URL)
form.unfilled <- cocorahs %>% html_node("form") %>% html_form()
form.unfilled[["fields"]][[3]][["value"]] <- "input" ## This is the line which I think should be corrected
form.filled <- form.unfilled %>%
set_values("searchParameter.AdvancedKeyword" = "amazon")
session1 <- session_submit(cocorahs, form.filled, submit = NULL)
# or
session <- submit_form(cocorahs, form.filled)
但是我总是得到以下错误:
Error in `submission_build()`:
! `form` doesn't contain a `action` attribute
Run `rlang::last_error()` to see where the error occurred.
我认为方法是编辑这些按钮的属性。也许有人知道答案。提前谢谢。
httr2
的替代方法
library(tidyverse)
library(rvest)
library(httr2)
data <- "https://www.saferproducts.gov/PublicSearch" %>%
request() %>%
req_body_form(
"searchParameter.Keyword" = "Amazon"
) %>%
req_perform() %>%
resp_body_html()
tibble(
title = data %>%
html_elements(".document-title") %>%
html_text2(),
report_title = data %>%
html_elements(".info") %>%
html_text2() %>%
str_remove_all("r") %>%
str_squish()
)
#> # A tibble: 10 × 2
#> title repor…¹
#> <chr> <chr>
#> 1 Self balancing scooter was used off & on for three years. Consumer i… Incide…
#> 2 The consumer stated that when he opened one of the marshmallow roast… Incide…
#> 3 The consumer, 59, stated that he was welding with a brand new auto d… Incide…
#> 4 The consumer reported, that their hover soccer toy caught fire while… Incide…
#> 5 80 yr old male's electric hotplate was set between 1 and 2(of 5) bef… Incide…
#> 6 Amazon Recalls Amazon Basics Desk Chairs Due to Fall and Injury Haza… Recall…
#> 7 The Consumer reported to have been notified by email that the diarrh… Incide…
#> 8 consumer reported about light fixture attached to a photography umbr… Incide…
#> 9 Drive DeVilbiss Healthcare Recalls Adult Portable Bed Rails After Tw… Recall…
#> 10 MixBin Electronics Recalls iPhone Cases Due to Risk of Skin Irritati… Recall…
#> # … with abbreviated variable name ¹report_title
创建于2023-01-15与reprex v2.0.2