r - Returning the same number of elements from HTML with rvest



I am trying to use rvest to scrape the city names and addresses of all Apple Stores in the UK.

library(rvest)
library(xml2)
library(tidyverse)
my_url <- read_html("https://www.apple.com/uk/retail/storelist/")
# extract city name 
city_name <- my_url %>% html_elements("h2") %>% html_text2()
length(city_name)
# 27 cities
address <- my_url %>% html_elements("address") %>% html_text2()
length(address)
# 38 addresses

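I get more addresses than city names. This is because some cities have multiple stores. How can I get the same number of city names and addresses so that I can put them in a data frame?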

You could do:

library(rvest)
library(xml2)
library(tidyverse)
read_html("https://www.apple.com/uk/retail/storelist/") %>% 
  html_elements(xpath = "//div[@class='state']") %>%
  lapply(function(x) {
    data.frame(city = html_element(x, "h2") %>% html_text(), 
               address = html_elements(x, "address") %>% html_text2())}) %>%
  do.call(rbind, .) %>%
  as_tibble()
#> # A tibble: 38 x 2
#>    city            address                                                      
#>    <chr>           <chr>                                                        
#>  1 Aberdeen        "27/28 Ground Level Mall\nUnion Square\nAberdeen , AB11 ~
#>  2 Antrim          "Upper Ground Floor\n1 Victoria Square\nBelfast , BT1 4Q~
#>  3 Berkshire       "The Oracle Shopping Centre\nUpper Level\nReading , RG1 ~
#>  4 Bristol         "11 Philadelphia Street\nQuakers Friars\nBristol , BS1 3~
#>  5 Bristol         "Upper Mall\nThe Mall at Cribbs Causeway\nBristol , BS34~
#>  6 Buckinghamshire "26 Midsummer Place\nMidsummer Boulevard\nMilton Keynes ~
#>  7 Cambridgeshire  "Grand Arcade Shopping Centre\nCambridge , CB2 3AX\n0122~
#>  8 Cardiff         "63-66 Grand Arcade\nSt David’s Dewi Sant\nCardiff , CF1~
#>  9 Central London  "No. 1-7 The Piazza\nLondon , WC2E 8HB\n020 7447 1400"
#> 10 Central London  "235 Regent Street\nLondon , W1B 2EL\n020 7153 9000"
#> # ... with 28 more rows

Created on 2022-04-12 by the reprex package (v2.0.1)
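If it helps to see why this works without hitting the live site, here is a minimal offline sketch of the same idea. The HTML below is a made-up, simplified stand-in for the page (I am assuming each city sits in its own `div` with class `state`, as the XPath above implies), and the city and address values are only placeholders. Because each `data.frame()` call gets one city and one or more addresses, R recycles the city name across that city's addresses, so both columns end up the same length.

```r
library(rvest)

# Hypothetical, simplified stand-in for the store list page:
# one <div class="state"> per city, holding one <h2> and 1+ <address> nodes.
page <- minimal_html('
  <div class="state">
    <h2>Aberdeen</h2>
    <address>Union Square</address>
  </div>
  <div class="state">
    <h2>Bristol</h2>
    <address>Quakers Friars</address>
    <address>Cribbs Causeway</address>
  </div>
')

# One node per city block
groups <- html_elements(page, "div.state")

# Build one small data frame per city block, then stack them.
# The single city value is recycled to match the number of addresses.
stores <- do.call(rbind, lapply(groups, function(g) {
  data.frame(
    city = html_text2(html_element(g, "h2")),
    address = html_text2(html_elements(g, "address")),
    stringsAsFactors = FALSE
  )
}))

print(stores)
```

The key point is that the scraping is done per parent node rather than over the whole document, so each address stays tied to the heading it belongs to.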
