I am trying to use rvest to scrape the city names and addresses of all Apple Stores in the UK.
library(rvest)
library(xml2)
library(tidyverse)
my_url <- read_html("https://www.apple.com/uk/retail/storelist/")
# extract city name
city_name <- my_url %>% html_elements("h2") %>% html_text2()
length(city_name)
# 27 cities
address <- my_url %>% html_elements("address") %>% html_text2()
length(address)
# 38 addresses
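I get more addresses than city names, because some cities have more than one store. How can I get the same number of city names and addresses, so that I can put them in a data frame?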
You can do this:
library(rvest)
library(xml2)
library(tidyverse)
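read_html("https://www.apple.com/uk/retail/storelist/") %>%
  # select every div.state block; the h2 and address nodes are looked up inside each one
  html_elements(xpath = "//div[@class='state']") %>%
  lapply(function(x) {
    # the single city name is recycled across all addresses found in the same block
    data.frame(city = html_element(x, "h2") %>% html_text(),
               address = html_elements(x, "address") %>% html_text2())}) %>%
  do.call(rbind, .) %>%
  as_tibble()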
#> # A tibble: 38 x 2
#>    city            address
#>    <chr>           <chr>
#>  1 Aberdeen        "27/28 Ground Level Mall\nUnion Square\nAberdeen , AB11 ~
#>  2 Antrim          "Upper Ground Floor\n1 Victoria Square\nBelfast , BT1 4Q~
#>  3 Berkshire       "The Oracle Shopping Centre\nUpper Level\nReading , RG1 ~
#>  4 Bristol         "11 Philadelphia Street\nQuakers Friars\nBristol , BS1 3~
#>  5 Bristol         "Upper Mall\nThe Mall at Cribbs Causeway\nBristol , BS34~
#>  6 Buckinghamshire "26 Midsummer Place\nMidsummer Boulevard\nMilton Keynes ~
#>  7 Cambridgeshire  "Grand Arcade Shopping Centre\nCambridge , CB2 3AX\n0122~
#>  8 Cardiff         "63-66 Grand Arcade\nSt David’s Dewi Sant\nCardiff , CF1~
#>  9 Central London  "No. 1-7 The Piazza\nLondon , WC2E 8HB\n020 7447 1400"
#> 10 Central London  "235 Regent Street\nLondon , W1B 2EL\n020 7153 9000"
#> # ... with 28 more rows
Created on 2022-04-12 by the reprex package (v2.0.1)
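If it helps to see why pairing within each block fixes the count mismatch, here is a minimal offline sketch. It does not touch the live page: the HTML string is made up for illustration and only assumes the layout the XPath above relies on (one `div.state` per city, containing one `h2` and one or more `address` nodes). The names `page`, `blocks`, `n_addr` and `stores` are hypothetical. Instead of building one data frame per block, it counts the addresses in each block and repeats the city name that many times.

```r
library(rvest)

# Made-up markup imitating the assumed structure of the store list page
page <- minimal_html('
  <div class="state"><h2>Aberdeen</h2><address>Union Square</address></div>
  <div class="state"><h2>Bristol</h2>
    <address>Quakers Friars</address>
    <address>Cribbs Causeway</address>
  </div>')

# One node per city block
blocks <- html_elements(page, "div.state")

# Number of <address> nodes inside each block
n_addr <- vapply(blocks, function(b) length(html_elements(b, "address")), integer(1))

# Repeat each city name once per address so both columns have the same length
stores <- data.frame(
  city    = rep(html_text2(html_element(blocks, "h2")), times = n_addr),
  address = html_text2(html_elements(blocks, "address"))
)
stores
```

Both approaches rely on the same idea: search for the addresses relative to each city block rather than across the whole document, so every address keeps its city.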