R - Web scraping with ralger: "scrap" returns empty values



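I am trying to scrape job postings, but when I run the code to get the job descriptions I only get partial results. For example: total jobs = 1500, total links = 1500, descriptions = fewer than 1500. The number of descriptions also sometimes changes when I re-run the chunk that fetches them. I would appreciate help with how to get all the values, or how to turn the missing results into NA.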

library(ralger)
# Search method
base_link <- "https://www.indeed.com/jobs?q&l=mexico&from=searchOnHP&vjk=c339451b33a29c91"
links <- paste0(base_link, 1:100)
# Getting the links
scraped_url <- attribute_scrap(links, node = '[data-hide-spinner = "true"]', attr = 'href')
job_url <- paste0("https://www.indeed.com", scraped_url)
# Getting the job descriptions
job_description <- scrap(link = job_url, node = '.jobsearch-jobDescriptionText')
# Creating the data frame
df <- data.frame(job_description, job_url)

Error in data.frame(fullds, job_description, job_url) : 
arguments imply differing number of rows: 1500, 1485

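I was able to extract the job descriptions with RSelenium using the code below. I don't think you can extract all the information from the website with the ralger package, because the page is not fully loaded at the moment the information is extracted. RSelenium lets the page load before we extract the information. I have added an example for one link below.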

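library(RSelenium)
library(rvest)
# Start a Selenium server in Docker; system() is used instead of the
# Windows-only shell() so the call also works on Linux and macOS
system('docker run -d -p 4446:4444 selenium/standalone-firefox')
Sys.sleep(5)  # give the container a few seconds to start
remDr <- remoteDriver(remoteServerAddr = "localhost", port = 4446L, browserName = "firefox")
remDr$open()
base_link <- "https://www.indeed.com/jobs?q&l=mexico&from=searchOnHP&vjk=c339451b33a29c91"
links <- paste0(base_link, 1)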
job_Url <- list()
remDr$navigate(links)
# Scroll down in small steps so that the job cards further down the page get loaded
for (i in 1:200) {
  print(i)
  java_Script <- paste0("scroll(0,", i * 20, ")")
  remDr$executeScript(java_Script)
}
# Collect the link of each job card; list positions without a card are skipped
counter <- 0
for (i in 1:30) {
  print(i)
  xpath <- paste0('/html/body/main/div/div[1]/div/div/div[5]/div[1]/div[5]/div/ul/li[', i, ']/div/div[1]/div/div[1]/div/table[1]/tbody/tr/td/div[1]/h2/a')
  # findElement() throws an error when nothing matches, so fall back to NULL
  web_Obj <- tryCatch(remDr$findElement("xpath", xpath), error = function(e) NULL)

  if (!is.null(web_Obj)) {
    counter <- counter + 1
    job_Url[[counter]] <- web_Obj$getElementAttribute("href")[[1]]
    print(job_Url[[counter]])
  }
}
# Visit each job page and store its description (NA when it cannot be found)
nb_Job_Url <- length(job_Url)
list_Text_Job_Description <- list()
for (i in seq_len(nb_Job_Url)) {
  print(i)
  remDr$navigate(job_Url[[i]])
  Sys.sleep(2)  # give the page time to load
  list_Text_Job_Description[[i]] <- tryCatch(
    remDr$findElement("id", "jobDescriptionText")$getElementText()[[1]],
    error = function(e) NA_character_
  )
}
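
The fixed `Sys.sleep(2)` in the loop is only a guess at how long a page needs to load. A minimal sketch of a polling wait follows, assuming a hypothetical helper `wait_for()` (it is not part of RSelenium) that retries a function until it stops throwing an error or a timeout expires:

```r
# Hypothetical helper (not part of RSelenium): call `fn` repeatedly until it
# returns a non-NULL value without error; after `timeout` seconds give up
# and return `default`. A NULL result from `fn` is treated like a failure.
wait_for <- function(fn, timeout = 10, interval = 0.5, default = NA) {
  deadline <- Sys.time() + timeout
  repeat {
    result <- tryCatch(fn(), error = function(e) NULL)
    if (!is.null(result)) return(result)
    if (Sys.time() >= deadline) return(default)
    Sys.sleep(interval)
  }
}

# Usage sketch (assumes an open RSelenium session named `remDr`):
# element <- wait_for(function() remDr$findElement("id", "jobDescriptionText"))
```

This works because `findElement()` raises an error when the element is not there yet, so the helper simply keeps trying until the description block appears.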

list_Text_Job_Description[[1]]
[1] "Job Description\nCiudad de México, México\nNivel de Estudios\nBachelor´s degree\nExperiencia Requerida\n3+\n3+ years office administration experience with senior level management and front desk experience.\nResumen\nThe receptionist will perform a variety of tasks to directly support daily activities for the Mexico City Beer division office and said front desk responsibilities, including but not limited to answering the main telephone lines, routing calls, greeting visitors, ordering supplies, managing couriers and cross training with the Facilities Supervisor to assist when needed on other office related needs and asks of the Facilities team.\nHabilidades\nTelephone Skills\nAssertive Communication & Listening skill.\nCustomer Service Attitude.\nExcellent service and positive attitude.\nWillingness to help everyone.\nResilience & Professionalism.\nStrong verbal and written communication.\nStrong analytical and problem-solving skills.\nLearning agility.\nOrganized.\nResponsabilidades\n1.Responsible for the Front Desk (manages 2.5 floors for + 200 employees) from 8am to 5pm, (with a 1-hour lunch) Monday to Thursday, and 8am to 1:00pm on Friday. Activities to include routing incoming calls, greeting visitors, sending/receiving couriers, and parcel carrier packages, overseeing vendor access and issuing parking ticket vouchers as requested internally or externally.\n2.Greet visitors, track and manage visitors through logbooks or electronic system, notify employees of visitor arrivals, provide visitors with a positive experience (i.e. coffee, water, take coats, etc.) and work with building security on visitor access as applicable\n3.Assist Security in the administration of the access cards for employees in Mexico City office, to include printing and deliver of cards to employees, maintenance of an active card inventory summary, ensuring compliance with corporate Security policy.\n4.Issue new hire welcoming e- mail to include guidance about local Product Allowance program, stationary ordering, parking regulation, etc. Responsibilities to include maintenance and edits to electronic guide as directed by Human Resources or Facilities.\n5.Assist local administrative assistants, as needed, with on-site meeting conference room reservations and scheduling to ensure rooms are ready for meetings (i.e. proper number of chairs, clean/ready to use room, etc.). Work with local IT team to ensure AV is functioning and ready for meeting use.\n6.Responsible for inventory and ordering of office supplies and bar products.\n7.Place vendor service calls and issue Ariba purchase orders as required for all maintenance and service required for equipment, goods and other maintenance services, as directed by the Facilities Supervisor.\n8.Assist in the delivery of employee product allowance orders, business cards, or other seasonal employee gifts.\n9.Completes a variety of responsibilities, administrative duties and special projects as assigned by Facilities Management.\nLocation\nMexico City\nAdditional Locations\nJob Type\nFull time\nJob Area\nOperations and Production\nEqual Opportunity\nConstellation Brands is committed to a continuing program of equal employment opportunity. All persons have equal employment opportunities with Constellation Brands, regardless of their sex, race, color, age, religion, creed, sexual orientation, national origin or citizenship, ancestry, physical or mental disability, medical condition (cancer or genetic characteristics), marital status, gender (including gender identity or gender expression), familial status, military or veteran status, genetic information, pregnancy, childbirth, breastfeeding, or related conditions (or any other group or category within the framework of the applicable discrimination laws and regulations)."
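
To avoid the "differing number of rows" error from the question, each URL has to produce exactly one value, with NA standing in for pages where nothing was found. A minimal sketch follows, assuming a hypothetical helper `fetch_or_na()` (not part of ralger or RSelenium) that wraps any per-URL fetch function:

```r
# Hypothetical helper: apply `fetch` to every URL and always return exactly
# one string per URL. A failed fetch or an empty result becomes NA, and
# several matches on one page are joined with newlines.
fetch_or_na <- function(urls, fetch) {
  vapply(urls, function(u) {
    out <- tryCatch(fetch(u), error = function(e) character(0))
    out <- out[!is.na(out)]
    if (length(out) == 0) NA_character_ else paste(out, collapse = "\n")
  }, character(1), USE.NAMES = FALSE)
}

# Usage sketch with ralger (assumes `job_url` from the question exists):
# job_description <- fetch_or_na(job_url, function(u)
#   ralger::scrap(link = u, node = ".jobsearch-jobDescriptionText"))
# df <- data.frame(job_description, job_url)
```

Because the result always has the same length as the input, `data.frame(job_description, job_url)` no longer fails, and the rows with missing descriptions can be found with `is.na(df$job_description)`.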