I am using RSelenium. The code below is a JavaScript command. I call it inside a loop; it runs fine on the first iteration, but on the second iteration I run into a problem.
Here is my code:
remDr$executeScript("window.setInterval(function() {window.scrollBy(0, 300);}, 100)", args = list())
The error I get is:
Error in out[[wInd]] : recursive indexing failed at level 3
Here is a reproducible example:
library(RSelenium)  # needed for remoteDriver()
remDr <- remoteDriver(browserName = "chrome", nativeEvents = FALSE)
remDr$open()
url_site <-'https://www.aliexpress.com/category/1909/digital-camera.html?site=glo&pvId=351-350381&attrRel=or&isrefine=y'
remDr$navigate(url_site) # navigates to webpage
remDr$setImplicitWaitTimeout(10000000)
remDr$setTimeout(type = "page load", milliseconds = 10000000)
Sys.sleep(6)
# remDr$executeScript("document.getElementById('alibaba-login-box').getElementById('fm-login-id').value = 'tokenspy@gmail.com';alert();", args = list())
master <- data.frame()
n <- 3 # number of pages to scrape (80 pages in total; only 3 here for this example)
for(i in 1:n) {
  start <- i
  if (i == 1) {
    i <- ''
  }
  url_site <- sprintf('https://www.aliexpress.com/category/1909/digital-camera/%s.html?isrefine=y&site=glo&pvId=351-350381&tag=', i)
  cat('display results:', start, '-', start + 29, 'in page', start, 'now\n', url_site, '\n')
  site <- url_site
  # Sys.sleep(5)
  remDr$navigate(site)
  remDr$setImplicitWaitTimeout(10000000)
  remDr$setTimeout(type = "page load", milliseconds = 10000000)
  # Sys.sleep(5)
  remDr$executeScript("my_interval = window.setInterval(function() {window.scrollBy(0, 300);}, 100);return;", args = list())
  # Sys.sleep(5)
  cat('a')
  NamewebElems <- remDr$findElements(using = 'css selector', ".detail h3 a")
  remDr$executeScript("clearInterval(my_interval);", args = list())
}
I noticed that if I remove this line
NamewebElems <- remDr$findElements(using = 'css selector', ".detail h3 a")
the problem goes away and the iterations run fine. But I need this line, and with it in place the error pops up on the second pass of the loop and the script stops.
I found a solution. It is clearly not the best one, but I tested it and it works: wrap every JavaScript injection in try(). The error still pops up, but this prevents the iteration from stopping, and the script is still executed successfully. So your code should look like this:
try(remDr$executeScript("window.setInterval(function() {window.scrollBy(0, 300);}, 100)", args = list()))
This works and is tested:
library(RSelenium)  # needed for remoteDriver()
remDr <- remoteDriver(browserName = "chrome", nativeEvents = FALSE)
remDr$open()
url_site <-'https://www.aliexpress.com/category/1909/digital-camera.html?site=glo&pvId=351-350381&attrRel=or&isrefine=y'
remDr$navigate(url_site) # navigates to webpage
remDr$setImplicitWaitTimeout(10000000)
remDr$setTimeout(type = "page load", milliseconds = 10000000)
Sys.sleep(6)
# remDr$executeScript("document.getElementById('alibaba-login-box').getElementById('fm-login-id').value = 'tokenspy@gmail.com';alert();", args = list())
master <- data.frame()
n <- 3 # number of pages to scrape (80 pages in total; only 3 here for this example)
for(i in 1:n) {
  start <- i
  if (i == 1) {
    i <- ''
  }
  url_site <- sprintf('https://www.aliexpress.com/category/1909/digital-camera/%s.html?isrefine=y&site=glo&pvId=351-350381&tag=', i)
  cat('display results:', start, '-', start + 29, 'in page', start, 'now\n', url_site, '\n')
  site <- url_site
  # Sys.sleep(5)
  remDr$navigate(site)
  remDr$setImplicitWaitTimeout(10000000)
  remDr$setTimeout(type = "page load", milliseconds = 10000000)
  # Sys.sleep(5)
  try(remDr$executeScript("my_interval = window.setInterval(function() {window.scrollBy(0, 300);}, 100);return;", args = list()))
  # Sys.sleep(5)
  cat('a')
  NamewebElems <- remDr$findElements(using = 'css selector', ".detail h3 a")
  try(remDr$executeScript("clearInterval(my_interval);", args = list()))
}
You can use tryCatch() to handle the error in a more appropriate way. Obviously the best solution would be to prevent the error in the first place, but if your goal is just to get the script working, this answer may help.
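The tryCatch() variant can be sketched as follows. This is a minimal, self-contained sketch: run_script_safely is a hypothetical helper name, and the stop() call merely simulates the error RSelenium raises (no browser is needed to run it).

```r
# Minimal sketch of the tryCatch() approach (assumption: any error from the
# injection should be logged and skipped, not abort the loop).
run_script_safely <- function(call_fn) {
  tryCatch(
    call_fn(),
    error = function(e) {
      # Report the failure and return NULL so the loop keeps going.
      message("executeScript failed: ", conditionMessage(e))
      NULL
    }
  )
}

# In the real loop this would wrap the injection, e.g.:
# run_script_safely(function() remDr$executeScript("clearInterval(my_interval);", args = list()))

# Simulated failure: the error is reported but execution continues.
res <- run_script_safely(function() stop("recursive indexing failed at level 3"))
is.null(res)  # TRUE
```

Unlike a bare try(), the error handler gives you a place to log the message or fall back to other behaviour before the next iteration starts.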