selenium.common.exceptions.WebDriverException: Message: TypeError: p



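I am trying to develop a web scraping tool. I have a Python script and some JavaScript code. The Python script calls the JavaScript code, which extracts the relevant content from a web page and returns it to the Python script. The JavaScript code works correctly when I run it manually in the browser. Here is my JS code: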

var doc = ""
var path1 = document.getElementsByClassName("entry-header")[0]
doc = doc + path1.innerText
doc = doc + "\n"
var path2 = document.getElementsByClassName("entry-content")[0]
var cont = path2.getElementsByTagName("p")
for (var i=0; i<cont.length; i++)
{
   doc = doc+cont[i].innerText
   doc = doc+ "\n"
}
res()
function res()
{
  return doc
}

Here is my Python code:

from selenium import webdriver
js = open("generalized.js", "r").read()
driver = webdriver.Firefox()
browser = webdriver.Firefox()
browser.get("http://www.geeksforgeeks.org/branch-and-bound-set-1-introduction-with-01-knapsack/")
result = driver.execute_script(js)
print result

However, when the script is called from Python, I get the following error:

Traceback (most recent call last):
File "sample.py", line 7, in <module>
result = driver.execute_script(js)
File "/home/sagar/anaconda2/lib/python2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 543, in execute_script
'args': converted_args})['value']
File "/home/sagar/anaconda2/lib/python2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 308, in execute
self.error_handler.check_response(response)
File "/home/sagar/anaconda2/lib/python2.7/site-packages/selenium/webdriver/remote/errorhandler.py", line 194, in check_response
raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.WebDriverException: Message: TypeError: p[0] is undefined

Please help me solve this problem. Or is there another way to do web scraping?

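For some reason you are starting two browsers: the page is loaded in one of them (browser), but the script is executed in the other (driver), which still has a blank page open. This works for me: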

from selenium import webdriver
import time
js = open("generalized.js", "r").read()
browser = webdriver.Firefox()
browser.get("http://www.geeksforgeeks.org/branch-and-bound-set-1-introduction-with-01-knapsack/")
time.sleep(1)  # try to replace with an Explicit Wait
result = browser.execute_script(js)
print(result)
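
The comment in the code above suggests replacing the fixed sleep with an explicit wait. Selenium provides WebDriverWait in selenium.webdriver.support.ui for this. The sketch below hand-rolls the same polling idea so that it relies only on execute_script. The helper name wait_for_class and its timeout and poll parameters are hypothetical, not part of Selenium.

```python
import time


def wait_for_class(browser, class_name, timeout=10.0, poll=0.5):
    """Block until the page has at least one element with class_name.

    Hypothetical helper: `browser` is any object that offers Selenium's
    execute_script(script, *args). Raises TimeoutError if no matching
    element shows up within `timeout` seconds.
    """
    # Extra Python arguments are exposed to the script as arguments[0], ...
    script = "return document.getElementsByClassName(arguments[0]).length;"
    deadline = time.monotonic() + timeout
    while True:
        if browser.execute_script(script, class_name) > 0:
            return
        if time.monotonic() >= deadline:
            raise TimeoutError(
                "no element with class %r after %.1f s" % (class_name, timeout)
            )
        time.sleep(poll)
```

In the script above, a call such as wait_for_class(browser, "entry-content") would take the place of time.sleep(1).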

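The modified script, with a top-level return doc (execute_script runs the script as the body of a function, so the value has to be returned at the top level):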

var doc = "";
var path1 = document.getElementsByClassName("entry-header")[0];
doc = doc + path1.innerText;
doc = doc + "\n";
var path2 = document.getElementsByClassName("entry-content")[0];
var cont = path2.getElementsByTagName("p");
for (var i=0; i<cont.length; i++)
{
   doc = doc+cont[i].innerText;
   doc = doc + "\n";
}
return doc;
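
The script above still throws inside the browser if either class is missing, for example when it runs against a blank page as in the question. Below is a minimal sketch of a more defensive variant, assuming the same entry-header and entry-content class names. The script returns null when an element is missing and Python raises a readable error instead. The names EXTRACT_JS and extract_article are hypothetical.

```python
# Hypothetical inline version of generalized.js. execute_script runs this
# text as a function body, so the top-level `return` statements are valid.
EXTRACT_JS = """
var header = document.getElementsByClassName("entry-header")[0];
var content = document.getElementsByClassName("entry-content")[0];
if (!header || !content) { return null; }
var parts = [header.innerText];
var paragraphs = content.getElementsByTagName("p");
for (var i = 0; i < paragraphs.length; i++) {
    parts.push(paragraphs[i].innerText);
}
return parts;
"""


def extract_article(browser):
    """Return the header and paragraph texts joined by newlines.

    `browser` is any object with Selenium's execute_script(script, *args).
    A JavaScript array comes back as a Python list, null as None.
    """
    parts = browser.execute_script(EXTRACT_JS)
    if parts is None:
        raise RuntimeError(
            "entry-header or entry-content not found; "
            "was the page loaded in this browser instance?"
        )
    return "\n".join(parts)
```

Joining the pieces in Python keeps the newline handling out of the JavaScript string, which avoids escaping mistakes when the script is embedded in Python source.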
