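I want to parse scraped prices, but I can't find a working way to read an element's innerHTML.
I don't know why, but none of these work: Selenium (getAttribute("innerHTML")), PhantomJS (page.evaluate with function() { return document.elementtoparse.innerHTML }), and scrapy-splash (loading the page with WebPageEngine and parsing the HTML). The result is always empty: [], null, or a WebElement.
I am testing my code on Banggood product and landing pages, but the result is always the same.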
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
driver = webdriver.Firefox()
driver.get("https://www.banggood.com/BlitzWolf-Ampcore-Turbo-TC10-3A-Durable-USB-Type-C-Charging-Data-Cable-p-1188424.html?rmmds=category&cur_warehouse=CN") #random url
try:
    element = WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.CLASS_NAME, "item_now_price"))
    )
finally:
    driver.quit()
print(element)
and the output:
<selenium.webdriver.firefox.webelement.FirefoxWebElement (session="b0593791-138b-4177-a8f3-e7983143824a", element="d08f4717-d3f1-4594-8f2b-1bf943deb9f9")>
when what I need is:
6.59 (or US$6.59)
I also tried
price = driver.find_element_by_class_name('item_now_price').getAttribute("innerHTML")
and
var page = require('webpage').create();
page.open('https://www.banggood.com/BlitzWolf-Ampcore-Turbo-TC10-3A-Durable-USB-Type-C-Charging-Data-Cable-p-1188424.html?rmmds=category&cur_warehouse=CN', function(status) {
    var price = page.evaluate(function() {
        return document.getElementByClassName('item_now_price').innerHTML;
    });
    console.log('price is ' + price);
    phantom.exit();
});
but the result is null, and when I add page.includeJs(/url/to/js) the terminal stops responding.
Once you have located the element in Selenium, you can read its text with .text.
Here is a slight tweak of your first example:
try:
    element = WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.CLASS_NAME, "item_now_price"))
    )
    # Read the text while the browser session is still open.
    print(element.text)
finally:
    driver.quit()
See if that gives you the result you are looking for.
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
driver = webdriver.Chrome()
driver.get("https://www.banggood.com/BlitzWolf-Ampcore-Turbo-TC10-3A-Durable-USB-Type-C-Charging-Data-Cable-p-1188424.html?rmmds=category&cur_warehouse=CN") #random url
try:
    # .text is read before the driver quits, so `element` holds a plain string.
    element = WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.CLASS_NAME, "item_now_price"))
    ).text
finally:
    driver.quit()
print(element)
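If you want the bare number (6.59) rather than the displayed string (US$6.59), you can pull it out of the text afterwards. This is only a sketch: parse_price is a hypothetical helper, and it assumes the price uses a dot as the decimal separator with optional comma thousands separators.

```python
import re
from decimal import Decimal

# First run of digits, allowing comma thousands separators and an
# optional fractional part after a dot (assumed price format).
_NUMBER = re.compile(r"\d[\d,]*(?:\.\d+)?")


def parse_price(text):
    """Extract the first number from a price string such as 'US$6.59'."""
    match = _NUMBER.search(text)
    if match is None:
        raise ValueError(f"no price found in {text!r}")
    # Drop thousands separators before converting; Decimal avoids
    # floating-point rounding on currency values.
    return Decimal(match.group().replace(",", ""))
```

For example, parse_price(element) could be called on the string printed by the snippet above.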