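I'm trying to write a scraper with PhantomJS, but even the example from the morph.io documentation doesn't work. I suspect the problem is "https": when I test with an http URL, it works. Can you suggest a solution? I also tested with Firefox, and it works there.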
from splinter import Browser

with Browser("phantomjs") as browser:
    # Optional, but make sure large enough that responsive pages don't
    # hide elements on you...
    browser.driver.set_window_size(1280, 1024)

    # Open the page you want...
    browser.visit("https://morph.io")

    # submit the search form...
    browser.fill("q", "parliament")
    button = browser.find_by_css("button[type='submit']")
    button.click()

    # Scrape the data you like...
    links = browser.find_by_css(".search-results .list-group-item")
    for link in links:
        print(link['href'])
Is PhantomJS unable to handle https URLs?
Splinter uses the Selenium WebDriver bindings for Python under the hood, so you can simply pass the necessary options through like this:
with Browser("phantomjs", service_args=['--ignore-ssl-errors=true', '--ssl-protocol=any']) as browser:
    ...
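See "PhantomJS failing to open HTTPS site" for why these options may be necessary. Have a look at the PhantomJS command-line interface documentation for more options.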
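For completeness, here is a minimal sketch that combines the two flags with the script from the question. The helper names `phantomjs_service_args` and `search_morph` are hypothetical and only used for illustration. It assumes a Splinter and Selenium version that still ships the PhantomJS driver and forwards `service_args` to it, and that the `phantomjs` binary is on your PATH.

```python
def phantomjs_service_args(ignore_ssl_errors=True, ssl_protocol="any"):
    """Build the PhantomJS command-line flags passed via service_args.

    Hypothetical helper: it only formats the two flags discussed above.
    """
    return [
        # Ignore SSL errors such as expired or self-signed certificates.
        "--ignore-ssl-errors={}".format("true" if ignore_ssl_errors else "false"),
        # Let PhantomJS use any supported SSL/TLS protocol version.
        "--ssl-protocol={}".format(ssl_protocol),
    ]


def search_morph(query):
    """Run the search from the question and return the result links' hrefs."""
    # Imported here so the flag helper works even without Splinter installed.
    from splinter import Browser

    with Browser("phantomjs", service_args=phantomjs_service_args()) as browser:
        browser.driver.set_window_size(1280, 1024)
        browser.visit("https://morph.io")

        # Submit the search form, as in the original script.
        browser.fill("q", query)
        button = browser.find_by_css("button[type='submit']")
        button.click()

        links = browser.find_by_css(".search-results .list-group-item")
        return [link["href"] for link in links]
```

Calling `search_morph("parliament")` and printing each returned value should then mirror the original script. If the page still fails to load, trying a specific protocol such as `phantomjs_service_args(ssl_protocol="tlsv1")` may be worth a shot.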