Selenium headless Chrome runs much slower



I have a Selenium parser class:

from selenium.webdriver import Chrome
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities


# Parser is my own base class, defined elsewhere.
class DynamicParser(Parser):
    """Selenium Parser with processing JS"""
    driver: Chrome = None

    def __init__(self, driver_path='./chromedriver', headless=True):
        chrome_options = Options()
        if headless:
            chrome_options.add_argument("--headless")
            chrome_options.add_argument("window-size=1920,1080")
        # bypass OS security
        chrome_options.add_argument('--no-sandbox')
        # overcome limited resources
        chrome_options.add_argument('--disable-dev-shm-usage')
        # don't tell chrome that it is automated
        chrome_options.add_experimental_option(
            "excludeSwitches", ["enable-automation"])
        chrome_options.add_experimental_option('useAutomationExtension', False)
        # disable images
        prefs = {"profile.managed_default_content_settings.images": 2}
        chrome_options.add_experimental_option("prefs", prefs)
        # Setting Capabilities
        capabilities = DesiredCapabilities.CHROME.copy()
        capabilities['acceptSslCerts'] = True
        capabilities['acceptInsecureCerts'] = True
        self.driver = Chrome(chrome_options=chrome_options,
                             executable_path=driver_path,
                             desired_capabilities=capabilities)

    def goto(self, url: str):
        """Goes to specified URL"""
        self.driver.get(url)

    def get_seller_name(self) -> str:
        """Returns seller's name"""
        offer_actions_tag = self.driver.find_element_by_class_name(
            'offer-user__actions')
        profile_link_tag = offer_actions_tag.find_element_by_tag_name('a')
        return profile_link_tag.text.strip()

I also have a test script that creates a DynamicParser, goes to a page and calls .get_seller_name().

I noticed that Chromedriver runs much slower in headless mode, so I measured it with time python3 test.py.

Headless Chrome output:

python3 test.py  2,98s user 0,94s system 3% cpu 2:04,65 total

Non-headless Chrome output:

python3 test.py  1,48s user 0,33s system 47% cpu 3,790 total

As we can see, headless Chrome takes almost 33 times longer in wall-clock time (2:04.65 versus 3.79 seconds)!
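To double-check the "almost 33 times" figure, here is a small sketch that converts the two `total` values printed by `time` into seconds and divides them. The helper name `parse_elapsed` is made up for this illustration, and it assumes the shell prints a decimal comma, as in the output above.

```python
def parse_elapsed(text: str) -> float:
    """Convert a `time` total such as '2:04,65' or '3,790' to seconds."""
    seconds = 0.0
    # Normalise the decimal comma, then fold hours/minutes/seconds left to right.
    for part in text.replace(",", ".").split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


headless_total = parse_elapsed("2:04,65")  # wall-clock time of the headless run
headed_total = parse_elapsed("3,790")      # wall-clock time of the non-headless run
ratio = headless_total / headed_total
print(f"{headless_total:.2f}s vs {headed_total:.2f}s, ratio {ratio:.1f}x")
```

Note that user and system CPU time only roughly doubled, so most of the extra two minutes is spent waiting rather than computing.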

Chrome version: 83.0.4103.116

Chromedriver version: 83.0.4103.39

I really don't understand where the problem is. When I developed my previous application, headless Chrome ran fast enough.

I just found the problem. It was this line:

chrome_options.add_argument('--disable-dev-shm-usage')

I thought this flag would lift Chrome's resource limits, but it definitely does not work that way in this case.
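If it is not obvious which flag is responsible, one way to narrow it down is to time the same run with each suspect flag left out in turn. The following is only a sketch of that idea: `time_without_each_flag` and the `run` callback are hypothetical names, and `run` stands for whatever starts Chrome with the given flags, loads the page and quits. A stand-in callback is used here so the snippet executes without Selenium.

```python
import time
from typing import Callable, Dict, List


def time_without_each_flag(
    required: List[str],
    candidates: List[str],
    run: Callable[[List[str]], None],
) -> Dict[str, float]:
    """Time `run` once per candidate flag, with that one flag left out."""
    timings: Dict[str, float] = {}
    for dropped in candidates:
        flags = required + [flag for flag in candidates if flag != dropped]
        start = time.perf_counter()
        run(flags)  # hypothetical: start Chrome with `flags`, load the page, quit
        timings[dropped] = time.perf_counter() - start
    return timings


# Stand-in for a real browser run: it only records the flag lists it receives.
seen: List[List[str]] = []
timings = time_without_each_flag(
    required=["--headless"],
    candidates=["--no-sandbox", "--disable-dev-shm-usage"],
    run=seen.append,
)
for dropped, seconds in timings.items():
    print(f"without {dropped}: {seconds:.3f}s")
```

With a real `run`, the entry whose time drops sharply points at the flag causing the slowdown, which in my case would be `--disable-dev-shm-usage`.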

When running the driver headless, you can also use these settings to improve performance:

browser_options = webdriver.ChromeOptions()
browser_options.headless = True
image_preferences = {"profile.managed_default_content_settings.images": 2}
browser_options.add_experimental_option("prefs", image_preferences)

None of these worked for me.

However, changing options.add_argument('--headless') to options.add_argument('--headless=new') made a huge difference and seems to have solved the problem completely.
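As a rough sketch of how this switch could be made conditional: `--headless=new` only exists in newer Chrome builds, so older ones (such as the 83.x from the question) would need the plain flag. The cutoff of major version 109 is my assumption from memory rather than something confirmed in this thread, and `headless_argument` and `RecordingOptions` are names invented for the example. `RecordingOptions` merely imitates the `add_argument` method of Selenium's options object so the snippet runs without a browser.

```python
from typing import List

# Assumption: first Chrome major version that accepts --headless=new.
NEW_HEADLESS_MIN_MAJOR = 109


def headless_argument(chrome_version: str) -> str:
    """Pick the headless flag for a Chrome version string like '83.0.4103.116'."""
    major = int(chrome_version.split(".")[0])
    return "--headless=new" if major >= NEW_HEADLESS_MIN_MAJOR else "--headless"


class RecordingOptions:
    """Stand-in with the same add_argument method as Selenium's options object."""

    def __init__(self) -> None:
        self.arguments: List[str] = []

    def add_argument(self, argument: str) -> None:
        self.arguments.append(argument)


options = RecordingOptions()
options.add_argument(headless_argument("120.0.6099.109"))
print(options.arguments)
```

In real code the same `add_argument` call would be made on the `Options` or `webdriver.ChromeOptions()` instance used above.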
