Python: printing an XPath element gives an empty array



I am trying to get the XPath of an element on the website https://www.tradingview.com/symbols/btcusd/technicals/, specifically the result under the Summary speedometer, whether it is Buy or Sell.

[Image: the speedometer]

Using Google Chrome's XPath tool I get this result:

//*[@id="technicals-root"]/div/div/div[2]/div[2]/span[2]

To get this data in Python, I plugged it into:

from lxml import html
import requests
page = requests.get('https://www.tradingview.com/symbols/BTCUSD/technicals/')
tree = html.fromstring(page.content)
status = tree.xpath('//*[@id="technicals-root"]/div/div/div[2]/div[2]/span[2]/text()')

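When I print status, I get an empty array, yet nothing seems to be wrong with the XPath. I have read that Google Chrome does some shenanigans with badly written HTML tables that make it output a wrong XPath, but that does not seem to be the issue here.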

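When I run your code, the "technicals-root" div is empty. I assume JavaScript is filling it in. When you can't get a page statically, you can always turn to Selenium to run a browser and let it work everything out. You may have to adjust the driver path to make it work in your environment, but this works for me: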

import time
import contextlib
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

option = webdriver.ChromeOptions()
option.add_argument("--incognito")
# Selenium 4 API: the driver path is passed through Service and the
# options through options= (executable_path= and chrome_options= are gone)
with contextlib.closing(webdriver.Chrome(
        service=Service(
            executable_path='/usr/lib/chromium-browser/chromedriver'),
        options=option)) as browser:
    browser.get('https://www.tradingview.com/symbols/BTCUSD/technicals/')
    # wait until js has filled in the element - and a bit longer for js churn
    WebDriverWait(browser, 20).until(EC.visibility_of_element_located(
        (By.XPATH,
        '//*[@id="technicals-root"]/div/div/div[2]/div[2]/span')))
    time.sleep(1)
    # find_elements(By.XPATH, ...) replaces the removed find_elements_by_xpath
    status = browser.find_elements(
        By.XPATH,
        '//*[@id="technicals-root"]/div/div/div[2]/div[2]/span[2]')
    print(status[0].text)
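
If you want to confirm the empty-div diagnosis before reaching for a browser, one option is to look at the HTML that requests actually receives and check whether the container has anything in it. The following is only a rough sketch with lxml, the same library the question uses. `container_is_empty` is a hypothetical helper name, and the two sample strings are made-up stand-ins, not TradingView's real markup.

```python
from lxml import html


def container_is_empty(page_source, element_id):
    """Return True if the element with this id has no children and no text."""
    tree = html.fromstring(page_source)
    # lxml lets XPath variables be passed as keyword arguments
    matches = tree.xpath('//*[@id=$eid]', eid=element_id)
    if not matches:
        raise LookupError(f'no element with id {element_id!r}')
    node = matches[0]
    # len(node) counts child elements; node.text is the text before the first child
    return len(node) == 0 and not (node.text or '').strip()


# Made-up stand-ins (assumptions): what a static fetch might return,
# and what the page might look like after JavaScript has run.
STATIC_HTML = '<html><body><div id="technicals-root"></div></body></html>'
RENDERED_HTML = ('<html><body><div id="technicals-root">'
                 '<span>Summary</span><span>Buy</span></div></body></html>')

print(container_is_empty(STATIC_HTML, 'technicals-root'))    # True
print(container_is_empty(RENDERED_HTML, 'technicals-root'))  # False
```

With the code from the question, the equivalent call would be container_is_empty(page.content, 'technicals-root'). If that returns True, the content is presumably being injected by JavaScript, and no XPath will find it in the static response.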
