Selenium: getting text via XPath



I am trying to copy a text element from a web page and print it to the console, as a test for a future project.

This is the line where I get the error:

elem = driver.find_element_by_xpath("/html/body/text()[2]")
print(elem.text)

The error shows:

C:\Users\hp\Desktop\facebook-creator-studio-bot-master\get_cnp.py:12: DeprecationWarning: find_element_by_xpath is deprecated. Please use find_element(by=By.XPATH, value=xpath) instead
  driver.find_element_by_xpath("/html/body/form/input[2]").click()
C:\Users\hp\Desktop\facebook-creator-studio-bot-master\get_cnp.py:13: DeprecationWarning: find_element_by_xpath is deprecated. Please use find_element(by=By.XPATH, value=xpath) instead
  elem = driver.find_element_by_xpath("/html/body/text()[2]")
Traceback (most recent call last):
  File "C:\Users\hp\Desktop\facebook-creator-studio-bot-master\get_cnp.py", line 13, in <module>
    elem = driver.find_element_by_xpath("/html/body/text()[2]")
  File "C:\Users\hp\AppData\Local\Programs\Python\Python310\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 521, in find_element_by_xpath
    return self.find_element(by=By.XPATH, value=xpath)
  File "C:\Users\hp\AppData\Local\Programs\Python\Python310\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 1248, in find_element
    return self.execute(Command.FIND_ELEMENT, {
  File "C:\Users\hp\AppData\Local\Programs\Python\Python310\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 425, in execute
    self.error_handler.check_response(response)
  File "C:\Users\hp\AppData\Local\Programs\Python\Python310\lib\site-packages\selenium\webdriver\remote\errorhandler.py", line 247, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.InvalidSelectorException: Message: invalid selector: The result of the xpath expression "/html/body/text()[2]" is: [object Text]. It should be an element.
(Session info: chrome=99.0.4844.82)
Stacktrace:
Backtrace:
Ordinal0 [0x00769943+2595139]
Ordinal0 [0x006FC9F1+2148849]
Ordinal0 [0x005F4528+1066280]
Ordinal0 [0x005F6E04+1076740]
Ordinal0 [0x005F6CBE+1076414]
Ordinal0 [0x005F6F50+1077072]
Ordinal0 [0x00620D1E+1248542]
Ordinal0 [0x006211CB+1249739]
Ordinal0 [0x0064D812+1431570]
Ordinal0 [0x0063BA34+1358388]
Ordinal0 [0x0064BAF2+1424114]
Ordinal0 [0x0063B806+1357830]
Ordinal0 [0x00616086+1204358]
Ordinal0 [0x00616F96+1208214]
GetHandleVerifier [0x0090B232+1658114]
GetHandleVerifier [0x009C312C+2411516]
GetHandleVerifier [0x007FF261+560433]
GetHandleVerifier [0x007FE366+556598]
Ordinal0 [0x0070286B+2173035]
Ordinal0 [0x007075F8+2192888]
Ordinal0 [0x007076E5+2193125]
Ordinal0 [0x007111FC+2232828]
BaseThreadInitThunk [0x76CA6359+25]
RtlGetAppContainerNamedObjectPath [0x77827C24+228]
RtlGetAppContainerNamedObjectPath [0x77827BF4+180]

After that, my ChromeDriver closes. What could the problem be?

This error message...

selenium.common.exceptions.InvalidSelectorException: Message: invalid selector: The result of the xpath expression "/html/body/text()[2]" is: [object Text]. It should be an element.

...means that the locator strategy you used is an invalid selector, because

driver.find_element_by_xpath("/html/body/text()[2]")

would return the second matching text node, whereas Selenium only supports elements.
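
If the text node itself is what you need, one possible workaround (a sketch that is not part of the original answer) is to let the browser evaluate the XPath through JavaScript and return a plain string. `text_from_xpath` is a hypothetical helper name, and `driver` is assumed to be a live Selenium WebDriver; `execute_script` and the DOM's `document.evaluate` are the only APIs it relies on.

```python
# Sketch: read a text node through JavaScript, because find_element()
# can only return elements, never text nodes.
XPATH_STRING_JS = (
    "return document.evaluate("
    "arguments[0], document, null, XPathResult.STRING_TYPE, null"
    ").stringValue;"
)


def text_from_xpath(driver, xpath):
    """Evaluate `xpath` in the browser and return its string value, stripped.

    `driver` is assumed to be a live Selenium WebDriver (hypothetical helper,
    not part of Selenium). With STRING_TYPE, the browser returns the string
    value of the first node the expression selects, or "" if none matches.
    """
    value = driver.execute_script(XPATH_STRING_JS, xpath)
    return (value or "").strip()


# Possible usage with the XPath from the question:
# print(text_from_xpath(driver, "/html/body/text()[2]"))
```

Because the script returns a string rather than a node, Selenium has nothing to convert into a WebElement, so the "It should be an element" check does not apply.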


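This use case

If your use case is to retrieve the text from an element, you need to locate that element uniquely within the DOM tree and then read either its innerHTML through get_attribute() or its text attribute, as follows:

  • Using css_selector and get_attribute("innerHTML"):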

    print(driver.find_element(By.CSS_SELECTOR, "element_cssSelector").get_attribute("innerHTML"))
    
  • Using xpath and the text attribute:

    print(driver.find_element(By.XPATH, "element_xpath").text)
    

tl;dr

Difference between text and innerHTML using Selenium

As the error trace clearly states, the problem here is

invalid selector: The result of the xpath expression "/html/body/text()[2]" is: [object Text]. It should be an element.

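With print(elem.text) you are reading the .text property of the elem web element object, so driver.find_element_by_xpath() needs a locator that matches a web element on the page, and "/html/body/text()[2]" is not a valid XPath locator for a web element.
For example, "/html/body" may be a valid locator for a web element, while /text() refers to the text content that belongs to some web element, not to a web element object.
UPD
Here you can get the web element, read its text, and then extract the part you need from it, as follows: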

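# Requires: from selenium.webdriver.common.by import By
elem = driver.find_element(By.XPATH, "//body")  # the <body> element itself
print(elem.text)  # all visible text inside <body>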

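This gives you several text strings, not just the generated code. Unfortunately, Selenium's locators cannot target it more precisely, because the text you are looking for sits directly inside the body element itself.
You can split the received text and extract the code from it.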
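
The splitting step could look roughly like the sketch below. The page layout is not shown in the question, so the sample text and the pattern for the code are pure assumptions, and `non_empty_lines` and `first_match` are hypothetical helper names.

```python
import re


def non_empty_lines(body_text):
    """Split text such as `elem.text` into stripped, non-empty lines."""
    return [line.strip() for line in body_text.splitlines() if line.strip()]


def first_match(body_text, pattern):
    """Return the first line that fully matches `pattern`, or None."""
    for line in non_empty_lines(body_text):
        if re.fullmatch(pattern, line):
            return line
    return None


# Hypothetical page text; replace it with elem.text from the real page.
sample = "Generate code\n\n  AB12CD34  \nSome footer text"
print(non_empty_lines(sample))
# Assumed format for the code: eight uppercase letters or digits.
print(first_match(sample, r"[A-Z0-9]{8}"))
```

Matching on a pattern is usually less fragile than picking a line by position, since the number of lines in the body text can change when the page does.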
