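How can I use Splinter to get the text of the first part, the underline, and the last part of a question and store each in a variable?
See the HTML at the bottom. I want the following variables to have the following values: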
first_part = "Jingle bells, jingle bells, jingle all the"
second_part = "_______"
third_part = "! Oh what fun it is to ride in one-horse open sleigh!"
I went here and came up with these XPaths:
//*[@id="question_container"]/div[1]/span/text()[1] #this is first_part
//*[@id="question_container"]/div[1]/span/span #this is second_part
//*[@id="question_container"]/div[1]/span/text()[2] #this is third_part
and applied them to the HTML below. They returned the desired values in the tester, but in my program Splinter seems to reject them:
first_part = browser.find_by_xpath(xpath = '//*[@id="question_container"]/div[1]/span/text()[1]').text
second_part = browser.find_by_xpath(xpath = '//*[@id="question_container"]/div[1]/span/span').text
third_part = browser.find_by_xpath(xpath = '//*[@id="question_container"]/div[1]/span/text()[2]').text
print first_part
print second_part
print third_part
-------------- OUTPUT -------------
[]
[]
[]
What am I doing wrong, why is it wrong, and how should I change my code?
The referenced HTML (slightly edited to "Jingle Bells" to better convey the problem) was retrieved using Splinter's browser.html feature:
<div id="question_container" style="display: block;">
<div class="question_wrap">
<span class="question">Jingle bells, jingle bells, jingle all the
<span class="underline" style="display: none;">_______</span>
<input type="text" name="vocab_answer" class="answer" id="vocab_answer"></input>
! Oh what fun it is to ride in one-horse open sleigh!</span>
</div></div>
The XPath passed to the find_by_xpath() method must point to (result in) an element, not a text node.
One option is to find the outer span, get its html, and feed it to lxml.html:
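from lxml.html import fromstring
# Locate the outer span as an element; Splinter can return this, unlike a bare text node
element = browser.find_by_xpath(xpath='//div[@id="question_container"]//span[@class="question"]')
# Re-parse the span's HTML with lxml, whose xpath() can select text nodes
root = fromstring(element.html)
first_part = root.xpath('./text()[1]')[0]  # text before the underline span
second_part = root.xpath('./span/text()')[0]  # text inside the underline span
third_part = root.xpath('./text()[last()]')[0]  # text after the input element
print first_part, second_part, third_part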
Prints:
Jingle bells, jingle bells, jingle all the
_______
! Oh what fun it is to ride in one-horse open sleigh!
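To see why the three pieces are not all elements, here is a minimal sketch that needs neither a browser nor lxml. It is a different technique from the lxml.html approach above: it uses the standard library's xml.etree.ElementTree and assumes the fragment is well-formed XML, which happens to be true for the span in the question but not for HTML in general. The helper name split_question and the FRAGMENT constant are made up for this illustration. In this tree model the leading text is stored as the outer span's .text and the trailing text as the .tail of the input element; lxml elements expose the same .text and .tail attributes.

```python
import xml.etree.ElementTree as ET

# The outer span copied from the question. Assumption: it parses as XML.
FRAGMENT = """<span class="question">Jingle bells, jingle bells, jingle all the
<span class="underline" style="display: none;">_______</span>
<input type="text" name="vocab_answer" class="answer" id="vocab_answer"></input>
! Oh what fun it is to ride in one-horse open sleigh!</span>"""


def split_question(markup):
    """Return (first_part, second_part, third_part) of the question span.

    Hypothetical helper for illustration; not part of Splinter or lxml.
    """
    root = ET.fromstring(markup)
    underline = root.find("span")    # the inner <span class="underline">
    answer_box = root.find("input")  # the <input> that follows it

    # Text before the first child element lives in root.text.
    first_part = (root.text or "").strip()
    # Text inside the underline span lives in its own .text.
    second_part = (underline.text or "").strip()
    # Text after the <input> element lives in that element's .tail.
    third_part = (answer_box.tail or "").strip()
    return first_part, second_part, third_part


first_part, second_part, third_part = split_question(FRAGMENT)
print(first_part)
print(second_part)
print(third_part)
```

The strip() calls drop the newlines that surround each text node in the markup, so the values match the ones asked for in the question. For real pages that are not valid XML, a forgiving HTML parser such as lxml.html is the safer choice.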