
Scrapy XPath: extracting text after the element has been assigned to a variable

I have HTML like this:

<h1 id="1"><i>2</i>sample contents</h1>

I know that either of the following works perfectly to get only the text, without the HTML:

response.xpath('//*[@id="1"]/text()').get()  #  sample contents
response.xpath('//*[@id="1"]/text()').extract_first()  #  sample contents

But what if I first assign the selection to a variable and then want only the text, without the HTML?

For example:

header = response.xpath('//*[@id="1"]')
# the below will get text WITH html tags
header.get()
header.extract_first()

What I want is this: once the element has been assigned to header, how can I get only its text?

Thanks in advance for any suggestions and help.

Edit:

Testing Moein's answer, I somehow get back "\r\n \r\n " whitespace instead of

You can continue the XPath expression by calling xpath on the header variable:

header.xpath('./text()').get()
