Can lxml.xpath convert <td/> to ""?



I am using lxml to parse an HTML string, for example:

<tr>
<td>111</td>
<td>222</td>
<td>20201208</td>
<td></td>
<td>26</td>
<td>1431</td>
<td></td>
</tr>

The result of html.xpath is:

["111","222","20201208","26","1431"]

My question is: can I get a result like this instead, with an empty string for each empty cell?

["111","222","20201208","","26","1431",""]

Is there an option in lxml that does this?

I use the following code to get the elements:

tds=tr.xpath(".//td/text()")

Here is how to do it with ElementTree or lxml (the code is the same; only the import differs):

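import xml.etree.ElementTree as ET
from lxml import etree

xml = '''<tr>
<td>111</td>
<td>222</td>
<td>20201208</td>
<td></td>
<td>26</td>
<td>1431</td>
<td></td>
</tr>'''

# Standard library: select the <td> elements themselves (not their text nodes),
# so empty cells are kept; td.text is None for an empty cell and is mapped to ''.
root1 = ET.fromstring(xml)
data = [td.text if td.text else '' for td in root1.findall('.//td')]
print(data)

# lxml: the same calls work unchanged.
root2 = etree.fromstring(xml)
data = [td.text if td.text else '' for td in root2.findall('.//td')]
print(data)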

Output:

['111', '222', '20201208', '', '26', '1431', '']
['111', '222', '20201208', '', '26', '1431', '']
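
Why the original query loses the empty cells: `.//td/text()` selects text nodes, and an empty `<td>` has no text node, so nothing is returned for it. Selecting the `<td>` elements and reading each one's text keeps one entry per cell. One caveat with `td.text` is that it only holds the text before the first child element, so a cell such as `<td><b>222</b></td>` would come back as `''`. The sketch below is one possible way to handle that with `itertext()`, which exists in both ElementTree and lxml. The sample markup with nested `<b>` and `<i>` tags and the helper name `cell_texts` are assumptions made for illustration; they are not part of the question.

```python
# Use lxml when it is installed, otherwise fall back to the standard library.
# Both expose fromstring(), findall() and itertext() with the same behavior here.
try:
    from lxml import etree as ET
except ImportError:
    import xml.etree.ElementTree as ET

# Hypothetical sample: two cells contain nested markup, two are empty.
xml = '''<tr>
<td>111</td>
<td><b>222</b></td>
<td></td>
<td>14<i>31</i></td>
<td/>
</tr>'''


def cell_texts(row):
    """Return the full text of every <td> under row, '' for empty cells.

    itertext() yields the text of the cell and of all its descendants,
    so text wrapped in child tags is included.
    """
    return [''.join(td.itertext()) for td in row.findall('.//td')]


root = ET.fromstring(xml)

# td.text alone stops at the first child element.
first_text_only = [td.text or '' for td in root.findall('.//td')]
full_text = cell_texts(root)

print(first_text_only)
print(full_text)
```

With lxml specifically, evaluating the XPath `string()` function on each cell, as in `td.xpath('string()')`, is another way to get one string per `<td>`, including an empty string for an empty cell.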
