XPath lxml (wrong output) ['RM\u202f649']


<div class="product-info ncss-col-sm-12 full" xpath="1"><h1 class="headline-5 pb3-sm">Air More Uptempo</h1><h5 class="headline-1 pb3-sm">Black</h5><div class="headline-5 pb6-sm fs14-sm fs16-md">RM 649</div><div class="test-available-date"><div class="available-date-component">Available 20/11 at 10:00 am</div></div><div class="description-text text-color-grey"><p>More than perhaps any other silhouette, the Air More Uptempo encapsulates '90s basketball footwear at its finest. Big and bold, the design unapologetically represents a hybrid of style and innovation that made major waves upon its debut—and still turns heads over 20 years later. This OG colourway sees the style covered in neutral black with crisp highlights of contrasting white.</p></div></div>

The HTML source is shown above.

Code

price = tree.xpath("//div[contains(@class, 'product-info')]/div[1]/text()")
print(price)

Output

['RM\u202f649']

Your input XML contains an invisible character (\u202f, NARROW NO-BREAK SPACE) between RM and 649. That is why you get this output.
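To see why the character shows up as an escape, here is a minimal sketch that hard-codes the value from the question instead of parsing the page (the variable name `price` is reused from the question; everything else is illustrative). Printing a list uses `repr()` on each item, and `repr()` escapes non-printable characters such as U+202F, while printing the string itself renders it as a narrow space.

```python
import unicodedata

# Same value the XPath query in the question returned (hard-coded here).
price = ["RM\u202f649"]

# Printing the list shows the repr() of each item, so U+202F appears escaped.
print(price)

# Printing the string itself renders the character as a narrow space.
print(price[0])

# Look up the official Unicode name of the character between "RM" and "649".
print(unicodedata.name(price[0][2]))
```

So the XPath result is probably correct as extracted; the escape sequence is only how Python displays the list.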

Try the following XPath 2.0 expression:

//div[contains(@class, 'product-info')]/div[1]/text() cast as xs:token?
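Note that lxml's `xpath()` implements XPath 1.0, so the expression above needs an XPath 2.0 processor. As an alternative, the text can be cleaned on the Python side after the query. The sketch below is one possible approach, not the only one: `clean_text` is a hypothetical helper, and the input list is hard-coded to the value from the question. It relies on `str.split()` treating U+202F as whitespace.

```python
def clean_text(value: str) -> str:
    """Collapse every run of Unicode whitespace into one ordinary space."""
    # str.split() with no arguments splits on any Unicode whitespace,
    # which includes U+202F (NARROW NO-BREAK SPACE), and drops empty pieces.
    return " ".join(value.split())


# Stand-in for the result of tree.xpath(...) from the question.
raw_price = ["RM\u202f649"]

price = [clean_text(text) for text in raw_price]
print(price)  # ['RM 649']
```

With lxml, the same helper could be applied directly to the query result, for example `[clean_text(t) for t in tree.xpath("//div[contains(@class, 'product-info')]/div[1]/text()")]`.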
