My source HTML:
<!DOCTYPE html>
<html class="no-js" lang="en-GB" xmlns="http://www.w3.org/1999/xhtml" xml:lang="en-GB">
<head>
<meta name="generator" content="HTML Tidy for HTML5 for Linux version 5.2.0" />
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>Test</title>
</head>
<body>
<div class="lay-nav-primary">
<ul class="TabMenu">
<li>
<a href="http://example.com/">I am not wanted but am not removed.</a>
</li>
</ul>
</div>
<div class="lay-library--header">
I am not wanted and am removed.
</div>
<p>I am not wanted but am not removed.</p>
</body>
</html>
My XSLT stylesheet:
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes" omit-xml-declaration="yes"/>
<xsl:strip-space elements="*"/>
<!-- Identity transform -->
<xsl:template match="@* | node()">
<xsl:copy>
<xsl:apply-templates select="@* | node()"/>
</xsl:copy>
</xsl:template>
<!-- Remove unwanted elements -->
<!-- successfully removes node with the given class -->
<xsl:template match="//*[contains(concat(' ', normalize-space(@class), ' '), ' lay-library--header ')]"/>
<!-- fails to remove 'ul' child node of node with the given class -->
<xsl:template match="//*[contains(concat(' ', normalize-space(@class), ' '), ' lay-nav-primary ')]/ul"/>
<!-- fails to remove 'p' nodes -->
<xsl:template match="p | p/* | //p | //p/*"/>
<!-- fails to remove 'p' nodes -->
<xsl:template match="p | p/* | //p | //p/*" priority="9"/>
</xsl:stylesheet>
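I don't understand why the last three templates don't work as I expected when the first one does. Thanks.
Your HTML/XML is in the default namespace http://www.w3.org/1999/xhtml. In XPath 1.0 an unprefixed name such as ul or p matches only elements in no namespace, which is why those templates never fire. The first template works because * matches elements in any namespace. Bind the namespace to a prefix and use that prefix in your XPath expressions.
Also, there is no need to use // in template match patterns.
Example: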
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:x="http://www.w3.org/1999/xhtml">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes" omit-xml-declaration="yes"/>
<xsl:strip-space elements="*"/>
<!-- Identity transform -->
<xsl:template match="@* | node()">
<xsl:copy>
<xsl:apply-templates select="@* | node()"/>
</xsl:copy>
</xsl:template>
<!-- Remove unwanted elements -->
<!-- successfully removes node with the given class -->
<xsl:template match="*[contains(concat(' ', normalize-space(@class), ' '), ' lay-library--header ')]"/>
<!-- successfully removes 'x:ul' child node of node with the given class -->
<xsl:template match="*[contains(concat(' ', normalize-space(@class), ' '), ' lay-nav-primary ')]/x:ul"/>
<!-- successfully removes 'x:p' nodes -->
<xsl:template match="x:p"/>
</xsl:stylesheet>
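To see the namespace effect outside an XSLT processor, here is a minimal sketch using Python's standard xml.etree.ElementTree. It is only an analogy: ElementTree implements a limited XPath subset, not XSLT match patterns, and the trimmed-down document and the variable names are assumptions made for illustration. The "x" prefix mirrors the one bound in the stylesheet above.

```python
import xml.etree.ElementTree as ET

# Same namespace URI as the xmlns declaration on the <html> element.
XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical, shortened version of the source document.
doc = """<html xmlns="http://www.w3.org/1999/xhtml">
<body>
<div class="lay-nav-primary"><ul class="TabMenu"><li>item</li></ul></div>
<p>paragraph</p>
</body>
</html>"""

root = ET.fromstring(doc)

# An unqualified name matches only elements in no namespace: nothing is found.
unqualified = root.findall(".//p")

# With the namespace bound to a prefix, the same element is found.
qualified = root.findall(".//x:p", {"x": XHTML_NS})

print(len(unqualified), len(qualified))
```

The unqualified lookup finds no elements while the prefixed one finds the paragraph, which is the same reason match="p" never fires and match="x:p" does.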