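I want to remove unwanted content from some list items in HTML. Basically, I want to strip everything that precedes a given span (the one with class tab), but only if the content before that span meets certain criteria.
See the example below: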
<ol class="ast">
<li>*<span class="tab"><!--tab--></span>Some blabla <a href="#">with a link.</a></li>
<li>**<span class="tab"><!--tab--></span>Some other blabla, this one without other elements</li>
</ol>
What I would like to get is:
<ol class="ast">
<li>Some blabla <a href="#">with a link.</a></li>
<li>Some other blabla, this one without other elements</li>
</ol>
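Or, in words: if a list item starts with one or more asterisks followed by a tab span, keep only the content that comes after the span.
I have been looking around but could not find anything that fits my needs, so any suggestions are welcome!
The currently accepted solution is incorrect and produces wrong results in general. For example, when applied to the following XML file: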
<ol class="ast">
<li><a href="#">with a link.</a>*<span class="tab">Some blabla </span></li>
<li>Something else</li>
</ol>
it produces an incorrect result (the span and the text are wrongly deleted):
<?xml version="1.0" encoding="UTF-8"?><ol class="ast">
<li><a href="#">with a link.</a></li>
<li>Something else</li>
</ol>
Here is a correct solution:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output omit-xml-declaration="yes" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- Identity rule: copy every node and attribute unchanged -->
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>

  <!-- Drop a leading asterisks-only text node that is immediately followed by the tab span -->
  <xsl:template match=
    "li/node()[1]
        [self::text() and not(translate(.,'*',''))
         and following-sibling::node()[1][self::span[@class='tab']]
        ]"/>

  <!-- Drop the tab span when it is the second node and follows asterisks-only text -->
  <xsl:template match=
    "li/node()[2]
        [self::span[@class='tab']
         and preceding-sibling::node()[1]
               [self::text() and not(translate(.,'*',''))]
        ]"/>
</xsl:stylesheet>
When this transformation is applied to the following XML document:
<ol class="ast">
<li>*<span class="tab"><!--tab--></span>Some blabla <a href="#">with a link.</a></li>
<li>Not asterisks!<span class="tab"><!--tab--></span>Some other blabla, this one without other elements</li>
<li>**<span class="tab"><!--tab--></span>Some other blabla, this one without other elements</li>
<li>***<span>hello</span>Some other blabla, this one without other elements</li>
</ol>
the wanted, correct result is produced:
<ol class="ast">
<li>Some blabla <a href="#">with a link.</a></li>
<li>Not asterisks!<span class="tab"><!--tab--></span>Some other blabla, this one without other elements</li>
<li>Some other blabla, this one without other elements</li>
<li>***<span>hello</span>Some other blabla, this one without other elements</li>
</ol>
When applied to the first XML document above:
<ol class="ast">
<li><a href="#">with a link.</a>*<span class="tab">Some blabla </span>
</li>
<li>Something else</li>
</ol>
the correct result is produced again:
<ol class="ast">
<li>
<a href="#">with a link.</a>*<span class="tab">Some blabla </span>
</li>
<li>Something else</li>
</ol>
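The asterisk test in the stylesheet above rests on the XPath 1.0 idiom not(translate(., '*', '')): translate deletes every asterisk, and not() is true only when nothing is left. As a rough illustration (not part of the original answer), the following standalone sketch reports that test for the first node of each li. The variable name lead and the text output format are made up for this example.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- For each li, print its leading text node (if any) and the outcome of the asterisk test. -->
  <xsl:template match="/">
    <xsl:for-each select="//li">
      <!-- "lead" is a hypothetical name: the first child node, only if it is a text node -->
      <xsl:variable name="lead" select="node()[1][self::text()]"/>
      <xsl:value-of select="concat('[', $lead, '] ')"/>
      <xsl:choose>
        <!-- $lead must exist, and deleting every '*' from it must leave the empty string -->
        <xsl:when test="$lead and not(translate($lead, '*', ''))">asterisks only</xsl:when>
        <xsl:otherwise>not asterisks only</xsl:otherwise>
      </xsl:choose>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Run against the four-item document above, the items starting with *, ** and *** should pass the test and the "Not asterisks!" item should fail it. The *** item passes here but is still left untouched by the full stylesheet, because its span does not have class tab. That is why the solution checks the neighbouring node as well as the text.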
How about:
<xsl:stylesheet
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0"
    exclude-result-prefixes="xs">

  <!-- Identity rule: copy every node and attribute unchanged -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Drop leading asterisks-only text that is immediately followed by the tab span -->
  <xsl:template match="li/node()[1]
                         [self::text() and
                          matches(., '^\*+$') and
                          following-sibling::node()[1]
                            [self::span and @class = 'tab']
                         ]" />

  <!-- Drop the tab span when it is the second node and follows asterisks-only text -->
  <xsl:template match="li/node()[2]
                         [self::span and @class = 'tab']
                         [matches(preceding-sibling::text(), '^\*+$')]" />
</xsl:stylesheet>
When run on this input:
<ol class="ast">
<li>*<span class="tab"><!--tab--></span>Some blabla <a href="#">with a link.</a></li>
<li>Not asterisks!<span class="tab"><!--tab--></span>Some other blabla, this one without other elements</li>
<li>**<span class="tab"><!--tab--></span>Some other blabla, this one without other elements</li>
<li>***<span>hello</span>Some other blabla, this one without other elements</li>
<li><a href="#">with a link.</a>*<span class="tab">Some blabla </span></li>
</ol>
the result is:
<ol class="ast">
<li>Some blabla <a href="#">with a link.</a></li>
<li>Not asterisks!<span class="tab"><!--tab--></span>Some other blabla, this one without other elements</li>
<li>Some other blabla, this one without other elements</li>
<li>***<span>hello</span>Some other blabla, this one without other elements</li>
<li><a href="#">with a link.</a>*<span class="tab">Some blabla </span></li>
</ol>
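Both stylesheets above suppress the two unwanted nodes with a pair of empty templates. A possible variant, which is not from either answer and is offered only as a sketch, puts the whole condition on the li itself and then processes just the nodes after the first two. It is written for XSLT 1.0 and assumes, like the answers, that the asterisks are the very first node of the li with no leading whitespace.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output omit-xml-declaration="yes"/>

  <!-- Identity rule: copy every node and attribute unchanged -->
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>

  <!-- An li whose first node is asterisks-only text and whose second node is the tab span:
       copy the li and its attributes, then process only the nodes after those two -->
  <xsl:template match="li[node()[1][self::text() and not(translate(., '*', ''))]]
                         [node()[2][self::span[@class='tab']]]">
    <xsl:copy>
      <xsl:apply-templates select="@*"/>
      <xsl:apply-templates select="node()[position() &gt; 2]"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

The trade-off is mostly one of taste. The two-template form keeps each deletion rule local to the node it removes, while this form states the "asterisks followed by tab span" condition once. For the inputs shown in this thread it should keep and drop the same nodes.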