Which XPath expression in an XSLT file will transform the HTML below into the XML below?



I want the following table data:

<html>
<table border="1">
<tr>
<td rowspan="2">2015</td>
<td>First Event of 2015</td>
</tr>
<tr><td>Second Event of 2015</td></tr>
<tr>
<td rowspan="2">2014</td>
<td>First Event of 2014</td>
</tr>
<tr><td>Second Event of 2014</td></tr>
</table>
</html>

to be transformed into the following XML using XPath:

<events>
<event year="2015" name="First Event of 2015"/>
<event year="2015" name="Second Event of 2015"/>
<event year="2014" name="First Event of 2014"/>
<event year="2014" name="Second Event of 2014"/>
</events>

How do I handle the rowspan in XPath to get this output?

For the record, I use the following Java code to run the XSLT transformation:

import java.io.File;
import java.io.StringReader;
import javax.xml.transform.*;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

String xsltCode = ... // the XSLT I'm asking for
File xmlInput = ... // the file with the HTML code above
File xmlOutput = new File("output.xml");
// Compile the stylesheet from its source string.
Transformer transformer = TransformerFactory.newInstance()
        .newTransformer(new StreamSource(new StringReader(xsltCode)));
transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
transformer.setOutputProperty(OutputKeys.INDENT, "yes");
Source xmlSource = new StreamSource(xmlInput);
Result resultOutput = new StreamResult(xmlOutput);
transformer.transform(xmlSource, resultOutput);

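I'm glad we finally worked out what you need. In future, please try to state your question clearly from the start; it will save you time and downvotes.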

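Write a first template that matches / and emits the outermost element of the output, events. Then write a second template that matches every td element that has no @rowspan attribute. The year has to be taken from the nearest preceding td element that does have a @rowspan attribute.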

XSLT Stylesheet

<?xml version="1.0" encoding="UTF-8" ?>
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
    <xsl:output method="xml" encoding="UTF-8" indent="yes" />
    <xsl:strip-space elements="*"/>
    <xsl:template match="/">
      <events>
          <xsl:apply-templates/>
      </events>
    </xsl:template>
    <xsl:template match="td[not(@rowspan)]">
        <event year="{preceding::td[@rowspan][1]}">
            <xsl:value-of select="."/>
        </event>
    </xsl:template>
    <xsl:template match="text()"/>
</xsl:transform>

XML Output

<?xml version="1.0" encoding="UTF-8"?>
<events>
   <event year="2015">First Event of 2015</event>
   <event year="2015">Second Event of 2015</event>
   <event year="2014">First Event of 2014</event>
   <event year="2014">Second Event of 2014</event>
</events>
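
The question asked for the event name in a name attribute, whereas the stylesheet above writes it as element text. The following is a minimal sketch of that variant, driven from Java much like the code in the question. The class name RowspanToAttributes, the transform helper, and the use of in-memory strings instead of files are assumptions made for illustration; the only change to the stylesheet itself is the attribute value template name="{.}".

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper class, not part of the original answer.
class RowspanToAttributes {

    // Same matching logic as the stylesheet above, but the cell text
    // is written to a name attribute instead of element content.
    static final String XSLT =
          "<xsl:transform xmlns:xsl='http://www.w3.org/1999/XSL/Transform' version='1.0'>"
        + "<xsl:output method='xml' encoding='UTF-8' omit-xml-declaration='yes'/>"
        + "<xsl:template match='/'><events><xsl:apply-templates/></events></xsl:template>"
        + "<xsl:template match='td[not(@rowspan)]'>"
        + "<event year='{preceding::td[@rowspan][1]}' name='{.}'/>"
        + "</xsl:template>"
        + "<xsl:template match='text()'/>"
        + "</xsl:transform>";

    // The sample table from the question, without insignificant whitespace.
    static final String HTML =
          "<html><table border='1'>"
        + "<tr><td rowspan='2'>2015</td><td>First Event of 2015</td></tr>"
        + "<tr><td>Second Event of 2015</td></tr>"
        + "<tr><td rowspan='2'>2014</td><td>First Event of 2014</td></tr>"
        + "<tr><td>Second Event of 2014</td></tr>"
        + "</table></html>";

    // Applies the stylesheet to the input and returns the serialized result.
    static String transform(String xslt, String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("XSLT transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(XSLT, HTML));
    }
}
```

Like the stylesheet above, this variant relies on every year cell carrying a rowspan attribute, so a year with a single event and no rowspan would not be picked up.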


Assuming the given example is oversimplified, and that the real input can also contain years with only one event, I would suggest:

XSLT 1.0

<xsl:stylesheet version="1.0" 
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="table">
    <events>
        <xsl:apply-templates select="tr"/>
    </events>
</xsl:template>
<xsl:template match="tr">
    <event>
        <xsl:attribute name="year">
            <xsl:value-of select="(. | preceding-sibling::tr)[count(td)=2][last()]/td[1]"/>
        </xsl:attribute>
        <xsl:value-of select="td[last()]"/>
    </event>
</xsl:template>
</xsl:stylesheet>

When applied to the following test input:

<html>
  <table border="1">
    <tr>
      <td rowspan="2">2015</td>
      <td>First Event of 2015</td>
    </tr>
    <tr>
      <td>Second Event of 2015</td>
    </tr>
    <tr>
      <td rowspan="2">2014</td>
      <td>First Event of 2014</td>
    </tr>
    <tr>
      <td>Second Event of 2014</td>
    </tr>
    <tr>
      <td>Third Event of 2014</td>
    </tr>
    <tr>
      <td>2013</td>
      <td>Only Event of 2013</td>
    </tr>
  </table>
</html>

the result will be:

<?xml version="1.0" encoding="UTF-8"?>
<events>
   <event year="2015">First Event of 2015</event>
   <event year="2015">Second Event of 2015</event>
   <event year="2014">First Event of 2014</event>
   <event year="2014">Second Event of 2014</event>
   <event year="2014">Third Event of 2014</event>
   <event year="2013">Only Event of 2013</event>
</events>
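
To see how the year lookup in the tr template works, it can help to evaluate the same expression row by row outside of XSLT. The sketch below does that with the standard javax.xml.xpath API; the class name YearLookup, the method yearsPerRow, and the shortened three-row table are assumptions made for illustration. For each row, the union (. | preceding-sibling::tr) collects the current row and all earlier rows in document order, [count(td)=2] keeps only the rows that carry a year cell, and [last()] picks the nearest of those.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of the original answer.
class YearLookup {

    // The expression used for the year attribute in the stylesheet above.
    static final String YEAR_XPATH =
        "(. | preceding-sibling::tr)[count(td)=2][last()]/td[1]";

    // Shortened test table: one year spanning two rows, one single-event year.
    static final String HTML =
          "<html><table border='1'>"
        + "<tr><td rowspan='2'>2014</td><td>First Event of 2014</td></tr>"
        + "<tr><td>Second Event of 2014</td></tr>"
        + "<tr><td>2013</td><td>Only Event of 2013</td></tr>"
        + "</table></html>";

    // Returns one "year: event" string per table row.
    static List<String> yearsPerRow(String html) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(html)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList rows = (NodeList) xpath.evaluate("//tr", doc, XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < rows.getLength(); i++) {
                Node row = rows.item(i);
                // Evaluated with the row as context node, as in the tr template.
                String year = xpath.evaluate(YEAR_XPATH, row);
                String name = xpath.evaluate("td[last()]", row);
                result.add(year + ": " + name);
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        // Prints:
        // 2014: First Event of 2014
        // 2014: Second Event of 2014
        // 2013: Only Event of 2013
        yearsPerRow(HTML).forEach(System.out::println);
    }
}
```

Because the test is count(td)=2 rather than the presence of a rowspan attribute, the 2013 row is treated as its own year even though its first cell has no rowspan.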
