Grouping multiple records/elements to create a new structure



I searched around and almost found a solution, but it requires XSLT 2.0 and I am limited to 1.0.

Here is my sample XML:

<root>
<row>A1: Apples</row>
<row>B1: Red</row>
<row>C1: Reference text</row>
<row>badly formatted text which belongs to row above</row>
<row>and here.</row>
<row>D1: ABC</row>
<row>E1: 123</row>
<row>A1: Oranges</row>
<row>B1: Purple</row>
<row>C1: More References</row>
<row>with no identifier</row>
<row>again and here.</row>
<row>D1: DEF</row>
<row>E1: 456</row>
.
.

I would like it to look like this:

<root>
<row>
    <A1>Apples</A1>
    <B1>Red</B1>
    <C1>Reference text badly formatted text which belongs to row above and here.</C1>
    <D1>ABC</D1>
    <E1>123</E1>
</row>
<row>
    <A1>Oranges</A1>
    <B1>Purple</B1>
    <C1>More References with no identifier again and here.</C1>
    <D1>DEF</D1>
    <E1>456</E1>
</row>
.
.

There is a pattern, and I could do the conversion with other utilities, but it is very hard with XSLT 1.0.

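The elements contain headings that I can use. The reference text field is multi-line, and when it is converted to XML each of its lines becomes a separate row, but those rows always sit in the same place, between C1 and D1. The actual names of the elements don't matter.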

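Each row should end after E1. I thought my example was simple, but the transformation is not. I don't even consider myself a beginner in XML/XSL. I started learning from scratch, was moved to other projects, and then had to start over. TIA.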

Update: I ran into another case where the structure is slightly different, but I want the same result:

<root>
  <row>
    <Field>A1: Apples</Field>
  </row>
<row>
    <Field>B1: Red</Field>
</row>
<row>
    <Field>C1: Reference text</Field>
</row>
<row>
    <Field>badly formatted text which belongs to row above</Field>
</row>
<row>
    <Field>and here.</Field>
</row>
<row>
    <Field>D1: ABC</Field>
</row>
<row>
    <Field>E1: 123</Field>
</row>
<row>
    <Field>A1: Oranges</Field>
</row>
<row>
    <Field>B1: Purple</Field>
</row>
<row>
    <Field>C1: More References</Field>
</row>
<row>
   <Field>with no identifier</Field>
</row>
<row>
   <Field>again and here.</Field>
</row>
<row>
   <Field>D1: DEF</Field>
</row>
<row>
   <Field>E1: 456</Field>
</row>

I tried applying an identity transform, but it doesn't seem to work:

<xsl:template match="@*|node()">
    <xsl:copy>
        <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
</xsl:template>
<xsl:template match ="row/Field">
    <xsl:apply-templates/>
</xsl:template>

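This looked a bit tricky, but I have a solution that seems to work. It allows a variable number of rows after the C1 row (it was not clear whether there are always 2).

The solution makes heavy use of the following-sibling axis, which can be very inefficient, especially for large input files.

You can test it here.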

<xsl:template match="/root">
    <!-- Wrap the output in a single <root> so the result is well-formed -->
    <root>
        <!-- Loop through every "A1" row -->
        <xsl:for-each select="row[substring-before(text(), ':') = 'A1']">
            <!-- Add a <row> tag -->
            <xsl:element name="row">
                <!-- Add each of the A1-E1 tags by finding the first following-sibling that matches before the colon -->
                <xsl:apply-templates select="." />
                <xsl:apply-templates select="following-sibling::*[substring-before(text(), ':') = 'B1'][1]" />
                <xsl:apply-templates select="following-sibling::*[substring-before(text(), ':') = 'C1'][1]" />
                <xsl:apply-templates select="following-sibling::*[substring-before(text(), ':') = 'D1'][1]" />
                <xsl:apply-templates select="following-sibling::*[substring-before(text(), ':') = 'E1'][1]" />
            </xsl:element>
        </xsl:for-each>
    </root>
</xsl:template>
<!-- Process each row -->
<xsl:template match="/root/row">
    <!-- Create an element whose name is whatever is before the colon in the text -->
    <xsl:element name="{substring-before(text(), ':')}">
        <!-- Output everything after the colon -->
        <xsl:value-of select="normalize-space(substring-after(text(), ':'))" />
        <!-- Special treatment for the C1 node -->
        <xsl:if test="substring-before(text(), ':') = 'C1'">
            <!-- Count how many A1 nodes exist after this node -->
            <xsl:variable name="remainingA1nodes" select="count(following-sibling::*[substring-before(text(), ':') = 'A1'])" />
            <!-- Loop through all following-siblings that don't have a colon at position 3, and still have the same number of following A1 rows as this one does -->
            <xsl:for-each select="following-sibling::*[substring(text(), 3, 1) != ':'][count(following-sibling::*[substring-before(text(), ':') = 'A1']) = $remainingA1nodes]">
                <xsl:text> </xsl:text>
                <xsl:value-of select="." />
            </xsl:for-each>
        </xsl:if>
    </xsl:element>
</xsl:template>
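
If the repeated following-sibling scans become a problem, one possible variant is to index the rows with xsl:key instead. The sketch below is untested against large inputs and rests on the same assumption as the code above: every label is exactly two characters followed by a colon, and the text sits directly inside each row (the first input structure). The key names fields-by-record and continuation-by-field are made up for this illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" indent="yes"/>

    <!-- Hypothetical key: labelled rows other than A1, indexed by the
         id of the nearest preceding A1 row (the start of their record) -->
    <xsl:key name="fields-by-record"
        match="row[substring(., 3, 1) = ':'][not(starts-with(., 'A1:'))]"
        use="generate-id(preceding-sibling::row[starts-with(., 'A1:')][1])"/>

    <!-- Hypothetical key: unlabelled continuation rows, indexed by the
         id of the nearest preceding labelled row -->
    <xsl:key name="continuation-by-field"
        match="row[substring(., 3, 1) != ':']"
        use="generate-id(preceding-sibling::row[substring(., 3, 1) = ':'][1])"/>

    <xsl:template match="/root">
        <root>
            <!-- One output <row> per A1 row -->
            <xsl:for-each select="row[starts-with(., 'A1:')]">
                <row>
                    <!-- The A1 row itself plus its labelled fields, in document order -->
                    <xsl:apply-templates select=". | key('fields-by-record', generate-id())"/>
                </row>
            </xsl:for-each>
        </root>
    </xsl:template>

    <xsl:template match="row">
        <!-- Element name is the label before the colon -->
        <xsl:element name="{substring-before(., ':')}">
            <xsl:value-of select="normalize-space(substring-after(., ':'))"/>
            <!-- Append any continuation rows that belong to this field -->
            <xsl:for-each select="key('continuation-by-field', generate-id())">
                <xsl:text> </xsl:text>
                <xsl:value-of select="normalize-space(.)"/>
            </xsl:for-each>
        </xsl:element>
    </xsl:template>
</xsl:stylesheet>
```

Unlike the version above, this appends continuation rows to whichever labelled row precedes them, not only to C1. Whether it is actually faster depends on how the processor builds its key index, so it is worth measuring before relying on it.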

Each record, or group, is 7 rows.

So why not simply do it by the numbers:

XSLT 1.0

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:template match="/root">
    <root>
        <xsl:for-each select="row[position() mod 7 = 1]">
            <row>
                <xsl:apply-templates select=". | following-sibling::row[position() &lt; 3] | following-sibling::row[4 &lt; position() and position() &lt; 7]"/>
            </row>
        </xsl:for-each>
    </root>
</xsl:template>
<xsl:template match="row">
    <xsl:element name="{substring-before(., ': ')}">
        <xsl:value-of select="substring-after(., ': ')"/>
    </xsl:element>
</xsl:template>
<xsl:template match="row[starts-with(., 'C1: ')]">
    <C1>
        <xsl:value-of select="substring-after(., 'C1: ')"/>
        <xsl:for-each select="following-sibling::row[position() &lt; 3]">
            <xsl:text> </xsl:text>
            <xsl:value-of select="."/>
        </xsl:for-each>
    </C1>
</xsl:template>
</xsl:stylesheet>
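
For the updated input in the question, where each row wraps its text in a Field child, the string value of a row also contains the indentation whitespace around Field, so substring-before(., ': ') would no longer yield a valid element name. A minimal sketch of one way to adapt the stylesheet above, assuming the groups are still exactly 7 rows, is to run each row through normalize-space() first:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

    <xsl:template match="/root">
        <root>
            <!-- Every 7th row, starting from the first, opens a new record -->
            <xsl:for-each select="row[position() mod 7 = 1]">
                <row>
                    <!-- A1, then B1 and C1, then D1 and E1; the two
                         continuation rows are handled by the C1 template -->
                    <xsl:apply-templates select=". | following-sibling::row[position() &lt; 3] | following-sibling::row[4 &lt; position() and position() &lt; 7]"/>
                </row>
            </xsl:for-each>
        </root>
    </xsl:template>

    <xsl:template match="row">
        <!-- Trim the whitespace that surrounds the Field child -->
        <xsl:variable name="text" select="normalize-space(.)"/>
        <xsl:element name="{substring-before($text, ': ')}">
            <xsl:value-of select="substring-after($text, ': ')"/>
        </xsl:element>
    </xsl:template>

    <xsl:template match="row[starts-with(normalize-space(.), 'C1: ')]">
        <C1>
            <xsl:value-of select="substring-after(normalize-space(.), 'C1: ')"/>
            <!-- Append the two continuation rows that follow C1 -->
            <xsl:for-each select="following-sibling::row[position() &lt; 3]">
                <xsl:text> </xsl:text>
                <xsl:value-of select="normalize-space(.)"/>
            </xsl:for-each>
        </C1>
    </xsl:template>
</xsl:stylesheet>
```

Because normalize-space() works on the string value of each row, this version should accept both input structures. Note that it also collapses runs of whitespace inside the text, which may or may not be wanted.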
