XSLT: How to remove elements from a result tree fragment while copying



My goal is to extract the content of the SOAP body, i.e. the ElementsToExtract node, although the node name can be essentially arbitrary:

<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope xmlns:soap="http://www.w3.org/2003/05/soap-envelope">
  <soap:Header>
    <MessageId>52DF2371-4094-4408-A3EA-42D73FD1B7A3</MessageId>
  </soap:Header>
  <soap:Body>
    <ElementsToExtract>
        ...
        <RemoveMe>...</RemoveMe>
        <RemoveMeAlso>...</RemoveMeAlso>
        ...
    </ElementsToExtract>
  </soap:Body>
</soap:Envelope>

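When I extract the content, I want to get rid of two elements that all of my source documents have in common, e.g. RemoveMe and RemoveMeAlso. Since more deeply nested nodes may have the same names, they must be stripped only from the level directly below the ElementsToExtract node. How would I formulate that expression?

This is what I have so far: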

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:soap="http://www.w3.org/2003/05/soap-envelope"
                xmlns:exsl="http://exslt.org/common"
                exclude-result-prefixes="soap exsl">
  <xsl:output method="xml" indent="yes" omit-xml-declaration="no"/>
  <xsl:strip-space elements="*"/>
  <xsl:variable name="SoapHeaderContents" select="exsl:node-set(soap:Envelope/soap:Header/*)"/>
  <xsl:variable name="SoapBodyContents" select="exsl:node-set(soap:Envelope/soap:Body/*)"/>
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
  <xsl:template match="/">
    <xsl:apply-templates select="$SoapBodyContents"/>
  </xsl:template>
  <!-- This is global, how to restrict to the ElementsToExtract element? -->
  <xsl:template match="node()[name() = 'RemoveMe']"/>
  <xsl:template match="node()[name() = 'RemoveMeAlso']"/>
</xsl:stylesheet>

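I also used the node-set() function, having read that one cannot modify result tree fragments (are they just text nodes?), but I do not really understand how to address the resulting nodes of that set. As a result, the nodes are not removed: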

<xsl:template match="/">
  <xsl:apply-templates select="$SoapBodyContents"/>
  <xsl:apply-templates select="$SoapBodyContents/RemoveMe" mode="m1"/>
</xsl:template>
<xsl:template name="StripRemoveMe" match="RemoveMe" mode="m1"/>

I have also read parts of the specification, but to no avail. I have lost the thread. Can someone point me to the right approach?

Would this work for you:

XSLT 1.0

<xsl:stylesheet version="1.0" 
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:soap="http://www.w3.org/2003/05/soap-envelope"
exclude-result-prefixes="soap">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:strip-space elements="*"/>
<!-- identity transform -->
<xsl:template match="@*|node()">
    <xsl:copy>
        <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
</xsl:template>
<!-- skip soap wrappers -->
<xsl:template match="/soap:Envelope">
    <xsl:apply-templates select="soap:Body/ElementsToExtract"/>
</xsl:template>
<!-- remove unwanted elements -->
<xsl:template match="ElementsToExtract/RemoveMe | ElementsToExtract/RemoveMeAlso"/>
</xsl:stylesheet>

In the (unlikely) event that you do not know the name of the ElementsToExtract element, you could use:

<!-- skip soap wrappers -->
<xsl:template match="/soap:Envelope">
    <xsl:apply-templates select="soap:Body/*"/>
</xsl:template>
<!-- remove unwanted elements -->
<xsl:template match="soap:Body/*/RemoveMe | soap:Body/*/RemoveMeAlso"/>
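
To see why the parent steps in that pattern matter, here is a small hypothetical input (the nesting under KeepMe is my own addition, not part of the original sample). Assuming the templates above, only the RemoveMe directly below ElementsToExtract should be dropped, because the nested one has KeepMe as its parent and so does not match `soap:Body/*/RemoveMe`:

```xml
<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope xmlns:soap="http://www.w3.org/2003/05/soap-envelope">
  <soap:Body>
    <ElementsToExtract>
      <KeepMe>
        <!-- Parent is KeepMe, not a child of soap:Body: kept -->
        <RemoveMe>nested one level deeper</RemoveMe>
      </KeepMe>
      <!-- Parent is a direct child of soap:Body: removed -->
      <RemoveMe>directly below ElementsToExtract</RemoveMe>
    </ElementsToExtract>
  </soap:Body>
</soap:Envelope>
```

A bare `match="RemoveMe"` pattern, by contrast, would match both elements.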

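Some quick thoughts.

  • You create variables to store the SOAP header and body. Those are already in the input document, so it makes more sense to simply write templates that match them.

  • Although you create a variable for the SOAP header, you never use it anywhere.

  • If you apply templates consecutively, as in your example XSL code, you will get all of the output nodes from the first apply-templates, followed by all of the output nodes from the next apply-templates. If those nodes are interleaved in any way, this approach will not produce workable output.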
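
On the node-set() confusion, a minimal sketch may help; the variable names BodyNodes and BodyCopy are hypothetical, and the commented-out call assumes a processor that supports the EXSLT common extension. In XSLT 1.0, a variable declared with `select` already holds a node-set of source nodes, so no conversion is needed. Only a variable built from a content body holds a result tree fragment:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:soap="http://www.w3.org/2003/05/soap-envelope"
    xmlns:exsl="http://exslt.org/common"
    exclude-result-prefixes="soap exsl">
    <xsl:output method="xml" indent="yes"/>
    <xsl:strip-space elements="*"/>

    <!-- A select attribute yields a node-set of the original source
         nodes; templates can be applied to it directly. -->
    <xsl:variable name="BodyNodes" select="/soap:Envelope/soap:Body/*"/>

    <!-- A content body yields a result tree fragment that holds
         copies of the nodes. -->
    <xsl:variable name="BodyCopy">
        <xsl:copy-of select="/soap:Envelope/soap:Body/*"/>
    </xsl:variable>

    <xsl:template match="/">
        <xsl:apply-templates select="$BodyNodes"/>
        <!-- The fragment would have to be converted first (EXSLT assumed):
             <xsl:apply-templates select="exsl:node-set($BodyCopy)/*"/> -->
    </xsl:template>

    <!-- identity transform -->
    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>

    <!-- remove unwanted elements two levels below soap:Body -->
    <xsl:template match="soap:Body/*/RemoveMe | soap:Body/*/RemoveMeAlso"/>
</xsl:stylesheet>
```

Note that the copies inside the fragment no longer have soap:Body as an ancestor, so a parent-based pattern such as `soap:Body/*/RemoveMe` would not match them. That is one more reason to work on the source nodes directly.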

Here is a revised version of your sample input XML, with a couple of elements added that we want to keep.

<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope xmlns:soap="http://www.w3.org/2003/05/soap-envelope">
  <soap:Header>
    <MessageId>52DF2371-4094-4408-A3EA-42D73FD1B7A3</MessageId>
  </soap:Header>
  <soap:Body>
    <ElementsToExtract>
        <KeepMe>This text will persist in the output.</KeepMe>
        <RemoveMe>This is text that will be removed.</RemoveMe>
        <RemoveMeAlso>This will also vanish from the output.</RemoveMeAlso>
        <OtherElementToKeep>And this one will also be kept.</OtherElementToKeep>
    </ElementsToExtract>
  </soap:Body>
</soap:Envelope>

Here is the output we want:

<?xml version="1.0" encoding="utf-8"?>
<ElementsToExtract>
    <KeepMe>This text will persist in the output.</KeepMe>
    <OtherElementToKeep>And this one will also be kept.</OtherElementToKeep>
</ElementsToExtract>

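This XSL 1.0 code will do the job. I gather from your post that you are not familiar with the XSL processing flow, so I have added comments to help explain what is going on.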

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:soap="http://www.w3.org/2003/05/soap-envelope"
    version="1.0"
    exclude-result-prefixes="soap">
    <xsl:strip-space elements="*"/>
    <xsl:output method="xml" indent="yes"/>
    <!-- The `/` matches the _logical root_ of the input file.  This is 
        basically equivalent to the start of the file, NOT the first element.
        This is a common place to start processing in XSL. -->
    <xsl:template match="/">
        <!-- We just apply templates.  In your case, we know already that
            we DON'T want to process everything: we want to leave certain
            things out, including a lot of the outermost elements.  So
            we specify what to target in the `select` statement. -->
        <xsl:apply-templates select="soap:Envelope/soap:Body/ElementsToExtract"/>
    </xsl:template>
    <!-- This is the "identity" template, so called because it 
        just copies over applicable matches identically. 
        A template with a more-specific match statement takes
        precedence. -->
    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>
    <!-- Here, we specify exactly those elements that are in the 
        processing flow, and that we want to exclude from the
        output.  Since `soap:Header` etc. are NOT in the processing
        flow (their element trees were never included in a preceding 
        call to `apply-templates`), we don't need to worry about those.
        The `ElementsToExtract/` step limits the match to direct
        children, so same-named elements nested deeper are kept. -->
    <xsl:template match="ElementsToExtract/RemoveMe | ElementsToExtract/RemoveMeAlso"/>
</xsl:stylesheet>

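Note that the outermost element in the output is ElementsToExtract. That element will carry the xmlns:soap="http://www.w3.org/2003/05/soap-envelope" namespace declaration, even though the namespace is not used by any of the output elements (at least for this small sample input XML). This happens because <xsl:copy> also copies the namespace nodes that are in scope on the source element, and exclude-result-prefixes only affects literal result elements.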
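
If you are limited to XSLT 1.0, one common workaround (a sketch, not the only option) is to rebuild each element with <xsl:element> instead of copying it, so that in-scope namespace nodes are not carried over. The identity template is split here so that the element template and the template for the other node types do not compete at the same priority:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:soap="http://www.w3.org/2003/05/soap-envelope"
    exclude-result-prefixes="soap">
    <xsl:output method="xml" indent="yes"/>
    <xsl:strip-space elements="*"/>

    <xsl:template match="/">
        <xsl:apply-templates select="soap:Envelope/soap:Body/ElementsToExtract"/>
    </xsl:template>

    <!-- Rebuild each element from its name and namespace URI.
         Unlike xsl:copy, this does not copy the namespace nodes
         that are merely in scope on the source element. -->
    <xsl:template match="*">
        <xsl:element name="{name()}" namespace="{namespace-uri()}">
            <xsl:apply-templates select="@*|node()"/>
        </xsl:element>
    </xsl:template>

    <!-- Attributes, text, comments and processing instructions
         are copied unchanged. -->
    <xsl:template match="@*|text()|comment()|processing-instruction()">
        <xsl:copy/>
    </xsl:template>

    <!-- Remove the unwanted direct children only. -->
    <xsl:template match="ElementsToExtract/RemoveMe | ElementsToExtract/RemoveMeAlso"/>
</xsl:stylesheet>
```

One caveat: a namespace that is only referenced inside attribute values or text content (for example a QName in an xsi:type attribute) would lose its declaration with this approach, so check that your payload does not rely on that.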

If you can use XSL 2.0 and want to remove this namespace from the output, you can add the copy-namespaces="no" attribute to the <xsl:copy> element.
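
As a sketch of that XSLT 2.0 variant, assuming an XSLT 2.0 processor is available, the stylesheet from above would change only in its version attribute and in the identity template:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:soap="http://www.w3.org/2003/05/soap-envelope"
    exclude-result-prefixes="soap">
    <xsl:output method="xml" indent="yes"/>
    <xsl:strip-space elements="*"/>

    <xsl:template match="/">
        <xsl:apply-templates select="soap:Envelope/soap:Body/ElementsToExtract"/>
    </xsl:template>

    <!-- Identity template; copy-namespaces="no" keeps xsl:copy from
         carrying over the in-scope namespace nodes of each element. -->
    <xsl:template match="@*|node()">
        <xsl:copy copy-namespaces="no">
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>

    <!-- Remove the unwanted direct children only. -->
    <xsl:template match="ElementsToExtract/RemoveMe | ElementsToExtract/RemoveMeAlso"/>
</xsl:stylesheet>
```

Namespaces that the copied elements or attributes actually use in their names are still declared in the output, because the processor adds them back during namespace fixup.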
