Oh no, not yet another "XSLT 1.0 finding duplicates" task. But I am serious about this one.

This is my humble XML file:

<choice>
    <question>
        <text>one</text>
        <answer>
            <text>2</text>
        </answer>
        <answer>
            <text>2</text>
        </answer>
    </question>
    <question>
        <text>two</text>
        <answer>
            <text>d</text>
        </answer>
    </question>
    <question>
        <text>three</text>
        <answer>
            <text>1</text>
        </answer>
        <answer>
            <text>2</text>
        </answer>
    </question>
</choice>

And this is what I tried in order to find out whether any question text is duplicated:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    version="1.0">
    <xsl:template match="/choice">
        <xsl:variable name="ok" select="count(question/text)=count(question/text[not(.=following::text)])"/>
        <xsl:copy-of select="$ok"/>
        <xsl:if test="not($ok)">
            <xsl:message terminate="yes">
                Error: Duplicate Question
            </xsl:message>
        </xsl:if>
    </xsl:template>
</xsl:stylesheet>

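This works fine, but how can I determine whether there are duplicates among the answers of a question (question one in this example, with its duplicated "2")?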

Sorry to bother you, but I am really stuck here...

I extended your test case by one more question, which gives the following XML:

<choice>
    <question>
        <text>one</text>
        <answer>
            <text>2</text>
        </answer>
        <answer>
            <text>2</text>
        </answer>
    </question>
    <question>
        <text>two</text>
        <answer>
            <text>d</text>
        </answer>
    </question>
    <question>
        <text>three</text>
        <answer>
            <text>1</text>
        </answer>
        <answer>
            <text>2</text>
        </answer>
    </question>
    <question>
        <text>three</text>
        <answer>
            <text>1</text>
        </answer>
        <answer>
            <text>d</text>
        </answer>
    </question>
</choice>

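The following XSLT reports every answer whose text already appears in an answer of an earlier question. The <xsl:for-each> is needed to walk the preceding-sibling elements. I had to halve the position number, for output only (it is not needed for the logic): the built-in templates also visit the whitespace text nodes between the question elements, so the questions sit at positions 2, 4, 6, ...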

  <xsl:template match="/choice/question"> 
    <xsl:variable name="quesPos" select="position() div 2" />
    <xsl:for-each select="answer">
      <xsl:variable name="txt" select="text/text()" />
      <xsl:variable name="answPos" select="position()" />
      <xsl:for-each select="../preceding-sibling::*/answer">
        <xsl:if test="text/text() = $txt">
          <dup>
            <xsl:value-of select="concat('question[',$quesPos,']/answer[',$answPos,'] = ',$txt,' is a duplicate')" />
          </dup><xsl:text>&#10;</xsl:text>
        </xsl:if>
      </xsl:for-each> 
    </xsl:for-each> 
  </xsl:template>

The result of this template is:

<?xml version="1.0"?>
<dup>question[3]/answer[2] = 2 is a duplicate</dup>
<dup>question[3]/answer[2] = 2 is a duplicate</dup>
<dup>question[4]/answer[1] = 1 is a duplicate</dup>
<dup>question[4]/answer[2] = d is a duplicate</dup>
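
The halving of position() above only works while the whitespace between the question elements is kept. A minimal sketch of a variant that does not depend on this, assuming the same input: it counts the preceding question siblings instead, and it reports each answer once, however many earlier answers it matches. The xsl:strip-space line is an addition of this sketch, used only to keep stray whitespace out of the output.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <!-- Addition of this sketch: drop whitespace-only text nodes from the input -->
  <xsl:strip-space elements="*"/>
  <xsl:template match="/choice/question">
    <!-- Index among the question siblings, independent of whitespace nodes -->
    <xsl:variable name="quesPos" select="count(preceding-sibling::question) + 1"/>
    <xsl:for-each select="answer">
      <xsl:variable name="txt" select="string(text)"/>
      <xsl:variable name="answPos" select="position()"/>
      <!-- True if any earlier question holds an answer with the same text -->
      <xsl:if test="../preceding-sibling::question/answer[text = $txt]">
        <dup>
          <xsl:value-of select="concat('question[',$quesPos,']/answer[',$answPos,'] = ',$txt,' is a duplicate')"/>
        </dup>
        <xsl:text>&#10;</xsl:text>
      </xsl:if>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

With the extended test case this should flag question[3]/answer[2], question[4]/answer[1] and question[4]/answer[2], one line each.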

You can replace the part inside the <xsl:if> with whatever action you like.

So, if you only want to detect the duplicates, as in your original XSLT, remove all variables except txt and keep the xsl:if.
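
For the case the question asks about, a repeated answer inside a single question, a minimal sketch in the style of the original stylesheet could look like this. The variable names dupQuestions and dupAnswers and the "Duplicate Answer" message are made up for illustration. Each answer is compared only with its preceding-sibling answers, so the check stays within one question.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:template match="/choice">
    <!-- Questions whose text also appears in a later question -->
    <xsl:variable name="dupQuestions"
                  select="question[text = following-sibling::question/text]"/>
    <!-- Answers whose text repeats an earlier answer of the same question -->
    <xsl:variable name="dupAnswers"
                  select="question/answer[text = preceding-sibling::answer/text]"/>
    <!-- A non-empty node-set is true in a test expression -->
    <xsl:if test="$dupQuestions">
      <xsl:message terminate="yes">Error: Duplicate Question</xsl:message>
    </xsl:if>
    <xsl:if test="$dupAnswers">
      <xsl:message terminate="yes">Error: Duplicate Answer</xsl:message>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>
```

On the original three-question file the question check passes and the answer check fires, because the second answer of question one repeats the text "2".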

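A second approach, probably faster, is to index the answers with xsl:key. You lose the position() information this way; if you need it, move the predicate out of the for-each. This is the technique used in Muenchian grouping. Note that the key covers the whole document, so any answer whose text has already occurred anywhere earlier is flagged.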

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:xs="http://www.w3.org/2001/XMLSchema" version="1.0">
  <xsl:key name="answers" match="answer" use="text/text()"/>
  <xsl:template match="/choice"> 
    <xsl:for-each select="question/answer[not(generate-id() = generate-id(key('answers',text/text())[1]))]">
      <dup>
        <xsl:value-of select="concat(name(),'[',generate-id(),'] = ',text/text(),' is a duplicate')" />
      </dup><xsl:text>&#10;</xsl:text>     
    </xsl:for-each>
  </xsl:template> 
</xsl:stylesheet>
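
If the key should only catch repeats inside the same question and the positions should be kept, one possible sketch is a composite key built from the parent question's generated id and the answer text. The key name answersInQuestion and the '|' separator are arbitrary choices for illustration. The test sits inside the for-each instead of in a predicate, so position() still counts all answers of the question.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <!-- Composite key: id of the owning question plus the answer text -->
  <xsl:key name="answersInQuestion" match="answer"
           use="concat(generate-id(..), '|', text)"/>
  <xsl:template match="/choice">
    <xsl:for-each select="question">
      <xsl:variable name="quesPos" select="position()"/>
      <xsl:for-each select="answer">
        <!-- An answer is a duplicate if it is not the first node in its key group -->
        <xsl:if test="generate-id() != generate-id(key('answersInQuestion', concat(generate-id(..), '|', text))[1])">
          <dup>
            <xsl:value-of select="concat('question[',$quesPos,']/answer[',position(),'] = ',text,' is a duplicate')"/>
          </dup>
          <xsl:text>&#10;</xsl:text>
        </xsl:if>
      </xsl:for-each>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

For the extended test case this should report only question[1]/answer[2], since that is the only answer repeating a text within its own question.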
