Solr stop words: queries containing 'of' return no results, but excluding 'of' gives the correct results



Can anyone explain how stop words work in Solr? In stopwords.txt, I have defined of. In schema.xml, I have:

<filter class="solr.StopFilterFactory" ignoreCase="true"
words="stopwords.txt" enablePositionIncrements="true"/>

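Now when I search for anything containing the word of, no results are returned.

Example: oil of olay returns no results, while oil olay returns the correct results.

More of the field definition: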

        <analyzer type="index">
            <tokenizer class="solr.WhitespaceTokenizerFactory"/> 
            <filter class="solr.StopFilterFactory"
                    ignoreCase="true"
                    words="stopwords.txt"
                    enablePositionIncrements="true"
                    />
            <filter class="solr.LowerCaseFilterFactory"/>
            <filter class="solr.WordDelimiterFilterFactory"
                    generateWordParts="1"
                    generateNumberParts="1"
                    catenateWords="1"
                    catenateNumbers="1"
                    catenateAll="1"
                    preserveOriginal="1"
                    splitOnCaseChange="0"
                    splitOnNumerics="0"
                    types="wdtypes.txt"
                    />
            <filter class="solr.KeywordRepeatFilterFactory"/>
            <filter class="solr.EnglishMinimalStemFilterFactory"/>
            <filter class="solr.TrimFilterFactory" updateOffsets="false"/>
            <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
        </analyzer>
        <analyzer type="query">
            <tokenizer class="solr.WhitespaceTokenizerFactory"/>
            <filter class="solr.StopFilterFactory"
                    ignoreCase="true"
                    words="stopwords.txt"
                    enablePositionIncrements="true"
                    />
            <filter class="solr.LowerCaseFilterFactory"/>
            <filter class="solr.WordDelimiterFilterFactory"
                    generateWordParts="1"
                    generateNumberParts="1"
                    catenateWords="1"
                    catenateNumbers="1"
                    catenateAll="1"
                    preserveOriginal="1"
                    splitOnCaseChange="0"
                    splitOnNumerics="0"
                    types="wdtypes.txt"
                    />
            <filter class="solr.EnglishMinimalStemFilterFactory"/>
            <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
        </analyzer>

When debugging, the parsed query is:

        +(upclist:cream+of+wheat&qt=productresults&rows=10&fq=status%3aactive&fq=facilitystatus%3aactive&fq=facilityid%3a100&fq=inventoryctrlcode%3a%5b0+to+100%5d&fq=weblifecycle%3a%283+or+4%29&fq=groupnumber%3a2^1.2
        | keywords:cream+of+wheat&qt=productresults&rows=10&fq=status%3aactive&fq=facilitystatus%3aactive&fq=facilityid%3a100&fq=inventoryctrlcode%3a%5b0+to+100%5d&fq=weblifecycle%3a%283+or+4%29&fq=groupnumber%3a2^20.0
        | product_elevate:cream+of+wheat&qt=productresults&rows=10&fq=status%3aactive&fq=facilitystatus%3aactive&fq=facilityid%3a100&fq=inventoryctrlcode%3a%5b0+to+100%5d&fq=weblifecycle%3a%283+or+4%29&fq=groupnumber%3a2^5.0
        | area:"(cream+of+wheat&qt=productresults&rows=10&fq=status%3aactive&fq=facilitystatus%3aactive&fq=facilityid%3a100&fq=inventoryctrlcode%3a%5b0+to+100%5d&fq=weblifecycle%3a%283+or+4%29&fq=groupnumber%3a2 cream) of wheat qt productresults (rows creamofwheatqtproductresultsrows) 10 fqstatus%3aactivefqfacilitystatus%3aactivefqfacilityid%3a100fqinventoryctrlcode%3a%5b0 (to fqstatus%3aactivefqfacilitystatus%3aactivefqfacilityid%3a100fqinventoryctrlcode%3a%5b0to) 100%5d fqweblifecycle%3a%283 (or fqweblifecycle%3a%283or) 4%29 fq (groupnumber%3a2 fqgroupnumber%3a2 creamofwheatqtproductresultsrows10fqstatus%3aactivefqfacilitystatus%3aactivefqfacilityid%3a100fqinventoryctrlcode%3a%5b0to100%5dfqweblifecycle%3a%283or4%29fqgroupnumber%3a2)"~3^2.5
        | productid:cream+of+wheat&qt=productresults&rows=10&fq=status%3aactive&fq=facilitystatus%3aactive&fq=facilityid%3a100&fq=inventoryctrlcode%3a%5b0+to+100%5d&fq=weblifecycle%3a%283+or+4%29&fq=groupnumber%3a2^1.7
        | productname:cream+of+wheat&qt=productresults&rows=10&fq=status%3aactive&fq=facilitystatus%3aactive&fq=facilityid%3a100&fq=inventoryctrlcode%3a%5b0+to+100%5d&fq=weblifecycle%3a%283+or+4%29&fq=groupnumber%3a2^10.0)~0.01 ()

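This may not be relevant, since you say you are searching only one field (I am posting it because you say you are using edismax and qf). I ran into a similar problem when I wanted to boost exact matches, so I had set up qf like this: <str name="qf">title^45 title_str^55</str>. The title field used stop words, while title_str obviously did not. The reason searches that contain stop words often find nothing is described here. Their solution was to fiddle with the mm value. In my case, the solution that worked was to put title_str in the pf parameter (and remove it from qf), so that exact matches appear at the top.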
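
As a rough sketch of the qf/pf arrangement described above, the handler defaults might look like the following. The handler name `/select` and the reuse of the 45 and 55 boosts are assumptions for illustration; only the field names `title` and `title_str` come from the answer.

```xml
<!-- Hypothetical handler; adjust the name and boosts to your own setup. -->
<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">edismax</str>
    <!-- qf: only the stop-word-filtered field decides which documents match -->
    <str name="qf">title^45</str>
    <!-- pf: the unfiltered field only boosts documents that already matched -->
    <str name="pf">title_str^55</str>
  </lst>
</requestHandler>
```

Since pf influences scoring rather than matching, a stop word that survives in title_str should no longer turn into a clause that the filtered title field cannot satisfy.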

What finally resolved the problem was changing mm:

"mm" from 2<-25% to 2<-36%

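To see why this change can matter for a three-word query such as oil of olay: the spec 2<-25% says that once a query has more than two optional clauses, 25% of them may be missing, and Solr rounds that allowance down. With three clauses, 25% is 0.75, which rounds down to 0, so all three clauses are required; 36% is 1.08, which rounds down to 1, so two of the three are enough. A minimal sketch of setting the value in solrconfig.xml, assuming the productresults handler named in the debug output is defined there (its class and other defaults are not shown in the question, so they are guesses):

```xml
<!-- Assumed handler definition; only the mm value comes from the post. -->
<requestHandler name="productresults" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">edismax</str>
    <!-- "<" must be escaped as &lt; inside an XML config file -->
    <str name="mm">2&lt;-36%</str>
  </lst>
</requestHandler>
```

The same value can also be passed per request as the mm parameter, which is a convenient way to try different settings before changing the config.
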