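I'm trying to build a text search with Elasticsearch. This is my first time using it, so I may be misunderstanding a number of concepts.
Searching works fine when I type a complete word that exists in any indexed field. What I want, though, is to get samsung products when I type sam, for example. I'm using a tokenizer that breaks terms into prefixes such as s, sa, sam, sams and so on.
Note: I'm using mongoosastic to work with the Elasticsearch server. Here is the product model, which I call Item: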
    var ItemSchema = new mongoose.Schema({
        title: {type: String, es_indexed: true, es_analyzer: 'edge_nGram_analyzer'},
        price: Number,
        description: {type: String, es_indexed: true},
        picture: String,
        vendor: {type: String, es_indexed: true},
        vendorId: {type: String, es_indexed: true}
    });
Here is the rest of the model code, where I try to use an analyzer and a tokenizer:
    ItemSchema.plugin(mongoosastic, {
        hosts: [
            'localhost:9200'
        ]
    });

    var Item = mongoose.model('Item', ItemSchema);

    Item.createMapping({
        "analysis": {
            "filter": {
                "edgeNGram_filter": {
                    "type": "edgeNGram",
                    "min_gram": 2,
                    "max_gram": 20,
                    "side": "front"
                }
            },
            "analyzer": {
                "edge_nGram_analyzer": {
                    "type": "custom",
                    "tokenizer": "edge_ngram_tokenizer",
                    "filter": [
                        "lowercase",
                        "asciifolding",
                        "edgeNGram_filter"
                    ]
                },
                "whitespace_analyzer": {
                    "type": "custom",
                    "tokenizer": "whitespace",
                    "filter": [
                        "lowercase",
                        "asciifolding"
                    ]
                }
            },
            "tokenizer": {
                "edge_ngram_tokenizer": {
                    "type": "edgeNGram",
                    "min_gram": "2",
                    "max_gram": "5",
                    "token_chars": [ "letter", "digit" ]
                }
            }
        }
    }, function(err, mapping) {
        // do neat things here
        if (err) {
            console.log(err);
            return; // no mapping to print when the call failed
        }
        console.log(mapping);
    });

    module.exports = Item;
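I tested with an Item (product) whose title is cupcake. If I type cup in the search box I get nothing, but if I type the full keyword I get the object.
I also don't want the vendor ID and the description to be analyzed. I tried this: vendorId: {type:String, index: 'not_analyzed'}, but then the field stopped being indexed for search.
The code for the search endpoint: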
    app.post('/api/search', function(req, res, next) {
        Item.search({
            query_string: {
                query: req.body.keyword
            }
        }, {hydrate: true}, function(err, results) {
            // pass search failures to the Express error handler
            if (err) {
                return next(err);
            }
            // results here
            res.send(results);
        });
    });
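You need to specify the analyzer to use for your title field. Right now you are only indexing each field so that it can be searched, but you are not applying edge_nGram_analyzer to the title field. You can do that with the mongoosastic es_analyzer property, like this: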
    var ItemSchema = new mongoose.Schema({
        title: {type: String, es_indexed: true, es_analyzer: 'edge_nGram_analyzer'},
        price: Number,
        description: {type: String, es_indexed: true},
        picture: String,
        vendor: {type: String, es_indexed: true},
        vendorId: {type: String, es_indexed: true}
    });
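To see what applying the analyzer buys you, here is a minimal sketch in plain JavaScript of the prefixes an edge n-gram tokenizer would emit for a single word. The helper name edgeNGrams is hypothetical and only emulates the idea; it is not Elasticsearch's or mongoosastic's implementation, and it ignores the additional token filters. It uses min_gram 2 and max_gram 5, the values from the tokenizer settings in the question.

```javascript
// Hypothetical helper: emulates the prefixes an edge n-gram tokenizer
// produces for one word. Not part of Elasticsearch or mongoosastic.
function edgeNGrams(word, minGram, maxGram) {
  const grams = [];
  // A prefix can never be longer than the word itself.
  const upper = Math.min(maxGram, word.length);
  for (let len = minGram; len <= upper; len++) {
    grams.push(word.slice(0, len));
  }
  return grams;
}

const grams = edgeNGrams('cupcake', 2, 5);
console.log(grams); // [ 'cu', 'cup', 'cupc', 'cupca' ]
```

Since cup is among the prefixes, a search for cup should be able to match a title containing cupcake once the analyzer is actually applied to the field.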
There is another problem in your code: edge_nGram_analyzer is not specified correctly. You should remove the content part and make it look like this:
    "analyzer": {
        "edge_nGram_analyzer": {
            "type": "custom",
            "tokenizer": "edge_ngram_tokenizer",
            "filter": [
                "lowercase",
                "asciifolding",
                "edgeNGram_filter"
            ]
        },
        ...
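Once the title field is indexed with edge n-grams, one way to query it is a match query restricted to that field. The sketch below is an assumption on my part, not something the fix above requires: buildTitleQuery is a hypothetical helper, and using whitespace_analyzer at search time assumes you want the typed keyword lowercased but not split into n-grams itself.

```javascript
// Hypothetical helper: builds an Elasticsearch match query on the title field.
// Assumption: whitespace_analyzer (defined in the question's settings) is used
// at search time so the keyword is matched against the indexed prefixes as-is.
function buildTitleQuery(keyword) {
  return {
    match: {
      title: {
        query: keyword,
        analyzer: 'whitespace_analyzer'
      }
    }
  };
}

console.log(JSON.stringify(buildTitleQuery('cup')));

// Possible use in the search endpoint from the question (not run here):
// Item.search(buildTitleQuery(req.body.keyword), {hydrate: true}, callback);
```

Whether this behaves better than query_string for you depends on your Elasticsearch version and mapping, so treat it as a starting point to experiment with.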