ElasticSearch / NEST MatchPhrasePrefix stopped working after a version upgrade



I upgraded from:

  • ElasticSearch 2.0 to 6.6.1
  • Elasticsearch.Net NuGet package 2.4.6 to 6.5.1
  • NEST NuGet package 2.4.6 to 6.5.1

and my NEST query that performs a MatchPhrasePrefix stopped returning results.

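The software is a search engine for web pages. One of its features should let you restrict results to URLs that start with a given path, for example http://example.com/blog to see only blog posts in the search results.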

I have a mainQuery that works fine. If the user supplies a urlStartsWith value, mainQuery is ANDed with a bool/MatchPhrasePrefix query.

The indexes contain anywhere from 100 to 1000 documents.

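Things I have tried that did not work:

  • Rebuilding a brand-new index from scratch
  • Removing .Operator(Operator.And), since it does not exist in this version of NEST (it causes a compile error)
  • Increasing "max expansions" to various values, up to 5000
  • URL-encoding urlStartsWith
  • Removing .MinimumShouldMatch(1)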

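If I run the query built with the new NEST library against an old ElasticSearch 2.0 server, it works. So I believe the change in behavior is in ElasticSearch itself rather than in NEST.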
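One way to check that suspicion is to capture the exact JSON that NEST sends, so the same body can be replayed against both the 2.0 and the 6.6.1 server. The following is a minimal sketch, not part of the original project: it assumes the server address and index name from the repro, and the trimmed-down Item class and the RequestDump namespace are hypothetical stand-ins. DisableDirectStreaming() makes NEST buffer the request and response bodies so that DebugInformation can include them.

```csharp
using System;
using Nest;

namespace RequestDump
{
    // Hypothetical trimmed-down document type: only the fields this query touches.
    public class Item
    {
        public string link { get; set; }
        public string title { get; set; }
    }

    class Program
    {
        static void Main()
        {
            // Assumed address and index name, matching the repro.
            var settings = new ConnectionSettings(new Uri("http://localhost:9200"))
                .DefaultIndex("stack_overflow_api")
                .DisableDirectStreaming() // keep request/response bytes for debugging
                .PrettyJson();            // ask for indented JSON
            var client = new ElasticClient(settings);

            var urlStartsWith = "https://stackoverflow.com/questions/";

            // Same shape as the failing query: a match ANDed with bool/filter/match_phrase_prefix.
            var result = client.Search<Item>(s => s
                .Query(q => q.Match(m => m.Field(p => p.title).Query("perl"))
                    && q.Bool(b => b
                        .Filter(f => f.MatchPhrasePrefix(pre => pre
                            .Field(p => p.link)
                            .Query(urlStartsWith)))
                        .MinimumShouldMatch(1))));

            // Prints the call details, including the serialized request body.
            Console.WriteLine(result.DebugInformation);
        }
    }
}
```

The printed request body can then be sent unchanged to each server (for example with curl or Kibana) to see whether the two versions really answer the same JSON differently.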

The query

var urlStartWithFilter = esQuery.Bool(b =>
    b.Filter(m =>
        m.MatchPhrasePrefix(pre =>
            pre
                //.MaxExpansions(5000) //did nothing
                //.Operator(Operator.And) //does not exist in new version of NEST
                .Query(urlStartsWith)
                .Field(f => f.Url))
    )
    .MinimumShouldMatch(1)
);
mainQuery = mainQuery && urlStartWithFilter;

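As requested: a start-to-finish example

Here is an example that shows the problem and is very close to how I query the web page indexes in my actual project.

Get an ElasticSearch 6.6.1 instance running. You can do this in Docker with: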

docker pull docker.elastic.co/elasticsearch/elasticsearch:6.6.1
docker network create esnetwork --driver=bridge
docker run -p 9200:9200 -p 9300:9300 -e "discovery.type=single-node" --name elasticsearch -d --network esnetwork docker.elastic.co/elasticsearch/elasticsearch:6.6.1

Create a new .NET Framework 4.6.1 console app. Paste the following into Program.cs:

using Nest;
using System;
using System.Collections.Generic;

namespace Loader
{
    class Program
    {
        const string ELASTIC_SERVER = "http://localhost:9200";
        const string DEFAULT_INDEX = "stack_overflow_api";
        private static Uri es_node = new Uri(ELASTIC_SERVER);
        private static ConnectionSettings settings = new ConnectionSettings(es_node).DefaultIndex(DEFAULT_INDEX);
        private static ElasticClient client = new ElasticClient(settings);
        private static bool include_starts_with = true;

        static void Main(string[] args)
        {
            WriteMainMenu();
        }

        static void WriteMainMenu()
        {
            //Console.Clear();
            Console.WriteLine("");
            Console.WriteLine("");
            Console.WriteLine("What to do?");
            Console.WriteLine("1 - Load Sample Data into ES");
            Console.WriteLine("2 - Run a query WITHOUT StartsWith");
            Console.WriteLine("3 - Run a query WITH StartsWith");
            Console.WriteLine("[Enter] to exit.");
            Console.WriteLine("");
            Console.WriteLine("");
            var option = Console.ReadLine();
            if (option == "1")
            {
                LoadSampleData();
            }
            else if (option == "2")
            {
                include_starts_with = false;
                RunStartsWithQuery();
            }
            else if (option == "3")
            {
                include_starts_with = true;
                RunStartsWithQuery();
            }
            //- exit
        }

        private static void LoadSampleData()
        {
            var existsResponse = client.IndexExists(DEFAULT_INDEX);
            if (existsResponse.Exists) //delete existing mapping (and data)
            {
                client.DeleteIndex(DEFAULT_INDEX);
            }
            var rebuildResponse = client.CreateIndex(DEFAULT_INDEX, c => c.Settings(s => s.NumberOfReplicas(1).NumberOfShards(5)));
            var response2 = client.Map<Item>(m => m.AutoMap());
            var data = GetSearchResultData();
            Console.WriteLine($"Indexing {data.Count} items...");
            var response = client.IndexMany<Item>(data);
            client.Refresh(DEFAULT_INDEX);
            WriteMainMenu();
        }

        private static List<Item> GetSearchResultData()
        {
            var jsonPath = System.IO.Path.Combine(Environment.CurrentDirectory, "StackOverflowSampleJson.json");
            var jsondata = System.IO.File.ReadAllText(jsonPath);
            var searchResult = Newtonsoft.Json.JsonConvert.DeserializeObject<List<Item>>(jsondata);
            return searchResult;
        }

        private static void RunStartsWithQuery()
        {
            Console.WriteLine("Enter a search query and press enter, or just press enter to search for the default of 'Perl'.");
            var search = Console.ReadLine().ToLower();
            if (string.IsNullOrWhiteSpace(search))
            {
                search = "Perl";
            }
            Console.WriteLine($"Searching for {search}...");
            var result = client.Search<Item>(s => s
                .Query(esQuery => {

                    var titleQuery = esQuery.Match(m => m
                        .Field(p => p.title)
                        .Boost(1)
                        .Query(search)
                    );
                    var closedReasonQuery = esQuery.Match(m => m
                        .Field(p => p.closed_reason)
                        .Boost(1)
                        .Query(search)
                    );
                    // search across a couple fields
                    var mainQuery = titleQuery || closedReasonQuery;
                    if (include_starts_with)
                    {
                        var urlStartsWith = "https://stackoverflow.com/questions/";
                        var urlStartWithFilter = esQuery.Bool(b =>
                            b.Filter(m =>
                                m.MatchPhrasePrefix(pre =>
                                    pre
                                        //.MaxExpansions(5000) //did nothing
                                        //.Operator(Operator.And) //does not exist in new version of NEST
                                        .Query(urlStartsWith)
                                        .Field(f => f.link))
                            )
                            .MinimumShouldMatch(1)
                        );
                        mainQuery = mainQuery && urlStartWithFilter;
                    }
                    return mainQuery;
                })
            );
            if (result.IsValid == false)
            {
                Console.WriteLine("ES Query had an error");
            }
            else if (result.Hits.Count > 0)
            {
                Console.ForegroundColor = ConsoleColor.DarkGreen;
                Console.WriteLine($"Found {result.Hits.Count} results:");
                foreach (var item in result.Hits)
                {
                    Console.WriteLine($"    {item.Source.title}");
                }
                Console.ForegroundColor = ConsoleColor.White;
            }
            else
            {
                Console.ForegroundColor = ConsoleColor.DarkRed;
                Console.WriteLine($"Found 0 results");
                Console.ForegroundColor = ConsoleColor.White;
            }
            WriteMainMenu();
        }
    }

    public class Item
    {
        public List<string> tags { get; set; }
        //public Owner owner { get; set; }
        public bool is_answered { get; set; }
        public int view_count { get; set; }
        public int answer_count { get; set; }
        public int score { get; set; }
        public int last_activity_date { get; set; }
        public int creation_date { get; set; }
        public int last_edit_date { get; set; }
        public int question_id { get; set; }
        public string link { get; set; }
        public string title { get; set; }
        public int? accepted_answer_id { get; set; }
        public int? closed_date { get; set; }
        public string closed_reason { get; set; }
        public int? community_owned_date { get; set; }
    }
}

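  • Create a new file named StackOverflowSampleJson.json and paste in the contents of this sample JSON: https://pastebin.com/s5rcHysp
  • Set StackOverflowSampleJson.json to be copied to the build directory by right-clicking the file, choosing Properties, and changing Copy to Output Directory to Always
  • Run the app.
  • Choose "1 - Load Sample Data into ES" to populate the index
  • Choose "2 - Run a query WITHOUT StartsWith" to run a query without the StartsWith/MatchPhrasePrefix part and confirm that the plain query works
  • Choose "3 - Run a query WITH StartsWith" to see that including the extra query makes the result count drop to zero.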

OK, I really don't understand why the old query works on Elasticsearch 2.0 but not on Elasticsearch 6.6, but changing the inner query to the following makes it work on both ES2 and ES6:

if (include_starts_with)
{
    var urlStartsWith = "https://stackoverflow.com/questions/";
    var urlStartWithFilter = esQuery.MatchPhrasePrefix(pre => pre
        .Query(urlStartsWith)
        .Field(f => f.link)
    );
    mainQuery = mainQuery && urlStartWithFilter;
}
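
For comparison, a different technique for "URL starts with" filtering is a prefix query on an un-analyzed keyword sub-field, which sidesteps how the URL is tokenized. This is only a sketch of an alternative, not the fix described above, and it is untested against the sample data. It assumes the index was mapped with AutoMap() as in the repro, which in NEST 6.x maps string properties as text with a "keyword" sub-field (with ignore_above set to 256, so longer URLs would not be matched). The trimmed-down Item class and the PrefixSketch namespace are hypothetical.

```csharp
using System;
using Nest;

namespace PrefixSketch
{
    // Hypothetical trimmed-down document type: only the fields this query touches.
    public class Item
    {
        public string link { get; set; }
        public string title { get; set; }
    }

    class Program
    {
        static void Main()
        {
            // Assumed address and index name, matching the repro.
            var settings = new ConnectionSettings(new Uri("http://localhost:9200"))
                .DefaultIndex("stack_overflow_api");
            var client = new ElasticClient(settings);

            var urlStartsWith = "https://stackoverflow.com/questions/";

            var result = client.Search<Item>(s => s
                .Query(q => q.Match(m => m.Field(p => p.title).Query("perl"))
                    // Prefix query against link.keyword, the un-analyzed sub-field
                    // that AutoMap() is assumed to have created.
                    && q.Prefix(p => p
                        .Field(f => f.link.Suffix("keyword"))
                        .Value(urlStartsWith))));

            Console.WriteLine($"Found {result.Hits.Count} results");
            foreach (var hit in result.Hits)
            {
                Console.WriteLine($"    {hit.Source.title}");
            }
        }
    }
}
```

Because the keyword sub-field stores the whole URL as a single term, the prefix is compared character by character against the original string, so the value should not be lower-cased or URL-encoded before it is passed in.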
