Elasticsearch pattern_syntax_exception: Illegal repetition near index

I'm trying to create a pattern analyzer using the .NET client (NEST version 7.17.4 with Elastic 8.4), and I get the following exception.

Elasticsearch.Net.ElasticsearchClientException: 'Request failed to execute. Call: Status code 400 from: PUT /temp-index-for-integration-tests?pretty=true&error_trace=true. ServerError: Type: pattern_syntax_exception Reason: "Illegal repetition near index 80
([^\\p{L}\\d]+)|(?<=\\D)(?=\\d)|(?<=\\d)(?=\\D)|(?<=[\\p{L}&&[^\\p{Lu}]])(?=\\p{Lu})|(?<=\\p{Lu})(?=\\p{Lu}[\\p{L}&&[^\\p{Lu}]])
                                                                                ^"'

This regex seems to work fine when the index is created from the Dev Console, so I'm not sure what I'm missing.

The following works:

PUT test-index-3
{
  "settings": {
    "analysis": {
      "analyzer": {
        "camel": {
          "type": "pattern",
          "pattern": "([^\\p{L}\\d]+)|(?<=\\D)(?=\\d)|(?<=\\d)(?=\\D)|(?<=[\\p{L}&&[^\\p{Lu}]])(?=\\p{Lu})|(?<=\\p{Lu})(?=\\p{Lu}[\\p{L}&&[^\\p{Lu}]])"
        }
      }
    }
  }
}

This does not work and throws the exception on Indices.CreateAsync:

public async Task IndexCreateAsync(IEnumerable<IFieldDefinition> fieldDefinitions, string indexName)
{
    Dictionary<PropertyName, IProperty> indexFields = fieldDefinitions
        .Select(f => _elasticsearchMapper.Map(f))
        .ToDictionary(p => p.Name, p => p);

    PutMappingRequest mappings = new PutMappingRequest(indexName)
    {
        Properties = new Properties(indexFields)
    };

    var patternAnalyzer = new PatternAnalyzer
    {
        Pattern = @"([^\\p{L}\\d]+)|(?<=\\D)(?=\\d)|(?<=\\d)(?=\\D)|(?<=[\\p{L}&&[^\\p{Lu}]])(?=\\p{Lu})|(?<=\\p{Lu})(?=\\p{Lu}[\\p{L}&&[^\\p{Lu}]])",
        Lowercase = true
    };

    IndexState indexSettings = new IndexState
    {
        Mappings = mappings,
        Settings = new IndexSettings
        {
            Analysis = new Analysis
            {
                Analyzers = new Analyzers
                {
                    {
                        ElasticsearchConstants.TextAnalysis.CustomAnalyzers.CustomPatternRegexCasesAndSpecialChars,
                        patternAnalyzer
                    }
                }
            }
        }
    };

    await _elasticClient.Indices.CreateAsync(indexName, s => s.InitializeUsing(indexSettings));
}

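OK, found the solution. I guess the pattern is serialized differently when it is sent from the Dev Console in Elastic Cloud than when it is sent by the .NET client.

Basically, what I needed to do was replace every doubled backslash (\\) with a single backslash (\).

So once the regex in the .NET code was changed to

var patternAnalyzer = new PatternAnalyzer
{
    Pattern = @"([^\p{L}\d]+)|(?<=\D)(?=\d)|(?<=\d)(?=\D)|(?<=[\p{L}&&[^\p{Lu}]])(?=\p{Lu})|(?<=\p{Lu})(?=\p{Lu}[\p{L}&&[^\p{Lu}]])"
};

everything started working!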
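
A likely explanation, offered as a sketch rather than a confirmed diagnosis: Elasticsearch compiles the analyzer pattern with Java's java.util.regex. In the Dev Console the pattern sits inside a JSON string, where \\ means one backslash. A C# verbatim string (@"...") keeps every backslash literally, so a doubled backslash there reaches the server as two backslashes. The regex engine then reads \\ as an escaped literal backslash, p as a plain character, and {Lu} as a malformed repetition count. The Java snippet below reproduces both cases locally. The class and method names are illustrative and not part of NEST or Elasticsearch, and the sample input fooBar42 is made up.

```java
import java.util.Arrays;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Illustrative names only; none of this is part of NEST or Elasticsearch.
class PatternBackslashDemo {
    // In Java source, "\\" is ONE backslash, so this is the regex with single backslashes,
    // i.e. what the server should receive.
    static final String SINGLE =
        "([^\\p{L}\\d]+)|(?<=\\D)(?=\\d)|(?<=\\d)(?=\\D)"
        + "|(?<=[\\p{L}&&[^\\p{Lu}]])(?=\\p{Lu})"
        + "|(?<=\\p{Lu})(?=\\p{Lu}[\\p{L}&&[^\\p{Lu}]])";

    // The same regex with every backslash doubled, i.e. what arrives when the
    // C# verbatim string already contains "\\".
    static final String DOUBLED = SINGLE.replace("\\", "\\\\");

    // Returns "ok" if the regex compiles, otherwise the engine's error description.
    static String check(String regex) {
        try {
            Pattern.compile(regex);
            return "ok";
        } catch (PatternSyntaxException e) {
            return e.getDescription();
        }
    }

    public static void main(String[] args) {
        System.out.println("single:  " + check(SINGLE));
        System.out.println("doubled: " + check(DOUBLED));
        // Split a sample identifier on case and digit boundaries.
        System.out.println(Arrays.toString(Pattern.compile(SINGLE).split("fooBar42")));
    }
}
```

The index reported in the exception can differ between JDK versions, so the sketch compares only the error description.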
