I am trying to create a pattern analyzer using the .NET client (NEST version 7.17.4 against Elastic 8.4), and I get the following exception.
Elasticsearch.Net.ElasticsearchClientException: 'Request failed to execute. Call: Status code 400 from: PUT /temp-index-for-integration-tests?pretty=true&error_trace=true. ServerError: Type: pattern_syntax_exception Reason: "Illegal repetition near index 80
([^\\p{L}\\d]+)|(?<=\\D)(?=\\d)|(?<=\\d)(?=\\D)|(?<=[\\p{L}&&[^\\p{Lu}]])(?=\\p{Lu})|(?<=\\p{Lu})(?=\\p{Lu}[\\p{L}&&[^\\p{Lu}]])
^"'
The same regex seems to work fine when the index is created from the Dev Console, so I am not sure what I am missing.
The following works:
PUT test-index-3
{
  "settings": {
    "analysis": {
      "analyzer": {
        "camel": {
          "type": "pattern",
          "pattern": "([^\\p{L}\\d]+)|(?<=\\D)(?=\\d)|(?<=\\d)(?=\\D)|(?<=[\\p{L}&&[^\\p{Lu}]])(?=\\p{Lu})|(?<=\\p{Lu})(?=\\p{Lu}[\\p{L}&&[^\\p{Lu}]])"
        }
      }
    }
  }
}
This does not work; it throws the exception on Indices.CreateAsync:
public async Task IndexCreateAsync(IEnumerable<IFieldDefinition> fieldDefinitions, string indexName)
{
    Dictionary<PropertyName, IProperty> indexFields = fieldDefinitions
        .Select(f => _elasticsearchMapper.Map(f))
        .ToDictionary(p => p.Name, p => p);
    PutMappingRequest mappings = new PutMappingRequest(indexName)
    {
        Properties = new Properties(indexFields)
    };
    var patternAnalyzer = new PatternAnalyzer
    {
        Pattern = @"([^\\p{L}\\d]+)|(?<=\\D)(?=\\d)|(?<=\\d)(?=\\D)|(?<=[\\p{L}&&[^\\p{Lu}]])(?=\\p{Lu})|(?<=\\p{Lu})(?=\\p{Lu}[\\p{L}&&[^\\p{Lu}]])",
        Lowercase = true
    };
    IndexState indexSettings = new IndexState
    {
        Mappings = mappings,
        Settings = new IndexSettings
        {
            Analysis = new Analysis
            {
                Analyzers = new Analyzers
                {
                    {
                        ElasticsearchConstants.TextAnalysis.CustomAnalyzers.CustomPatternRegexCasesAndSpecialChars,
                        patternAnalyzer
                    }
                }
            }
        }
    };
    await _elasticClient.Indices.CreateAsync(indexName, s => s.InitializeUsing(indexSettings));
}
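OK, found the solution. I guess the serialization differs between the Developer Console in Elastic Cloud and the .NET client.
Basically, what I needed to do was replace every
\\
sequence with a single backslash (\).
So once the regex in the .NET code was changed to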
var patternAnalyzer = new PatternAnalyzer
{
    Pattern = @"([^\p{L}\d]+)|(?<=\D)(?=\d)|(?<=\d)(?=\D)|(?<=[\p{L}&&[^\p{Lu}]])(?=\p{Lu})|(?<=\p{Lu})(?=\p{Lu}[\p{L}&&[^\p{Lu}]])"
};
everything started to work!
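
For anyone who wants to see the escaping problem in isolation, here is a minimal sketch. It is written in Java rather than C# because the pattern analyzer uses Java regular expressions (the `&&` class intersection is Java syntax), so the server-side failure can be reproduced directly with `java.util.regex`. The likely cause is that the Dev Console takes raw JSON, where each backslash has to be written as `\\`, whereas the .NET client JSON-escapes the string itself, so a C# verbatim string should already contain single backslashes. The class and method names (`PatternEscapingDemo`, `compiles`, `tokens`) are made up for illustration and are not part of NEST or Elasticsearch, and `tokens` only approximates what the analyzer does (split on the pattern, then lowercase).

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;
import java.util.stream.Collectors;

class PatternEscapingDemo {
    // The regex as the server should receive it: one backslash per escape.
    // (A Java string literal needs "\\" to produce that single backslash.)
    static final String SINGLE =
        "([^\\p{L}\\d]+)|(?<=\\D)(?=\\d)|(?<=\\d)(?=\\D)"
        + "|(?<=[\\p{L}&&[^\\p{Lu}]])(?=\\p{Lu})"
        + "|(?<=\\p{Lu})(?=\\p{Lu}[\\p{L}&&[^\\p{Lu}]])";

    // Simulates the broken case: every backslash is doubled by the time
    // the pattern reaches the regex engine.
    static final String DOUBLED = SINGLE.replace("\\", "\\\\");

    // Returns true if the regex engine accepts the pattern.
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    // Rough stand-in for the analyzer: split on the pattern, drop empty
    // pieces, lowercase the rest.
    static List<String> tokens(String text) {
        return Arrays.stream(Pattern.compile(SINGLE).split(text))
            .filter(t -> !t.isEmpty())
            .map(t -> t.toLowerCase(Locale.ROOT))
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println("single backslashes compile: " + compiles(SINGLE));
        System.out.println("doubled backslashes compile: " + compiles(DOUBLED));
        System.out.println(tokens("MooseX::FTPClass2_beta"));
    }
}
```

With the doubled version, `\\p{Lu}` outside a character class is read as a literal backslash, a literal `p`, and then `{Lu}` as a malformed repetition count, which is where the "Illegal repetition" error comes from. With single backslashes the sample input splits into moose, x, ftp, class, 2, beta.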