Inserting a new document with the Python Elasticsearch client raises illegal_argument_exception



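I have an Elasticsearch service set up on AWS with an existing index, and I am trying to add more documents to it. I want to use the Python Elasticsearch client to interact with this service. I can connect to the service and query it as expected. However, when I add a new document to Elasticsearch, I get the following error: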

RequestError: RequestError(400, 'illegal_argument_exception', 'mapper [city] cannot be changed from type [keyword] to [text]')

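Do I need to specify the mapping somehow for each document I add to Elasticsearch? I have searched the documentation but have not seen any examples of this. I want to keep the city field mapped as keyword, but I don't know how to specify that when uploading a new document.

Here is my current process: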

    import datetime

    from elasticsearch import Elasticsearch, RequestsHttpConnection
    from requests_aws4auth import AWS4Auth

    # create auth for AWS Signature Version 4
    awsauth = AWS4Auth(access_key, secret_key, "us-east-2", "es")
    # instantiate the Elasticsearch client
    es = Elasticsearch(
        hosts=[{'host': host, 'port': 443}],
        http_auth=awsauth,
        use_ssl=True,
        verify_certs=True,
        connection_class=RequestsHttpConnection
    )
    # create a document to upload (a list holding a single document)
    data = [{'ad_id': 1053674,
             'city': 'Houston',
             'category': 'Cars',
             'date_posted': datetime.datetime(2021, 1, 29, 19, 33),
             'title': '2020 Chevrolet Silverado',
             'body': "This brand new vehicle is the perfect truck for you.",
             'phone': None}]
    # add document to index
    res = es.index(index='ads', doc_type="doc", id=data[0]['ad_id'], body=data[0])
    print(res['result'])

    RequestError: RequestError(400, 'illegal_argument_exception', 'mapper [city] cannot be changed from type [keyword] to [text]')

Note: here is the output of es.info():

{'name': '123456789', 'cluster_name': '123456789:ads', 'cluster_uuid': '123456789', 'version': {'number': '7.9.1', 'build_flavor': 'oss', 'build_type': 'tar', 'build_hash': 'unknown', 'build_date': '2020-11-03T09:54:32.349659Z', 'build_snapshot': False, 'lucene_version': '8.6.2', 'minimum_wire_compatibility_version': '6.8.0', 'minimum_index_compatibility_version': '6.0.0-beta1'}, 'tagline': 'You Know, for Search'}

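This error is thrown when the documents you ingest have changed in some way: Elasticsearch auto-generated a mapping for your index from earlier documents, and you are now trying to ingest documents that do not conform to that previously defined structure (the mapping).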

To see the current mapping, run:

    current_mapping = es.indices.get_mapping(index='ads')
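
For a quick look at just the field types, the response can be flattened into a field-to-type dictionary. The following is a minimal sketch, assuming the 7.x response shape of {index: {'mappings': {'properties': {...}}}}; the helper name field_types and the sample response are hypothetical and only stand in for the real output of get_mapping.

```python
def field_types(mapping_response, index):
    """Return {field_name: type} for the top-level fields of an index mapping."""
    properties = mapping_response[index]["mappings"].get("properties", {})
    # Object fields carry no explicit "type", so fall back to "object".
    return {name: spec.get("type", "object") for name, spec in properties.items()}


# Hypothetical sample standing in for es.indices.get_mapping(index='ads')
sample_response = {
    "ads": {
        "mappings": {
            "properties": {
                "ad_id": {"type": "long"},
                "city": {"type": "keyword"},
                "date_posted": {"type": "date"},
            }
        }
    }
}

types = field_types(sample_response, "ads")
print(types["city"])
```

If city shows up here as keyword while the error says it is being changed to text, the existing mapping and the incoming request disagree about that field.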

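Now, to actually solve the original problem, drop the index and specify the mapping explicitly, so that you have full control over the structure of your ES index: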

    import datetime

    # `es` is the client instantiated in the question
    # create a document to upload
    data = [{
        'ad_id': 1053674,
        'city': 'Houston',
        'category': 'Cars',
        'date_posted': datetime.datetime(2021, 1, 29, 19, 33),
        'title': '2020 Chevrolet Silverado',
        'body': "This brand new vehicle is the perfect truck for you.",
        'phone': None
    }]
    mapping = '''
    {
      "mappings" : {
        "properties" : {
          "ad_id" : {
            "type" : "long"
          },
          "body" : {
            "type" : "text",
            "fields" : {
              "keyword" : {
                "type" : "keyword",
                "ignore_above" : 256
              }
            }
          },
          "category" : {
            "type" : "text",
            "fields" : {
              "keyword" : {
                "type" : "keyword",
                "ignore_above" : 256
              }
            }
          },
          "city" : {
            "type" : "text",
            "fields" : {
              "keyword" : {
                "type" : "keyword",
                "ignore_above" : 256
              }
            }
          },
          "date_posted" : {
            "type" : "date"
          },
          "title" : {
            "type" : "text",
            "fields" : {
              "keyword" : {
                "type" : "keyword",
                "ignore_above" : 256
              }
            }
          }
        }
      }
    }
    '''
    # drop the index: uncomment to delete the existing index and ALL of its documents.
    # If the index still exists, the create call below does nothing, because
    # ignore=400 swallows the "already exists" error.
    # es.indices.delete(index='ads', ignore=[400, 404])
    # create the index w/ the mapping
    es.indices.create(index='ads', ignore=400, body=mapping)
    # add document to index
    res = es.index(index='ads', doc_type="_doc", id=data[0]['ad_id'], body=data[0])
    print(res['result'])

FYI: if you intend to map city as keyword, only exact matches will be possible at query time.
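
To keep city as keyword, as the question asks, the city entry can be declared as a plain keyword field, and exact-match lookups can then use a term query. This is a minimal sketch under that assumption; the names keyword_city_mapping and exact_city_query are hypothetical, and the calls that need a live cluster are left as comments.

```python
# Mapping that declares city as a plain keyword field. Fields omitted here
# would be mapped dynamically by Elasticsearch when first indexed.
keyword_city_mapping = {
    "mappings": {
        "properties": {
            "ad_id": {"type": "long"},
            "city": {"type": "keyword"},
            "date_posted": {"type": "date"},
        }
    }
}


def exact_city_query(city):
    """Build a term query, which matches only documents whose city equals `city` exactly."""
    return {"query": {"term": {"city": city}}}


query = exact_city_query("Houston")
print(query)

# With a live cluster (not run here):
# es.indices.create(index='ads', body=keyword_city_mapping)
# res = es.search(index='ads', body=query)
```

Because keyword values are not analyzed, a term query for 'houston' in lowercase would not match a document whose city is 'Houston'.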
