Storing a geo_shape in Elasticsearch from Pig using ES-Hadoop



I am trying to store geo shapes (like the ones shown below) into ES from Pig, using org.elasticsearch.hadoop.pig.EsStorage (2.2.0):

{
    "location" : {
        "type" : "circle",
        "coordinates" : [-45.0, 45.0],
        "radius" : "100m"
    }
}

or:

{
    "location" : {
        "type" : "polygon",
        "orientation" : "clockwise",
        "coordinates" : [
            [ [-177.0, 10.0], [176.0, 15.0], [172.0, 0.0], [176.0, -15.0], [-177.0, -10.0], [-177.0, 10.0] ],
            [ [178.2, 8.2], [-178.8, 8.2], [-180.8, -8.8], [178.2, 8.8] ]
        ]
    }
}

We tried the following:

REGISTER ./elasticsearch-hadoop-2.2.0.jar;
loadedRecords = LOAD 'inputFile.csv' USING PigStorage('|') AS (type:chararray,coordinates:bag{(float,float)},radius:chararray);
elasticData = foreach loadedRecords GENERATE (type ,{(45.0f,46.0f)},radius) AS geoArea:tuple(type:chararray,coordinates:bag{(float,float)},radius:chararray);
DESCRIBE elasticData ;
DUMP elasticData;
STORE elasticData INTO 'myindex/mytype' USING org.elasticsearch.hadoop.pig.EsStorage('es.http.retries=10','es.nodes=localhost','es.index.auto.create=true','es.mapping.pig.tuple.use.field.names=false');

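This failed with an error while the coordinates were being parsed: the parser encountered a non-numeric value. (The type was resolved as CIRCLE.)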

We also tried another approach, which failed as well:

REGISTER ./elasticsearch-hadoop-2.2.0.jar;
loadedRecords = LOAD 'inputFile.csv' USING PigStorage('|') AS (type:chararray,coordinates:chararray,radius:chararray);
--elasticData = foreach loadedRecords GENERATE (type ,{(45.0f,46.0f)} ,radius) AS geo:tuple(type:chararray,coordinates:bag{(float,float)},radius:chararray;
elasticData = foreach loadedRecords GENERATE TOMAP('type','circle','coordinates','[40.0f,46.0f]','radius','150m') AS geo:map[chararray];
DESCRIBE elasticData ;
DUMP elasticData;
STORE elasticData INTO 'myindex/mytype' USING org.elasticsearch.hadoop.pig.EsStorage('es.http.retries=10','es.nodes=host','es.index.auto.create=true','es.mapping.pig.tuple.use.field.names=false');

This produced:

Caused by: com.fasterxml.jackson.core.JsonParseException: Current token (END_OBJECT) not numeric, can not use numeric value accessors
 at [Source: org.elasticsearch.common.io.stream.InputStreamStreamInput@20063f76; line: 1, column: 83]
    at com.fasterxml.jackson.core.JsonParser._constructError(JsonParser.java:1581)
    at com.fasterxml.jackson.core.base.ParserMinimalBase._reportError(ParserMinimalBase.java:533)
    at com.fasterxml.jackson.core.base.ParserBase._parseNumericValue(ParserBase.java:799)
    at com.fasterxml.jackson.core.base.ParserBase.getDoubleValue(ParserBase.java:713)
    at org.elasticsearch.common.xcontent.json.JsonXContentParser.doDoubleValue(JsonXContentParser.java:180)
    at org.elasticsearch.common.xcontent.support.AbstractXContentParser.doubleValue(AbstractXContentParser.java:184)
    at org.elasticsearch.common.xcontent.support.AbstractXContentParser.doubleValue(AbstractXContentParser.java:174)
    at org.elasticsearch.common.geo.builders.ShapeBuilder.parseCoordinates(ShapeBuilder.java:248)
    at org.elasticsearch.common.geo.builders.ShapeBuilder.access$100(ShapeBuilder.java:46)
    at org.elasticsearch.common.geo.builders.ShapeBuilder$GeoShapeType.parse(ShapeBuilder.java:744)
    at org.elasticsearch.common.geo.builders.ShapeBuilder.parse(ShapeBuilder.java:291)

Has anyone stored geo shapes into ES using Pig who could help us?

Thanks!

Can you show the mapping for this index? I had a similar problem with coordinates in Pig a while ago. What I did was:

  1. In the ES schema, I defined location as:

    "location": {   
          "type": "geo_point"   
    }
    
  2. Generated the location as a tuple (lon, lat).
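
The two steps above might look roughly like this in Pig. This is only a sketch under some assumptions: the input file `points.csv` and the field names `name`, `lon` and `lat` are made up for illustration, the index is assumed to already exist with `location` mapped as `geo_point`, and it relies on ES-Hadoop writing a Pig tuple as a JSON array when `es.mapping.pig.tuple.use.field.names` is false. It covers the geo_point case only, not geo_shape.

```pig
REGISTER ./elasticsearch-hadoop-2.2.0.jar;

-- Hypothetical input: one point per line, e.g. "somewhere|-45.0|45.0"
points = LOAD 'points.csv' USING PigStorage('|')
         AS (name:chararray, lon:double, lat:double);

-- Build the location as a (lon, lat) tuple. With field names disabled,
-- the tuple is expected to reach ES as [lon, lat], which is the array
-- form that a geo_point field accepts (longitude first).
docs = FOREACH points GENERATE name, TOTUPLE(lon, lat) AS location;

-- Assumes 'myindex/mytype' already maps "location" as geo_point;
-- otherwise the array would likely be indexed as plain numbers.
STORE docs INTO 'myindex/mytype'
      USING org.elasticsearch.hadoop.pig.EsStorage(
          'es.nodes=localhost',
          'es.mapping.pig.tuple.use.field.names=false');
```

Keeping `lon` and `lat` typed as `double` at load time means the values are numeric when they are serialized, rather than strings such as `'[40.0f,46.0f]'`.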

Hope this helps.
