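I'm moving our ingestion from Tranquility to the druid-kinesis-indexing-service. However, when I connect to the data, it shows the correct JSON lines sandwiched between unparseable characters, e.g.: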
0�{"message"{"ex_json_key" 1}�0,�0{"message"{"ex_json_key" 2}�0
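This means the parser cannot parse the lines correctly. I tried changing many of the input settings in the supervisor spec, but none of them seemed to make a difference. This does not appear to be a problem in Tranquility when reading from the same Kinesis stream. Does anyone know what the problem is here and/or how to fix it?
Thanks!
An abbreviated version of our supervisor spec is here: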
> {
>   "type": "kinesis",
>   "spec": {
>     "dataSchema": {
>       "dataSource": "new_source_kinesis",
>       "metricsSpec": [
>       ],
>       "granularitySpec": {
>         "segmentGranularity": "hour",
>         "queryGranularity": "minute",
>         "rollup": true,
>         "type": "uniform"
>       },
>       "dimensionsSpec": {
>         "dimensions": [
>           "coln"
>         ]
>       },
>       "timestampSpec": {
>         "column": "timecol",
>         "format": "auto"
>       }
>     },
>     "ioConfig": {
>       "stream": "stream_name",
>       "inputFormat": {
>         "type": "json",
>         "flattenSpec": {
>           "useFieldDiscovery": true,
>           "fields": [
>             {
>               "type": "path",
>               "name": "coln",
>               "expr": "$.message.n"
>             }
>           ]
>         }
>       },
>       "endpoint": "kinesis.us-east-1.amazonaws.com",
>       "taskCount": 2
>     },
>     "tuningConfig": {
>       "type": "kinesis",
>       "reportParseExceptions": true,
>       "logParseExceptions": true,
>       "intermediatePersistPeriod": "PT10M",
>       "maxRowsInMemory": 75000
>     }
>   }
> }
We were able to solve this by following the deaggregation section of the documentation: https://druid.apache.org/docs/latest/development/extensions-core/kinesis-ingestion.html
Our steps were:
- Set "deaggregate": true in the ioConfig section of the supervisor spec
- Add amazon-kinesis-client 1.9.2 to the druid-kinesis-indexing-service extensions folder on the middle managers/coordinator
sudo wget https://repo1.maven.org/maven2/com/amazonaws/amazon-kinesis-client/1.9.2/amazon-kinesis-client-1.9.2.jar -P /druid-0.18.1/extensions/druid-kinesis-indexing-service/
- Remove the existing amazon-kinesis-client 1.13 from druid-0.18.1/lib
sudo rm amazon-kinesis-client-1.13.3.jar
(Without this step, we got the error Caused by: java.util.concurrent.ExecutionException: java.lang.NoClassDefFoundError: com/amazonaws/services/kinesis/model/Record)
- Restart the middle manager/coordinator
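
For reference, the first step would look roughly like this in the ioConfig from the question. This is only a sketch: it reuses the placeholder stream name and endpoint from the abbreviated spec above, leaves out the flattenSpec for brevity, and assumes everything else in the spec stays unchanged.

```json
{
  "ioConfig": {
    "stream": "stream_name",
    "inputFormat": {
      "type": "json"
    },
    "endpoint": "kinesis.us-east-1.amazonaws.com",
    "taskCount": 2,
    "deaggregate": true
  }
}
```

The only new key is "deaggregate"; the jar changes in the other steps were still needed in our setup for it to work.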