解析错误与简单的CSV导入Clickhouse



我有一个非常简单的CSV,我试图导入到clickhouse没有成功。create table语句为:

CREATE TABLE staging.EloLBK
(
`Month` DateTime64(3),
`1958` Int32,
`1959` Int32,
`1960` Int32
)
ENGINE = MergeTree
PRIMARY KEY Month
ORDER BY Month
SETTINGS index_granularity = 8192

CSV数据如下所示:

"Month", "1958", "1959", "1960"
"JAN",  340,  360,  417
"FEB",  318,  342,  391
"MAR",  362,  406,  419
"APR",  348,  396,  461
"MAY",  363,  420,  472
"JUN",  435,  472,  535
"JUL",  491,  548,  622
"AUG",  505,  559,  606
"SEP",  404,  463,  508
"OCT",  359,  407,  461
"NOV",  310,  362,  390
"DEC",  337,  405,  432

我的import语句是:

INSERT INTO EloLBK SELECT * FROM file('EloLBK/*.csv', 'CSVWithNames', '"Month" datetime64, "1958" integer, "1959" integer, "1960" integer')

从clickhouse返回的错误是:

Code: 27. DB::Exception: Cannot parse input: expected '"' before: '417n"FEB",  318,  342,  391n"MAR",  362,  406,  419n"APR",  348,  396,  461n"MAY",  363,  420,  472n"JUN",  435,  472,  535n"JUL",  491,  548,  622n"AUG",  505,':
Row 1:
Column 0,   name: Month, type: DateTime64(3), parsed text: "<DOUBLE QUOTE>JAN<DOUBLE QUOTE>,  340,  360,  "ERROR
Code: 27. DB::ParsingException: Cannot parse input: expected '"' before: '417n"FEB",  318,  342,  391n"MAR",  362,  406,  419n"APR",  348,  396,  461n"MAY",  363,  420,  472n"JUN",  435,  472,  535n"JUL",  491,  548,  622n"AUG",  505,'. (CANNOT_PARSE_INPUT_ASSERTION_FAILED) (version 21.12.1.8928 (official build))
: While executing CSVRowInputFormat: While executing File. (CANNOT_PARSE_INPUT_ASSERTION_FAILED)

我不知道如何解决这个问题,所以任何建议都会很感激!

您可以通过传递选项date_time_input_format='best_effort'来告诉ClickHouse进行最佳努力猜测,例如:

INSERT INTO EloLBK SELECT * FROM file('EloLBK/*.csv', 'CSVWithNames', '"Month" datetime64, "1958" integer, "1959" integer, "1960" integer') settings date_time_input_format='best_effort';

将导致:

┌───────────────────Month─┬─1958─┬─1959─┬─1960─┐
│ 2000-01-01 00:00:00.000 │  340 │  360 │  417 │
│ 2000-02-01 00:00:00.000 │  318 │  342 │  391 │
│ 2000-03-01 00:00:00.000 │  362 │  406 │  419 │
│ 2000-04-01 00:00:00.000 │  348 │  396 │  461 │
│ 2000-05-01 00:00:00.000 │  363 │  420 │  472 │
│ 2000-06-01 00:00:00.000 │  435 │  472 │  535 │
│ 2000-07-01 00:00:00.000 │  491 │  548 │  622 │
│ 2000-08-01 00:00:00.000 │  505 │  559 │  606 │
│ 2000-09-01 00:00:00.000 │  404 │  463 │  508 │
│ 2000-10-01 00:00:00.000 │  359 │  407 │  461 │
│ 2000-11-01 00:00:00.000 │  310 │  362 │  390 │
│ 2000-12-01 00:00:00.000 │  337 │  405 │  432 │
└─────────────────────────┴──────┴──────┴──────┘

好了,经过一些混乱之后,错误消息似乎只是有点误导。实际的问题是clickhouse(可以理解)无法将月份解析为日期时间。

下面的CSV输入可以正常工作:

"Month","1958","1959","1960"
"1970-01-01T00:00:00","340","360","417"
"1970-01-02T00:00:00","318","342","391"

相关内容

  • 没有找到相关文章

最新更新