AWS Athena 表创建失败,"输入'创建外部'时没有可行的替代方案"



这是我第一次尝试使用雅典娜,请温柔一点:(

此查询没有失败并出现错误->没有可行的替代

CREATE EXTERNAL TABLE IF NOT EXISTS dev.mytokendata (
'dateandtime' timestamp, 'requestid' string, 
'ip' string, 'caller' string, 'token' string, 'requesttime' int, 
'httpmethod' string, 'resourcepath' string, 'status' smallint, 
'protocol' string, 'responselength' int ) 
ROW FORMAT SERDE 'com.amazonaws.glue.serde.GrokSerDe'
WITH SERDEPROPERTIES ( 
'input.format'='%{TIMESTAMP_ISO8601:dateandtime}
s+{"requestId":s+"%{USERNAME:requestid}",
s+"ip":s+"%{IP:ip}",
s+"caller":s+"%{USERNAME:caller}",
s+"token":s+"%{USERNAME:token}",
s+"requestTime":s+"%{INT:requesttime}",
s+"httpMethod":s+"%{WORD:httpmethod}",
s+"resourcePath":s+"%{UNIXPATH:resourcepath}",
s+"status":s+"%{INT:status}",
s+"protocol":s+"%{UNIXPATH:protocol}",
s+"responseLength:"s+"%{INT:responselength}"s+}' ) 
STORED AS INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat' 
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' 
LOCATION 's3://athena-abc/1234/' 
TBLPROPERTIES ('has_encrypted_data'='false', 'compressionType'='gzip');

这是我试图解析的日志文件(.gz文件(中的一行

2018-07-30T02:23:34.134Z { "requestId":
"810000-9100-1100-a100-f100000", "ip": "01.01.01.001", "caller": "-", 
"token": "1234-5678-78910-abcd", "requestTime": 
"1002917414000", "httpMethod": "POST", "resourcePath": 
"/MyApp/v1.0/MyService.wsdl", 
"status": "200", "protocol": "HTTP/1.1", "responseLength": "1000" }

有人能指出可能出了什么问题吗?这将对有很大帮助

您在列名中不必要地使用了撇号,而且转义符也有问题。

正确的版本:

CREATE EXTERNAL TABLE mytokendata (
`dateandtime` timestamp,
`requestid` string, 
`ip` string,
`caller` string, 
`token` string,
`requesttime` int,  
`httpmethod` string,
`resourcepath` string,
`status` smallint,  
`protocol` string,
`responselength` int ) 
ROW FORMAT SERDE 'com.amazonaws.glue.serde.GrokSerDe'
WITH SERDEPROPERTIES ( 
'input.format'='%{TIMESTAMP_ISO8601:dateandtime}
\s+\{"requestId":\s+"%{USERNAME:requestid}",
\s+"ip":\s+"%{IP:ip}",
\s+"caller":\s+"%{USERNAME:caller}",
\s+"token":\s+"%{USERNAME:token}",
\s+"requestTime":\s+"%{INT:requesttime}",
\s+"httpMethod":\s+"%{WORD:httpmethod}",
\s+"resourcePath":\s+"%{UNIXPATH:resourcepath}",
\s+"status":\s+"%{INT:status}",
\s+"protocol":\s+"%{UNIXPATH:protocol}",
\s+"responseLength:"\s+"%{INT:responselength}"\s+\}' ) 
STORED AS INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat' 
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' 
LOCATION 's3://athena-abc/1234/'  
TBLPROPERTIES ('has_encrypted_data'='false', 'compressionType'='gzip');

最新更新