COPY command for json.gz files from S3 to Redshift

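We have stored the results of a DDB (DynamoDB) export in S3 as json.gz files, and we want to load this data into Redshift using the COPY command. We don't want to copy directly from DDB to Redshift, because a direct copy usually requires a scan operation. That consumes read capacity, which we want to avoid because these tables are quite large. I could not find much information on how to use the COPY command with json.gz files. If anyone knows a way to do this, please let me know. As suggested in one of the comments, I tried treating the file as JSON: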
copy itemtable from 's3://bucket/path/file.json.gz' iam_role '<role>' json 'auto ignorecase'
It did not work. When decompressed, my file has the following format:
{"Item":{"field":{"S":"value"},"field":{"N":"value"}}}\n{"Item":{"field":{"S":"value"},"field":{"N":"value"}}}\n
The exact error is:
error is Load into table 'itemtable' failed. Check 'stl_load_errors' system table for details

Something like the following worked for me.

copy <table name>
from '<input s3 location>'
iam_role '<iam role>'
json '<jsonpath file location>' gzip ACCEPTINVCHARS ' ' TRUNCATECOLUMNS TRIMBLANKS
region '<aws region>'

Here the jsonpath.json file has the following format:

{
"jsonpaths": [
"$['Item']['Field1']['S']",
"$['Item']['Field2']['N']",
.
.
.
]
}

The table has the same columns as the fields specified in the jsonpaths file.
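If the items have many attributes, writing the jsonpaths file by hand gets tedious. Below is a minimal Python sketch of one way to generate it from a sample record. It is not part of what I actually ran: the helper names `build_jsonpaths` and `first_record` are made up for illustration, and it assumes every attribute is wrapped in exactly one DynamoDB type descriptor (such as "S" or "N"), as in the file format shown in the question.

```python
import gzip
import json


def build_jsonpaths(record):
    """Build a JSONPaths document from one DynamoDB-style record.

    Assumes the record looks like {"Item": {"<attr>": {"<type>": value}}}
    and that each attribute holds exactly one type descriptor.
    """
    paths = []
    for attr, typed_value in record["Item"].items():
        (type_code,) = typed_value.keys()  # e.g. "S" or "N"
        paths.append(f"$['Item']['{attr}']['{type_code}']")
    return {"jsonpaths": paths}


def first_record(gz_path):
    """Return the first JSON object from a newline-delimited .json.gz file."""
    with gzip.open(gz_path, "rt", encoding="utf-8") as handle:
        for line in handle:
            if line.strip():
                return json.loads(line)
    raise ValueError("no records found in " + gz_path)


# Hypothetical sample record; in practice use first_record("<local copy>.json.gz").
sample = {"Item": {"Field1": {"S": "abc"}, "Field2": {"N": "42"}}}
print(json.dumps(build_jsonpaths(sample), indent=2))
```

COPY matches the jsonpaths entries to the table columns by position, so check that the generated order lines up with the column order of the table (or with an explicit column list) before uploading the file to S3.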

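As John Rotenstein suggested in the comments, the COPY command can handle gzip-compressed input, so we don't need to worry about decompressing the files ourselves.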
