输入"来自"不匹配。期望:",", <expression>



我有一个在AWS athena上运行的查询,它应该返回第二个表中未包含的所有文件名。我基本上试图找到所有不在ejpos着陆表中的文件名。

一个表看起来像这样(项目销售):

<表类> origin_file run_id tbody><<tr>/datarite/ejpos/8023/20220706/filename18035/datarite/ejpos/8023/20220706/filename28035/datarite/ejpos/8023/20220706/filename38035

根据示例数据和声明的目标,您的查询中有很多错误的东西。

  • trim("/datarite/ejpos/8023/20220706/" from "validated"."datarite_ejpos_itemsale" where run_id = '8035') as origin_file不是一个有效的sql。
  • ON "landing"."ejpos_landing_files".filename = "validated"."datarite_ejpos_itemsale".origin_file不能工作,因为有origin_file前缀。如果origin_file中只有一个filename实例,则可以使用strpos
  • 您的连接和过滤条件被构建为查找datarite_ejpos_itemsale中存在而ejpos_landing_files中缺失的项目,而您声明反之亦然。
  • 注释中提到的额外逗号

试试下一个:

-- sample data
WITH item_sales(origin_file, run_id) AS (
VALUES ('/datarite/ejpos/8023/20220706/filename1', 8035),
('/datarite/ejpos/8023/20220706/filename2', 8035),
('/datarite/ejpos/8023/20220706/filename3', 8035),
('/datarite/ejpos/8023/20220706/filename4', 8036)
),
ejpos_files_landing(filename) as(
VALUES ('filename1'),
('filename2'),
('filename3'),
('filename4')
)
-- query
select filename
from ejpos_files_landing l
left outer join item_sales s -- reverse the join
on strpos(s.origin_file, l.filename) >= 1 -- assuming that filename should be present only one time in the string
and s.run_id = 8035 -- if you need to filter out run id
where s.origin_file is null

输出:

tbody> <<tr>
filename
filename4

相关内容

最新更新