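I am rewriting a Redshift SQL query in Spark SQL. Since LISTAGG() is not supported in Spark SQL, is there an equivalent function or a workaround that achieves the same result?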
Redshift SQL:
SELECT
dp_info_id,
dp_type,
CASE
WHEN COALESCE(type,'-1') = 'Primary Name'
THEN LISTAGG(DISTINCT fir_name,'|') WITHIN GROUP (ORDER BY dp_info_id)
ELSE NULL
END AS primary_first_name
FROM
dp_info c
GROUP BY
dp_info_id,
type,
dp_type
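To get an array of all the values in a group, I guess you should use collect_set (https://docs.databricks.com/sql/language-manual/functions/collect_list.html). collect_set drops duplicates, which matches the DISTINCT in your query, whereas collect_list keeps them. Either one returns an array, so you still need to join the elements with '|' to get a single string.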
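As a rough sketch of how that could look in Spark SQL, the query below swaps LISTAGG(DISTINCT ...) for collect_set wrapped in array_join. It assumes Spark 2.4 or later (where array_join is available) and that the first GROUP BY column is meant to be dp_info_id. The sort_array call is my own addition: collect_set does not guarantee element order, and the original ORDER BY dp_info_id is constant within each group, so sorting by name is only one way to make the output deterministic.

```sql
-- Sketch only: table and column names are taken from the question.
SELECT
  dp_info_id,
  dp_type,
  CASE
    WHEN COALESCE(type, '-1') = 'Primary Name'
      -- collect_set: distinct non-NULL values of the group as an array
      -- sort_array:  assumed ordering, added for a stable result
      -- array_join:  concatenates the array elements with '|'
      THEN array_join(sort_array(collect_set(fir_name)), '|')
    ELSE NULL
  END AS primary_first_name
FROM
  dp_info c
GROUP BY
  dp_info_id,
  type,
  dp_type
```

concat_ws('|', collect_set(fir_name)) should work as well. One difference to watch for: when every fir_name in a group is NULL, collect_set yields an empty array and the join produces an empty string, whereas LISTAGG would return NULL, so you may want to wrap the expression in NULLIF(..., '') if that matters.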