DBT-对数据透视使用动态数组

我想使用DBT来透视BigQuery表中的列。

由于我有100多个值，我希望我的枢轴列是动态的，我希望这样：

select *
from (  select ceiu.value, ceiu.user_id, ce.name as name
from company_entity_item_user ceiu
left join company_entity ce on ce.id = ceiu.company_entity_id)
PIVOT(STRING_AGG(value) FOR name IN (select distinct name from company_entity))

这里的问题是我不能在IN中使用SELECT语句。我知道我可以将Jinja模板与DBT一起使用，它可能看起来像这样：

...
PIVOT(STRING_AGG(value) FOR name IN ('{{unique_company_entities}}'))
...

但我不知道如何使用SELECT语句来创建这样的变量。此外，由于我使用的是BigQuery，我尝试过使用DECLARE和SET，但如果可能的话，我不知道如何在DBT中使用它们。

感谢的帮助

将@aleix cc的注释提升为一个答案，因为这是在dbt中执行此操作的最佳方式。

要将数据从数据库中获取到jinja上下文中，可以使用dbt内置的run_query宏。或者，如果您不介意使用dbt-utils包，您可以使用该包中的get_column_values宏，它将返回该列中不同值的列表(它还使用{% if execute %}语句保护对run_query的调用，这对于防止dbt编译错误至关重要(。

假设company_entity已经是一个dbt模型，则您的模型变为：

{% set company_list = dbt_utils.get_column_values(table=ref('company_entity'), column='name') %}
# company_list is a jinja list of strings. We need a comma-separated
# list of single-quoted string literals
{% set company_csv = "'" ~ company_list | join("', '") ~ "'" %}
select *
from (  select ceiu.value, ceiu.user_id, ce.name as name
from company_entity_item_user ceiu
left join company_entity ce on ce.id = ceiu.company_entity_id)
PIVOT(STRING_AGG(value) FOR name IN ({{ company_csv }})

相关内容

最新更新

热门标签：