DBT只运行模型一次



我已经创建了一个模型来生成一个日历维度,我只想在明确指定要运行它时运行它。

我尝试在is_incremental()块中使用增量物化,如果没有查询来满足临时视图,希望dbt什么也不做。可惜这行不通。

任何建议或想法,我将如何实现这一点非常感谢。

问候,艾希礼

我使用了一个标记。我们把这种东西叫做静态的模型。在你的模型中:

{{ config(tags=['static']) }}

,然后在生产工作中:

dbt run --exclude tag:static

这并不能完全达到您想要的效果,因为您必须在命令行中添加选择器。但是它很简单,而且自文档化,这很好。

我认为你应该能够破解增量物化来做到这一点。DBT会抱怨模型为空,但是您应该能够返回一个没有记录的查询。这将取决于您的RDBMS,如果这真的比仅仅运行模型更好/更快/更便宜,因为dbt仍然会执行具有复杂合并逻辑的查询。

{{ config(materialized='incremental') }}
{% if is_incremental() %}
select * from {{ this }} limit 0
{% else %}
-- your model here, e.g.
{{ dbt_utils.date_spine( ... ) }}
{% endif %}

你的最后/最好的选择可能是创建一个自定义的物化来检查一个存在的关系,如果找到一个则不操作。您可以从增量实体化中借用大部分代码来完成此工作。(您可以将其作为宏添加到项目中)。还没有测试过这个,但是给你一个想法:

-- macros/static_materialization.sql
{% materialization static, default -%}
-- relations
{%- set existing_relation = load_cached_relation(this) -%}
{%- set target_relation = this.incorporate(type='table') -%}
{%- set temp_relation = make_temp_relation(target_relation)-%}
{%- set intermediate_relation = make_intermediate_relation(target_relation)-%}
{%- set backup_relation_type = 'table' if existing_relation is none else existing_relation.type -%}
{%- set backup_relation = make_backup_relation(target_relation, backup_relation_type) -%}
-- configs
{%- set unique_key = config.get('unique_key') -%}
{%- set full_refresh_mode = (should_full_refresh()  or existing_relation.is_view) -%}
{%- set on_schema_change = incremental_validate_on_schema_change(config.get('on_schema_change'), default='ignore') -%}
-- the temp_ and backup_ relations should not already exist in the database; get_relation
-- will return None in that case. Otherwise, we get a relation that we can drop
-- later, before we try to use this name for the current operation. This has to happen before
-- BEGIN, in a separate transaction
{%- set preexisting_intermediate_relation = load_cached_relation(intermediate_relation)-%}
{%- set preexisting_backup_relation = load_cached_relation(backup_relation) -%}
-- grab current tables grants config for comparision later on
{% set grant_config = config.get('grants') %}
{{ drop_relation_if_exists(preexisting_intermediate_relation) }}
{{ drop_relation_if_exists(preexisting_backup_relation) }}
{{ run_hooks(pre_hooks, inside_transaction=False) }}
-- `BEGIN` happens here:
{{ run_hooks(pre_hooks, inside_transaction=True) }}
{% set to_drop = [] %}
{% if existing_relation is none %}
{% set build_sql = get_create_table_as_sql(False, target_relation, sql) %}
{% elif full_refresh_mode %}
{% set build_sql = get_create_table_as_sql(False, intermediate_relation, sql) %}
{% set need_swap = true %}
{% else %}
{# ----- only changed the code between these comments ----- #}
{# NO-OP. An incremental materialization would do a merge here #}
{% set build_sql = "select 1" %}
{# ----- only changed the code between these comments ----- #}
{% endif %}
{% call statement("main") %}
{{ build_sql }}
{% endcall %}
{% if need_swap %}
{% do adapter.rename_relation(target_relation, backup_relation) %}
{% do adapter.rename_relation(intermediate_relation, target_relation) %}
{% do to_drop.append(backup_relation) %}
{% endif %}
{% set should_revoke = should_revoke(existing_relation, full_refresh_mode) %}
{% do apply_grants(target_relation, grant_config, should_revoke=should_revoke) %}
{% do persist_docs(target_relation, model) %}
{% if existing_relation is none or existing_relation.is_view or should_full_refresh() %}
{% do create_indexes(target_relation) %}
{% endif %}
{{ run_hooks(post_hooks, inside_transaction=True) }}
-- `COMMIT` happens here
{% do adapter.commit() %}
{% for rel in to_drop %}
{% do adapter.drop_relation(rel) %}
{% endfor %}
{{ run_hooks(post_hooks, inside_transaction=False) }}
{{ return({'relations': [target_relation]}) }}
{%- endmaterialization %}

我们正在为我们想要运行的每个模型使用dbt run --select MODEL_NAME。因此,在我们的环境中运行的dbt只执行一个模型。通过这样做,您永远不会在意外执行模型的情况下运行。

最新更新