Python Airflow operator



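We are currently using an Airflow Python operator to load Parquet files from GCS into BigQuery. I would like to declare every numeric column in the source as BIGNUMERIC. Is this possible?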

from airflow.providers.google.cloud.transfers.gcs_to_bigquery import GCSToBigQueryOperator

# Load a Parquet file from GCS into BigQuery and let BigQuery infer the schema.
bq_load = GCSToBigQueryOperator(
    task_id="gcs_to_bigquery_modified_airflow",
    bucket="{{ dag_run.conf['bucket'] }}",
    source_objects=["{{ dag_run.conf['name'] }}"],
    source_format="PARQUET",
    destination_project_dataset_table="{{ task_instance.xcom_pull(task_ids='get_destination') }}",
    create_disposition="CREATE_IF_NEEDED",
    write_disposition="WRITE_APPEND",
    autodetect=True,
)

You can define the schema manually with the schema_fields parameter of GCSToBigQueryOperator instead of relying on autodetect.

See the updated code below:

bq_load = GCSToBigQueryOperator(
    task_id="gcs_to_bigquery_modified_airflow",
    bucket="{{ dag_run.conf['bucket'] }}",
    source_objects=["{{ dag_run.conf['name'] }}"],
    source_format="PARQUET",
    destination_project_dataset_table="{{ task_instance.xcom_pull(task_ids='get_destination') }}",
    create_disposition="CREATE_IF_NEEDED",
    write_disposition="WRITE_APPEND",
    # Explicit schema: each listed column is declared as BIGNUMERIC.
    schema_fields=[
        {"name": "sample_col_1", "type": "BIGNUMERIC", "mode": "NULLABLE"},
        {"name": "sample_col_2", "type": "BIGNUMERIC", "mode": "NULLABLE"},
        {"name": "sample_col_3", "type": "BIGNUMERIC", "mode": "NULLABLE"},
    ],
)

See the GCSToBigQueryOperator documentation for more details.
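
If the table has many columns, writing each schema_fields entry by hand gets tedious. The following is a minimal sketch of one way to generate the list, assuming the column names and their source types are already known. The helper to_bignumeric_schema, the NUMERIC_TYPES set, and the example column list are hypothetical and are not part of Airflow or the BigQuery client.

```python
# Hypothetical helper, not part of Airflow or BigQuery.
# Assumption: these are the type names you treat as "numeric" in your source.
NUMERIC_TYPES = {"INTEGER", "INT64", "FLOAT", "FLOAT64", "NUMERIC", "BIGNUMERIC"}


def to_bignumeric_schema(columns):
    """Build a schema_fields list, declaring every numeric column as BIGNUMERIC.

    columns: iterable of (name, type) pairs, assumed to be known up front.
    Non-numeric columns keep their (upper-cased) type.
    """
    schema_fields = []
    for name, col_type in columns:
        normalized = col_type.upper()
        target = "BIGNUMERIC" if normalized in NUMERIC_TYPES else normalized
        schema_fields.append({"name": name, "type": target, "mode": "NULLABLE"})
    return schema_fields


# Example column list, made up for illustration.
schema_fields = to_bignumeric_schema(
    [("sample_col_1", "INTEGER"), ("sample_col_2", "FLOAT"), ("label", "STRING")]
)
```

The resulting list can then be passed as schema_fields=schema_fields to GCSToBigQueryOperator. Whether BigQuery accepts a given Parquet column as BIGNUMERIC still depends on how that column is stored in the file, so it is worth testing against a sample load first.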
