更新BQ表中5K行的高效语法



我正在尝试使用python客户端在bq中更新~5K行。

这是我当前的尝试:

update_bq(table_id=_ADS_TO_REMOVE_TABLE_NAME, set_clasue="status ='removed'",
where_in_clause=f'''[{[item['ad_id'] for item in 
current_ads_removed_json_chunk]}]''')

def update_bq(self, table_id, ad_ids, set_clasue, where_in_clause):
table_full_name = self.get_table_full_name(table_id)
query_text = f"""
UPDATE `{table_full_name}`
SET {set_clasue}
WHERE account_id IN {where_in_clause}
"""
query_job = self.client.query(query_text)

我如何将account id list映射到如下看起来更有效的字符串(?)

UPDATE mytable SET somefield=( CASE WHEN (id=100) THEN 'some value removed' WHEN (id=101) THEN 'some value removed' END ) WHERE id IN (100,101);

我试过:

f'''UPDATE mytable SET somefield=( CASE  WHEN id={['(id=100) THEN some value 'removed''.join( item['ad_id'] for item in current_ads_removed_json_chunk]}]+"WHERE id IN ("+ item['ad_id'] for item in current_ads_removed_json_chunk]);

加上bq自动时间戳在更新时工作(在插入时为我工作)。

只是删除所有行与这些id和重新插入新的状态更好吗?

我不知道这是否会更有效,但你可以尝试在python中创建整个sql脚本。脚本可以包含所有的值。值将作为inject_table传递,然后您可以使用单个查询更新所有行。

Doc: inject table and UPDATE FROM

with inject_table (
select 1 as col1, 2 as col2 UNION ALL
select 5 as col1, 3 as col2
)
update tt
set tt.val = it.col1
from target_table as tt
join inject_table as it on tt.col2 = it.col2

最新更新