Creating a BigQuery dataset from a Pub/Sub message



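I'm trying to create a dataset from a Pub/Sub subscription message. The article at https://cloud.google.com/pubsub/docs/bigquery explains how to write messages to a table, but how can I use a message to create a dataset? For example, if my subscription contains the message "this is test1", I should be able to receive the message (test1) and create a dataset based on it. It only needs to be an empty dataset, with no tables. I've done all the research I can and haven't found a solution from Google.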

You can use a Pub/Sub trigger to invoke a Cloud Function whenever a message is published to the topic:

https://cloud.google.com/functions/docs/tutorials/pubsub

gcloud functions deploy python-pubsub-function \
  --gen2 \
  --runtime=python310 \
  --region=REGION \
  --source=. \
  --entry-point=subscribe \
  --trigger-topic=YOUR_TOPIC_NAME

You can also check this link:

https://cloud.google.com/functions/docs/calling/pubsub

The Cloud Function can then create the dataset using the BigQuery Python client, for example:

import base64

import functions_framework
from google.cloud import bigquery


@functions_framework.cloud_event
def subscribe(cloud_event):
    # Pub/Sub delivers the message payload base64-encoded.
    dataset_id_from_topic = base64.b64decode(cloud_event.data["message"]["data"]).decode()

    # Construct a BigQuery client object.
    client = bigquery.Client()

    # Construct a full Dataset object to send to the API.
    # bigquery.Dataset expects a fully qualified "project.dataset" ID,
    # so the client's default project is prefixed here.
    dataset = bigquery.Dataset(f"{client.project}.{dataset_id_from_topic}")

    # TODO(developer): Specify the geographic location where the dataset should reside.
    dataset.location = "US"

    # Make an API request to create the dataset, with an explicit timeout.
    # Raises google.api_core.exceptions.Conflict if the dataset already
    # exists within the project.
    dataset = client.create_dataset(dataset, timeout=30)
    print("Created dataset {}.{}".format(client.project, dataset.dataset_id))
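
The function above treats the whole message body as the dataset ID. In the question the message is "this is test1" and only "test1" should become the dataset, so some parsing is needed first. Here is a minimal sketch of one way to do that. It assumes the dataset name is always the last whitespace-separated word of the message, and it applies a conservative check (letters, digits and underscores only). The helper name dataset_id_from_event_data is hypothetical and is not part of any Google library.

```python
import base64
import re

# Conservative check (an assumption for this sketch): letters, digits
# and underscores only, at most 1024 characters.
_DATASET_ID_RE = re.compile(r"[A-Za-z0-9_]{1,1024}")


def dataset_id_from_event_data(event_data):
    """Extract a dataset ID from a Pub/Sub CloudEvent payload (hypothetical helper).

    Assumes the dataset name is the last whitespace-separated word of the
    decoded message, e.g. "this is test1" yields "test1".
    """
    # The message payload arrives base64-encoded under message.data.
    raw = base64.b64decode(event_data["message"]["data"]).decode("utf-8")
    words = raw.split()
    if not words:
        raise ValueError("empty Pub/Sub message")
    candidate = words[-1]
    # Reject anything that does not look like a plain dataset ID.
    if not _DATASET_ID_RE.fullmatch(candidate):
        raise ValueError("not a valid dataset ID: {!r}".format(candidate))
    return candidate
```

Inside subscribe, the result of dataset_id_from_event_data(cloud_event.data) could replace the raw decoded string. Since Pub/Sub may deliver a message more than once, it may also be worth passing exists_ok=True to client.create_dataset so that a redelivery does not raise Conflict.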
