What I want to do:
- Run some BigQuery queries
- Output the results as JSON files
- Upload the JSON files to GCS
How I did it:
- Installed and initialized the Google Cloud SDK:
gcloud auth activate-service-account --key-file="gcp-credentials.json"
- Enabled the APIs:
gcloud services enable \
    bigquery.googleapis.com \
    cloudbuild.googleapis.com \
    cloudfunctions.googleapis.com \
    cloudscheduler.googleapis.com \
    pubsub.googleapis.com \
    serviceusage.googleapis.com \
    storage-component.googleapis.com
- Wrote the code:
src
|__data
|__queries
|  |__test_query_1.sql
|  |__test_query_2.sql
|  |__test_query_3.sql
|__scripts
   |__config.py
   |__log.txt
   |__main.py
   |__requirements.txt
requirements.txt:
google-cloud-bigquery
google-cloud-storage
config.py:
from pathlib import Path
src_dir = Path(__file__).absolute().parent
config_vars = {
    "data_dir": src_dir.parent / "data",
    "queries_dir": src_dir.parent / "queries",
    "bucket": "...",
}
main.py:
import ...
data_dir = config.config_vars["data_dir"]
queries_dir = config.config_vars["queries_dir"]
def main(data, context):
    ...

if __name__ == "__main__":
    main("data", "context")
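For context, here is a minimal sketch of how a Pub/Sub-triggered entry point with this two-argument signature might read its trigger message. It assumes the first argument is the event dictionary that Pub/Sub-triggered background functions receive, with the message body base64-encoded under the "data" key. The helper name decode_pubsub_message is hypothetical and is not part of the original main.py.

```python
import base64


def decode_pubsub_message(event):
    """Return the Pub/Sub message body as text, or "" if there is none."""
    # Assumption: `event` is a dict whose "data" value is base64-encoded.
    payload = event.get("data") if isinstance(event, dict) else None
    if not payload:
        return ""
    return base64.b64decode(payload).decode("utf-8")


def main(event, context):
    # Same two-argument shape as the entry point in the question.
    message = decode_pubsub_message(event)
    print(f"Triggered with message: {message!r}")
    return message
```

With this shape, the local call main("data", "context") still works, because a non-dict first argument is treated as an empty message.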
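So the main.py script takes all the queries in the queries folder, runs them, outputs the results as JSON, and uploads them to a bucket called "test-bucket-20201219". If the bucket doesn't exist, it creates it.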
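Since the body of main.py is elided, the following is only a rough sketch of the flow just described, not the actual implementation. The function names rows_to_json, run_and_upload and build_clients are hypothetical. It also assumes the JSON is uploaded directly from memory with upload_from_string rather than written to the data folder first, which may differ from what the original script does.

```python
import json
from pathlib import Path


def rows_to_json(rows):
    """Serialize an iterable of row mappings to a JSON array string."""
    # default=str is a simplification for values such as dates and decimals.
    return json.dumps([dict(row) for row in rows], default=str)


def run_and_upload(bq_client, storage_client, queries_dir, bucket_name):
    """Run every .sql file in queries_dir and upload one JSON object per query."""
    # Create the bucket only if it does not exist yet.
    bucket = storage_client.lookup_bucket(bucket_name)
    if bucket is None:
        bucket = storage_client.create_bucket(bucket_name)

    uploaded = []
    for sql_file in sorted(Path(queries_dir).glob("*.sql")):
        rows = bq_client.query(sql_file.read_text()).result()
        blob = bucket.blob(f"{sql_file.stem}.json")
        blob.upload_from_string(rows_to_json(rows), content_type="application/json")
        uploaded.append(blob.name)
    return uploaded


def build_clients():
    # Imported lazily so the helpers above can be exercised without the
    # google-cloud libraries; both clients use the ambient credentials.
    from google.cloud import bigquery, storage

    return bigquery.Client(), storage.Client()
```

Passing the clients in as arguments keeps the upload logic easy to exercise locally with stand-in objects before deploying.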
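The script runs fine locally, but when it is deployed to GCP and scheduled via Pub/Sub and Cloud Scheduler, it runs and creates the bucket but does not upload the files. I'm not sure what I'm doing wrong. Any help would be appreciated. I have tried everything I could think of, for example granting PROJECTID@appspot.gserviceaccount.com permission to add objects to the bucket.
Logging statements: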
2020-12-20 18:43:50,656 | INFO | Uploading test_query_2.json to test-bucket-20201219.
2020-12-20 18:43:50,962 | DEBUG | https://storage.googleapis.com:443 "POST /upload/storage/v1/b/test-bucket-20201219/o?uploadType=multipart HTTP/1.1" 200 776
2020-12-20 18:43:50,963 | INFO | Uploading test_query_3.json to test-bucket-20201219.
2020-12-20 18:43:51,238 | DEBUG | https://storage.googleapis.com:443 "POST /upload/storage/v1/b/test-bucket-20201219/o?uploadType=multipart HTTP/1.1" 200 776
2020-12-20 18:43:51,239 | INFO | Uploading test_query_1.json to test-bucket-20201219.
2020-12-20 18:43:51,466 | DEBUG | https://storage.googleapis.com:443 "POST /upload/storage/v1/b/test-bucket-20201219/o?uploadType=multipart HTTP/1.1" 200 775
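Each upload request in the log returns HTTP 200, which suggests the objects were accepted. One way to double-check is to list the bucket contents and compare them against the expected file names. This is a small sketch under that assumption; missing_objects is a hypothetical helper, and storage_client is expected to be a google-cloud-storage client (or anything exposing a compatible list_blobs method).

```python
def missing_objects(storage_client, bucket_name, expected_names):
    """Return the expected object names that are not present in the bucket."""
    present = {blob.name for blob in storage_client.list_blobs(bucket_name)}
    return sorted(set(expected_names) - present)
```

An empty list from this helper would mean every expected JSON file is in the bucket that was actually queried, which helps rule out looking at the wrong bucket or project.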
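Thanks for the help, everyone. Somehow, this was working all too well. I think I was missing a bucket that is autogenerated the first time you run/deploy the cloud function (staging.PROJECT_ID.appspot.com). Also, since I didn't want to store credentials with the function's repo, I deployed the function from gcloud using the --service-account flag, as PROJECT_ID@appspot.gserviceaccount.com. Tbh, I'm not entirely sure that what I did was correct, but it works for me.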