When I run the train.py script from https://github.com/tensorflow/models/tree/master/official/nlp,
I get a 403 permission error.
python3 official/nlp/train.py --tpu=con-bert1 --experiment=bert/pretraining --mode=train --model_dir=gs://con_bioberturk/general/ --config_file=gs://con_bioberturk/bert_base.yaml --config_file=gs://con_bioberturk/pretrain.yaml --params_override="task.init_checkpoint=gs://con_bioberturk/bert-base-turkish-cased-tf/model.ckpt"
My output is as follows:
I1115 07:49:02.847452 139877506112576 train_utils.py:368] Saving experiment configuration to gs://con_bioberturk/general/params.yaml
Traceback (most recent call last):
File "/usr/share/tpu/models/official/modeling/hyperparams/params_dict.py", line 349, in save_params_dict_to_yaml
yaml.dump(params.as_dict(), f, default_flow_style=False)
File "/usr/local/lib/python3.8/dist-packages/yaml/__init__.py", line 290, in dump
return dump_all([data], stream, Dumper=Dumper, **kwds)
File "/usr/local/lib/python3.8/dist-packages/yaml/__init__.py", line 278, in dump_all
dumper.represent(data)
File "/usr/local/lib/python3.8/dist-packages/yaml/representer.py", line 28, in represent
self.serialize(node)
File "/usr/local/lib/python3.8/dist-packages/yaml/serializer.py", line 55, in serialize
self.emit(DocumentEndEvent(explicit=self.use_explicit_end))
File "/usr/local/lib/python3.8/dist-packages/yaml/emitter.py", line 115, in emit
self.state()
File "/usr/local/lib/python3.8/dist-packages/yaml/emitter.py", line 220, in expect_document_end
self.flush_stream()
File "/usr/local/lib/python3.8/dist-packages/yaml/emitter.py", line 790, in flush_stream
self.stream.flush()
File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/lib/io/file_io.py", line 219, in flush
self._writable_file.flush()
tensorflow.python.framework.errors_impl.PermissionDeniedError: Error executing an HTTP request: HTTP response code 403 with body '{
"error": {
"code": 403,
"message": "Access denied.",
"errors": [
{
"message": "Access denied.",
"domain": "global",
"reason": "forbidden"
}
]
}
}
when initiating an upload to gs://con_bioberturk/general/params.yaml
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "official/nlp/train.py", line 82, in <module>
app.run(main)
File "/usr/local/lib/python3.8/dist-packages/absl/app.py", line 308, in run
_run_main(main, args)
File "/usr/local/lib/python3.8/dist-packages/absl/app.py", line 254, in _run_main
sys.exit(main(argv))
File "official/nlp/train.py", line 47, in main
train_utils.serialize_config(params, model_dir)
File "/usr/share/tpu/models/official/core/train_utils.py", line 370, in serialize_config
hyperparams.save_params_dict_to_yaml(params, params_save_path)
File "/usr/share/tpu/models/official/modeling/hyperparams/params_dict.py", line 349, in save_params_dict_to_yaml
yaml.dump(params.as_dict(), f, default_flow_style=False)
File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/lib/io/file_io.py", line 197, in __exit__
self.close()
File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/lib/io/file_io.py", line 239, in close
self._writable_file.close()
tensorflow.python.framework.errors_impl.PermissionDeniedError: Error executing an HTTP request: HTTP response code 403 with body '{
"error": {
"code": 403,
"message": "Access denied.",
"errors": [
{
"message": "Access denied.",
"domain": "global",
"reason": "forbidden"
}
]
}
}
'
Here is my setup:
- TPU VM name: con-bert1
- Software version: tpu-vm-tf-2.10.0-pod
- The Cloud Storage bucket (con_bioberturk) and the TPU VM are in the same location
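It looks like you need to grant the service account that is currently active on the TPU VM access to the bucket in GCS IAM. The traceback shows the 403 is raised when the script tries to write params.yaml to gs://con_bioberturk/general/, so the account is missing write permission on the bucket. Instructions are here: https://github.com/google-research/text-to-text-transfer-transformer/issues/1003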
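As a rough sketch of what that grant could look like, the Python helper below only builds the `gsutil iam ch` command for the bucket behind a gs:// path and prints it for you to review and run yourself. The function names (`active_service_account`, `bucket_of`, `grant_command`) are made up for illustration, the `objectAdmin` role is an assumption (pick whatever role your project's policy allows), and `SERVICE_ACCOUNT_EMAIL` is a placeholder. The metadata-server lookup only works when run on the TPU VM itself.

```python
import shlex
import urllib.request
from urllib.parse import urlparse

# Metadata endpoint that reports the default service account of a
# Compute Engine / TPU VM. Reachable only from inside the VM.
METADATA_EMAIL_URL = (
    "http://metadata.google.internal/computeMetadata/v1/"
    "instance/service-accounts/default/email"
)


def active_service_account(timeout=5):
    """Ask the metadata server which service account the VM runs as."""
    request = urllib.request.Request(
        METADATA_EMAIL_URL, headers={"Metadata-Flavor": "Google"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8").strip()


def bucket_of(gcs_path):
    """Return 'gs://<bucket>' for a path such as gs://bucket/dir/file."""
    parsed = urlparse(gcs_path)
    if parsed.scheme != "gs" or not parsed.netloc:
        raise ValueError(f"not a gs:// path: {gcs_path!r}")
    return f"gs://{parsed.netloc}"


def grant_command(service_account, gcs_path, role="objectAdmin"):
    """Build (but do not run) the gsutil command granting `role`.

    The default role is an assumption made for this sketch.
    """
    member = f"serviceAccount:{service_account}:{role}"
    return ["gsutil", "iam", "ch", member, bucket_of(gcs_path)]


if __name__ == "__main__":
    # Placeholder; on the TPU VM you could call active_service_account().
    email = "SERVICE_ACCOUNT_EMAIL"
    command = grant_command(email, "gs://con_bioberturk/general/")
    print(shlex.join(command))
```

The printed command has to be run by someone who is allowed to change the bucket's IAM policy, which is usually not the TPU VM's own service account.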
If that fails, try running gcloud auth login --update-adc on the TPU VM
to add credentials.
Hope this solves your problem.