我创建了一个带有硬编码输入的训练脚本。使用培训工作可以正常工作,但我无法使用本地模式。它在我的本地码头上打开一个集装箱,并以代码(1(退出
代码:
estimator = SKLearn(entry_point="train_model.py",
train_instance_type="local")
estimator.fit()
这里有一个例外:
2020-02-22 06:21:05,470 sagemaker-containers INFO Imported framework sagemaker_sklearn_container.training
2020-02-22 06:21:05,480 sagemaker-containers INFO No GPUs detected (normal if no gpus installed)
2020-02-22 06:21:05,504 sagemaker_sklearn_container.training INFO Invoking user training script.
2020-02-22 06:21:06,407 sagemaker-containers ERROR Reporting training FAILURE
2020-02-22 06:21:06,407 sagemaker-containers ERROR framework error:
Traceback (most recent call last):
File "/miniconda3/lib/python3.7/site-packages/sagemaker_containers/_trainer.py", line 81, in train
entrypoint()
File "/miniconda3/lib/python3.7/site-packages/sagemaker_sklearn_container/training.py", line 36, in main
train(framework.training_env())
File "/miniconda3/lib/python3.7/site-packages/sagemaker_sklearn_container/training.py", line 32, in train
training_environment.to_env_vars(), training_environment.module_name)
File "/miniconda3/lib/python3.7/site-packages/sagemaker_containers/_modules.py", line 301, in run_module
_files.download_and_extract(uri, _env.code_dir)
File "/miniconda3/lib/python3.7/site-packages/sagemaker_containers/_files.py", line 129, in download_and_extract
s3_download(uri, dst)
File "/miniconda3/lib/python3.7/site-packages/sagemaker_containers/_files.py", line 164, in s3_download
s3.Bucket(bucket).download_file(key, dst)
File "/miniconda3/lib/python3.7/site-packages/boto3/s3/inject.py", line 246, in bucket_download_file
ExtraArgs=ExtraArgs, Callback=Callback, Config=Config)
File "/miniconda3/lib/python3.7/site-packages/boto3/s3/inject.py", line 172, in download_file
extra_args=ExtraArgs, callback=Callback)
File "/miniconda3/lib/python3.7/site-packages/boto3/s3/transfer.py", line 307, in download_file
future.result()
File "/miniconda3/lib/python3.7/site-packages/s3transfer/futures.py", line 106, in result
return self._coordinator.result()
File "/miniconda3/lib/python3.7/site-packages/s3transfer/futures.py", line 265, in result
raise self._exception
File "/miniconda3/lib/python3.7/site-packages/s3transfer/tasks.py", line 255, in _main
self._submit(transfer_future=transfer_future, **kwargs)
File "/miniconda3/lib/python3.7/site-packages/s3transfer/download.py", line 345, in _submit
**transfer_future.meta.call_args.extra_args
File "/miniconda3/lib/python3.7/site-packages/botocore/client.py", line 357, in _api_call
return self._make_api_call(operation_name, kwargs)
File "/miniconda3/lib/python3.7/site-packages/botocore/client.py", line 661, in _make_api_call
raise error_class(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (403) when calling the HeadObject operation: Forbidden
An error occurred (403) when calling the HeadObject operation: Forbidden
tmpe_msr8pi_algo-1-kt1vh_1 exited with code 1
我发现docker重启解决了这个问题。过了一段时间,它又发生了——它又解决了这个问题。
我使用的是windows的docker,这个问题可能与创建的容器配置有关