AWS SageMaker Pipeline Model endpoint deployment fails



I want to deploy a SageMaker pipeline model that has 2 containers. I am referring to this link: https://sagemaker.readthedocs.io/en/stable/api/inference/pipeline.html
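
For context, the pipeline model is created and deployed roughly like this (a minimal sketch with the SageMaker Python SDK; the image URIs, model artifact paths, role and endpoint name below are placeholders, not my actual values):

import sagemaker
from sagemaker.model import Model
from sagemaker.pipeline import PipelineModel

role = "arn:aws:iam::123456789012:role/SageMakerExecutionRole"  # placeholder
session = sagemaker.Session()

# First container: image preprocessing
preprocess_model = Model(
    image_uri="<account>.dkr.ecr.<region>.amazonaws.com/preprocess:latest",  # placeholder
    model_data="s3://my-bucket/preprocess/model.tar.gz",                     # placeholder
    role=role,
    sagemaker_session=session,
)

# Second container: model inference
inference_model = Model(
    image_uri="<account>.dkr.ecr.<region>.amazonaws.com/inference:latest",   # placeholder
    model_data="s3://my-bucket/inference/model.tar.gz",                      # placeholder
    role=role,
    sagemaker_session=session,
)

# Chain the two containers into a single inference pipeline
pipeline_model = PipelineModel(
    name="sagemaker-inference-pipeline",
    role=role,
    models=[preprocess_model, inference_model],
    sagemaker_session=session,
)

pipeline_model.deploy(
    initial_instance_count=1,
    instance_type="ml.m5.xlarge",
    endpoint_name="sagemaker-inference-pipeline-endpoint",
)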

The first container contains the image-preprocessing code and the second container contains the model-inference code. I have updated the Dockerfiles of both containers to include the following line:

# Set a docker label to enable container to use SAGEMAKER_BIND_TO_PORT environment variable if present
LABEL com.amazonaws.sagemaker.capabilities.accept-bind-to-port=true

I tested both containers separately by deploying each one as a normal single-container endpoint. Both endpoints deployed and worked as expected. But when I try to deploy the pipeline model, the endpoint is not deployed and the following error is raised:

UnexpectedStatusException: Error hosting endpoint sagemaker-inference-pipeline-endpoint: Failed.
Reason:  The container-1,container-2 for production variant AllTraffic did not pass the ping health check. 
Please check CloudWatch logs for this endpoint..

I have checked the CloudWatch logs of both containers and they show no errors related to the health-check failure. Below are the CloudWatch logs of container 1 (the second container's logs look the same):

Starting the inference server with 2 workers.
[2022-11-20 14:50:44 +0000] [15] [INFO] Starting gunicorn 20.1.0
[2022-11-20 14:50:44 +0000] [15] [INFO] Listening at: unix:/tmp/gunicorn.sock (15)
[2022-11-20 14:50:44 +0000] [15] [INFO] Using worker: sync
[2022-11-20 14:50:44 +0000] [18] [INFO] Booting worker with pid: 18
[2022-11-20 14:50:44 +0000] [19] [INFO] Booting worker with pid: 19

Please note: for testing purposes, I have also updated my code so that it now does the following:

  • always returns the health check as True (status 200); see the sketch after this list
  • uses "text/plain" as the content type for every input and output
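
A minimal sketch of what the test handlers look like, assuming a Flask app served by gunicorn (the handler bodies here are simplified placeholders, not my actual preprocessing/inference code):

import flask

app = flask.Flask(__name__)

@app.route("/ping", methods=["GET"])
def ping():
    # For testing: always report the container as healthy
    return flask.Response(response="\n", status=200, mimetype="text/plain")

@app.route("/invocations", methods=["POST"])
def invocations():
    # Accept and return text/plain so that container-to-container
    # communication inside the pipeline cannot fail on content type
    data = flask.request.data.decode("utf-8")
    result = data  # placeholder for the real preprocessing / inference logic
    return flask.Response(response=result, status=200, mimetype="text/plain")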

Please guide me on what I have unknowingly missed or where I have made a mistake. Thanks in advance.

A summary of the things I have tried:

  1. Tested both containers separately by deploying each as its own endpoint. Both containers deployed successfully as endpoints.
  2. Read the documentation and learned that we need to tell Docker about the port binding, so I added the following line to each Dockerfile:
# Set a docker label to enable container to use SAGEMAKER_BIND_TO_PORT environment variable if present
LABEL com.amazonaws.sagemaker.capabilities.accept-bind-to-port=true
  3. Updated the code of both containers to always return the health check as passed (status: 200).
  4. Updated all input and output content types to "text/plain" (so that even the communication between the containers cannot raise a content-type exception).

UPDATE: I was able to solve this issue


The actual problem was that the endpoint could not ping the containers. When there are multiple containers, each container communicates over a dynamically assigned port, and the endpoint needs to know which port each container is listening on. So we need custom code that replaces the hard-coded port value [8080] in the nginx.conf file with the value of the ['SAGEMAKER_BIND_TO_PORT'] environment variable.

The code that does this comes from this SageMaker example: https://github.com/aws/amazon-sagemaker-examples/tree/main/contrib/inference_pipeline_custom_containers/containers

In the serve file, use the start_server() function below:

import os
import signal
import subprocess
from string import Template

# model_server_workers, model_server_timeout and sigterm_handler are defined
# elsewhere in the serve script

def start_server():
    print('Starting the inference server with {} workers.'.format(model_server_workers))

    # link the log streams to stdout/err so they will be logged to the container logs
    subprocess.check_call(['ln', '-sf', '/dev/stdout', '/var/log/nginx/access.log'])
    subprocess.check_call(['ln', '-sf', '/dev/stderr', '/var/log/nginx/error.log'])

    # Use the port SageMaker assigns to this container; fall back to 8080
    # when the variable is not set (e.g. a single-container model)
    port = os.environ.get("SAGEMAKER_BIND_TO_PORT", 8080)
    print("using port: ", port)

    # Render nginx.conf.template into nginx.conf with the assigned port
    with open("nginx.conf.template") as nginx_template:
        template = Template(nginx_template.read())
    nginx_conf = open("/opt/program/nginx.conf", "w")
    nginx_conf.write(template.substitute(port=port))
    nginx_conf.close()

    # Start nginx (reverse proxy) and gunicorn (application server)
    nginx = subprocess.Popen(['nginx', '-c', '/opt/program/nginx.conf'])
    gunicorn = subprocess.Popen(['gunicorn',
                                 '--timeout', str(model_server_timeout),
                                 '-k', 'sync',
                                 '-b', 'unix:/tmp/gunicorn.sock',
                                 '-w', str(model_server_workers),
                                 'wsgi:app'])

    signal.signal(signal.SIGTERM, lambda a, b: sigterm_handler(nginx.pid, gunicorn.pid))

    # If either subprocess exits, so do we.
    pids = set([nginx.pid, gunicorn.pid])
    while True:
        pid, _ = os.wait()
        if pid in pids:
            break

    sigterm_handler(nginx.pid, gunicorn.pid)
    print('Inference server exiting')

Use an nginx.conf.template instead of a fixed nginx.conf; the nginx.conf file with the correct port is generated from it:

worker_processes 1;
daemon off; # Prevent forking

pid /tmp/nginx.pid;
error_log /var/log/nginx/error.log;

events {
  # defaults
}

http {
  include /etc/nginx/mime.types;
  default_type application/octet-stream;
  access_log /var/log/nginx/access.log combined;

  upstream gunicorn {
    server unix:/tmp/gunicorn.sock;
  }

  server {
    listen $port deferred;
    client_max_body_size 5m;
    keepalive_timeout 5;
    proxy_read_timeout 1200s;

    location ~ ^/(ping|invocations) {
      proxy_set_header X-Forwarded-For $$proxy_add_x_forwarded_for;
      proxy_set_header Host $$http_host;
      proxy_redirect off;
      proxy_pass http://gunicorn;
    }

    location / {
      return 404 "{}";
    }
  }
}
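
Note the doubled $$ in the nginx variables above: string.Template uses $ as its substitution marker, so $$ renders as a literal $ in the generated config, while $port is replaced with the value from SAGEMAKER_BIND_TO_PORT. A quick check of what the substitution produces:

from string import Template

line = Template("listen $port deferred; proxy_set_header Host $$http_host;")
print(line.substitute(port="8081"))
# Output: listen 8081 deferred; proxy_set_header Host $http_host;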
