How to run inference with a Hugging Face Deep Learning Container in Lambda using the Serverless Framework



This is a question from an ML newbie :-)

I am building an AWS Step Function with the Serverless Framework, and one of the steps deploys a SageMaker endpoint using a Hugging Face Deep Learning Container (DLC).

The problem is that I cannot get Lambda to work with SageMaker (building the estimator).

One workaround is to launch the endpoint manually from SageMaker Studio, but I would really like to keep everything in code.

Here is how I tried to get SageMaker working:

def installPack(package):
    import subprocess
    import sys
    subprocess.check_call([sys.executable, "-m", "pip", "install", package])

installPack('sagemaker')

from sagemaker.huggingface import HuggingFaceModel
import sagemaker

role = sagemaker.get_execution_role()

# Hub model configuration. https://huggingface.co/models
hub = {
    'HF_MODEL_ID': 'distilbert-base-uncased-distilled-squad',  # model_id from hf.co/models
    'HF_TASK': 'question-answering'                            # NLP task you want to use for predictions
}

# create the Hugging Face Model class
huggingface_model = HuggingFaceModel(
    env=hub,
    role=role,                   # IAM role with permissions to create an endpoint
    transformers_version="4.6",  # transformers version used
    pytorch_version="1.7",       # pytorch version used
    py_version="py36",           # python version of the DLC
    ..........

The error I get is:

WARNING: The directory '/home/sbx_user1051/.cache/pip' or its parent directory is not owned or is not writable by the current user. The cache has been disabled. Check the permissions and owner of that directory. If executing pip with sudo, you should use sudo's -H flag.

(… then many more log lines follow, such as Collecting pyparsing>=2.0.2 and Downloading pyparsing-2.4.7-py2.py3-none-any.whl …)

Downloading pox-0.3.0-py2.py3-none-any.whl (30 kB)
Collecting multiprocess>=0.70.12
Downloading multiprocess-0.70.12.2-py38-none-any.whl (128 kB)
Using legacy 'setup.py install' for sagemaker, since package 'wheel' is not installed.
Using legacy 'setup.py install' for protobuf3-to-dict, since package 'wheel' is not installed.
Installing collected packages: dill, zipp, pytz, pyparsing, protobuf, ppft, pox, numpy, multiprocess, smdebug-rulesconfig, protobuf3-to-dict, pathos, pandas, packaging, importlib-metadata, google-pasta, attrs, sagemaker
ERROR: Could not install packages due to an OSError: [Errno 30] Read-only file system: '/home/sbx_user1051'
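
For context on the error: a Lambda function's filesystem is read-only except for /tmp, so a runtime pip install can only succeed if it is redirected there. Below is a minimal sketch of that variant of installPack; it assumes pip is available in the Lambda runtime (as the log above suggests) and that /tmp has room for the sagemaker package and its dependencies, neither of which is guaranteed.

import subprocess
import sys

def installPack(package, target="/tmp/pip"):
    # /tmp is the only writable path in the Lambda sandbox, so install there
    # and skip the (equally unwritable) pip cache
    subprocess.check_call([
        sys.executable, "-m", "pip", "install",
        "--no-cache-dir", "--target", target, package,
    ])
    # make the freshly installed package importable
    if target not in sys.path:
        sys.path.insert(0, target)

installPack('sagemaker')

Even with that, a Lambda layer or container image is the more usual way to ship large dependencies; the answer below sidesteps the problem entirely by not creating the model from Lambda at all.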

Answer found:

  1. The endpoint is deployed through the API from SageMaker Studio (see the deployment sketch after the inference code below).
  2. The inference code in the Lambda looks like this:
import boto3
import json

ENDPOINT_NAME = "huggingface-pytorch-inference-xxxxxxxxxxxxxxxxx"
runtime = boto3.client('runtime.sagemaker')

inputs = {
    'inputs': {
        'question': 'What is used for inference?',
        'context': 'My Name is Philipp and I live in Nuremberg. This model is used with sagemaker for inference.'
    }
}
payload = json.dumps(inputs, indent=2).encode('utf-8')
print(f"payload: {type(payload)}, {payload}")

# the Hugging Face DLC parses the request by content type; the payload is JSON,
# so send application/json (text/csv would not match this nested input)
response = runtime.invoke_endpoint(EndpointName=ENDPOINT_NAME,
                                   ContentType='application/json',
                                   Body=payload)
result = json.loads(response['Body'].read().decode('utf-8'))
print(result)
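
For completeness, here is a minimal sketch of the Studio-side deployment referenced in step 1, reusing the model configuration from the attempt above; the instance_type is an illustrative choice, not prescribed by the original post.

import sagemaker
from sagemaker.huggingface import HuggingFaceModel

# get_execution_role() works inside SageMaker Studio / notebooks
role = sagemaker.get_execution_role()

hub = {
    'HF_MODEL_ID': 'distilbert-base-uncased-distilled-squad',
    'HF_TASK': 'question-answering',
}

huggingface_model = HuggingFaceModel(
    env=hub,
    role=role,
    transformers_version="4.6",
    pytorch_version="1.7",
    py_version="py36",
)

# deploy() creates the endpoint; its name is what the Lambda above invokes
predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type="ml.m5.xlarge",  # illustrative choice, size to your workload
)
print(predictor.endpoint_name)

The printed endpoint name is the value to paste into ENDPOINT_NAME in the Lambda code above.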
