Installing NLTK/WordNet on AWS Lambda via CodeBuild



I'm trying to get NLTK and WordNet working on Lambda via CodeBuild.

It appears to install fine in CloudFormation, but I get the following error in Lambda:

START RequestId: c660c446-e1c4-11e8-8047-15f59f1e002c Version: $LATEST
Unable to import module 'index': No module named 'nltk'
END RequestId: c660c446-e1c4-11e8-8047-15f59f1e002c
REPORT RequestId: c660c446-e1c4-11e8-8047-15f59f1e002c  Duration: 2.10 ms   Billed Duration: 100 ms     Memory Size: 128 MB Max Memory Used: 21 MB  

However, when I check the CodeBuild log, it installs fine:

[Container] 2018/11/06 12:45:06 Running command pip install -U nltk
Collecting nltk
Downloading https://files.pythonhosted.org/packages/50/09/3b1755d528ad9156ee7243d52aa5cd2b809ef053a0f31b53d92853dd653a/nltk-3.3.0.zip (1.4MB)
Requirement already up-to-date: six in /usr/local/lib/python2.7/site-packages (from nltk)
Building wheels for collected packages: nltk
Running setup.py bdist_wheel for nltk: started
Running setup.py bdist_wheel for nltk: finished with status 'done'
Stored in directory: /root/.cache/pip/wheels/d1/ab/40/3bceea46922767e42986aef7606a600538ca80de6062dc266c
Successfully built nltk
Installing collected packages: nltk
Successfully installed nltk-3.3

Here is the actual Python code:

import json
import datetime
import nltk
from nltk.corpus import wordnet as wn

And here is the YML file:

version: 0.2
phases:
  install:
    commands:
      # Upgrade AWS CLI to the latest version
      - pip install --upgrade awscli
      # Install nltk & WordNet
      - pip install -U nltk
      - python -m nltk.downloader wordnet
  pre_build:
    commands:
      # Discover and run unit tests in the 'tests' directory. For more information, see <https://docs.python.org/3/library/unittest.html#test-discovery>
      # - python -m unittest discover tests
  build:
    commands:
      # Use AWS SAM to package the application by using AWS CloudFormation
      - aws cloudformation package --template template.yml --s3-bucket $S3_BUCKET --output-template template-export.yml
artifacts:
  type: zip
  files:
    - template-export.yml

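Any idea why it installs fine in CodeBuild but Lambda can't access the nltk module? For reference, if I remove only NLTK, the code runs fine in Lambda.

I have a feeling this is a YML file issue, but I'm not sure what, given that NLTK installs fine.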

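NLTK is only installed locally, on the machine that runs the CodeBuild job. Nothing from that machine's site-packages ends up in the artifact, so you need to copy NLTK into the CloudFormation deployment package. Your buildspec.yml would then look something like this: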

  install:
    commands:
      # Upgrade AWS CLI to the latest version
      - pip install --upgrade awscli
  pre_build:
    commands:
      - virtualenv /venv
      # Activate the virtualenv so the installs below land in /venv
      - . /venv/bin/activate
      # Install nltk & WordNet
      - pip install -U nltk
      - python -m nltk.downloader wordnet
  build:
    commands:
      # Copy the installed packages next to the Lambda code
      - cp -r /venv/lib/python3.6/site-packages/. ./
      # Use AWS SAM to package the application by using AWS CloudFormation
      - aws cloudformation package --template template.yml --s3-bucket $S3_BUCKET --output-template template-export.yml

Further reading:

  • Creating a deployment package using a Python environment created with Virtualenv
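
One way to catch this kind of problem before deploying is to check, as a build step, that the directory about to be zipped really contains the package. The following is only a minimal sketch: the helper name `missing_packages` is hypothetical and not part of any AWS tooling, and it assumes a dependency shows up either as a directory or as a single `.py` file in the package directory.

```python
import os


def missing_packages(package_dir, names):
    """Return the names with no package directory or module file in package_dir."""
    missing = []
    for name in names:
        # A dependency is bundled either as a package directory (nltk/)
        # or as a single-file module (six.py).
        as_package = os.path.isdir(os.path.join(package_dir, name))
        as_module = os.path.isfile(os.path.join(package_dir, name + ".py"))
        if not (as_package or as_module):
            missing.append(name)
    return missing
```

Calling `missing_packages(".", ["nltk"])` just before the `aws cloudformation package` step and failing the build when the result is non-empty would have flagged the original buildspec, where nltk only existed in the build machine's site-packages.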

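OK, thanks to Laika for pointing me in the right direction.

Here is NLTK & WordNet working on Lambda through CodeStar/CodeBuild. A few things to keep in mind:

1) You can't use source venv/bin/activate, because source is not POSIX-compliant. Use . venv/bin/activate instead, as shown below.

2) You must set the path for the NLTK data, as shown in the "Define Directories" section.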

buildspec.yml

version: 0.2
phases:
  install:
    commands:
      # Upgrade AWS CLI & PIP to the latest version
      - pip install --upgrade awscli
      - pip install --upgrade pip
      # Define Directories
      - export HOME_DIR=`pwd`
      - export NLTK_DATA=$HOME_DIR/nltk_data
  pre_build:
    commands:
      - cd $HOME_DIR
      # Create VirtualEnv to package for lambda
      - virtualenv venv
      - . venv/bin/activate
      # Install Supporting Libraries
      - pip install -U requests
      # Install WordNet
      - pip install -U nltk
      - python -m nltk.downloader -d $NLTK_DATA wordnet
      # Output Requirements
      - pip freeze > requirements.txt
      # Unit Tests
      # - python -m unittest discover tests
  build:
    commands:
      - cd $HOME_DIR
      - mv $VIRTUAL_ENV/lib/python3.6/site-packages/* .
      # Use AWS SAM to package the application by using AWS CloudFormation
      - aws cloudformation package --template template.yml --s3-bucket $S3_BUCKET --output-template template-export.yml
artifacts:
  type: zip
  files:
    - template-export.yml
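
Note that the `NLTK_DATA` variable exported in the buildspec only exists during the build, so at run time the function may still need to be told where the bundled corpus lives. Here is a minimal sketch of one way to do that, assuming the `nltk_data` directory ends up next to `index.py` in the deployment package. The function name `use_bundled_nltk_data` is hypothetical. NLTK reads the `NLTK_DATA` environment variable when it is first imported, so the variable is set before the import.

```python
import os


def use_bundled_nltk_data(handler_file):
    """Point NLTK_DATA at the nltk_data directory next to handler_file."""
    # Assumption: nltk_data sits in the same directory as the handler module.
    base_dir = os.path.dirname(os.path.abspath(handler_file))
    data_dir = os.path.join(base_dir, "nltk_data")
    os.environ["NLTK_DATA"] = data_dir
    return data_dir


# In index.py this would run before importing nltk, for example:
#   use_bundled_nltk_data(__file__)
#   from nltk.corpus import wordnet as wn
```

Setting `NLTK_DATA` as an environment variable on the function in template.yml should work as well; the in-code version just keeps the path tied to wherever the package is unpacked.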

If anyone has any improvements, LMK. It works for me.
