Using moviepy, scipy and numpy in Amazon Lambda
Original question: http://stackoverflow.com/questions/34749806/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on Stack Overflow.
Asked by rouk1
I'd like to generate video using AWS Lambda.
I've followed the instructions found here and here, and I now have the following process to build my Lambda function:
Step 1
Launch an Amazon Linux EC2 instance and run this as root on it:
#! /usr/bin/env bash
# Install the SciPy stack on Amazon Linux and prepare it for AWS Lambda
yum -y update
yum -y groupinstall "Development Tools"
yum -y install blas --enablerepo=epel
yum -y install lapack --enablerepo=epel
yum -y install atlas-sse3-devel --enablerepo=epel
yum -y install Cython --enablerepo=epel
yum -y install python27
yum -y install python27-numpy.x86_64
yum -y install python27-numpy-f2py.x86_64
yum -y install python27-scipy.x86_64
/usr/local/bin/pip install --upgrade pip
mkdir -p /home/ec2-user/stack
/usr/local/bin/pip install moviepy -t /home/ec2-user/stack
cp -R /usr/lib64/python2.7/dist-packages/numpy /home/ec2-user/stack/numpy
cp -R /usr/lib64/python2.7/dist-packages/scipy /home/ec2-user/stack/scipy
tar -czvf stack.tgz /home/ec2-user/stack/*
Step 2
I scp the resulting tarball to my laptop, then run this script to build a zip archive:
#! /usr/bin/env bash
mkdir tmp
rm lambda.zip
tar -xzf stack.tgz -C tmp
zip -9 lambda.zip process_movie.py
zip -r9 lambda.zip *.ttf
cd tmp/home/ec2-user/stack/
zip -r9 ../../../../lambda.zip *
The process_movie.py script is at the moment only a test to see if the stack is OK:
def make_movie(event, context):
    import os
    print(os.listdir('.'))
    print(os.listdir('numpy'))
    try:
        import scipy
    except ImportError:
        print('can not import scipy')
    try:
        import numpy
    except ImportError:
        print('can not import numpy')
    try:
        import moviepy
    except ImportError:
        print('can not import moviepy')
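The test handler only catches ImportError, so any other exception raised while a package initializes aborts the whole invocation before the remaining imports are tried. A minimal sketch of a more tolerant check follows; try_imports is a hypothetical helper name, not part of the original post.

```python
import importlib


def try_imports(names):
    """Try to import each module; return {name: None or an error description}."""
    results = {}
    for name in names:
        try:
            importlib.import_module(name)
            results[name] = None
        except Exception as exc:  # deliberately broad: not only ImportError
            results[name] = "{}: {}".format(type(exc).__name__, exc)
    return results


def make_movie(event, context):
    # Hypothetical handler body: report every failure instead of stopping at the first.
    for name, error in try_imports(["scipy", "numpy", "moviepy"]).items():
        print(name, "OK" if error is None else error)
```

With this variant the log shows one line per package, which makes it easier to see which part of the stack is broken.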
Step 3
Then I upload the resulting archive to S3 to be the source of my Lambda function. When I test the function I get the following call stack:
START RequestId: 36c62b93-b94f-11e5-9da7-83f24fc4b7ca Version: $LATEST
['tqdm', 'imageio-1.4.egg-info', 'decorator.pyc', 'process_movie.py', 'decorator-4.0.6.dist-info', 'imageio', 'moviepy', 'tqdm-3.4.0.dist-info', 'scipy', 'numpy', 'OpenSans-Regular.ttf', 'decorator.py', 'moviepy-0.2.2.11.egg-info']
['add_newdocs.pyo', 'numarray', '__init__.py', '__config__.pyc', '_import_tools.py', 'setup.pyo', '_import_tools.pyc', 'doc', 'setupscons.py', '__init__.pyc', 'setup.py', 'version.py', 'add_newdocs.py', 'random', 'dual.pyo', 'version.pyo', 'ctypeslib.pyc', 'version.pyc', 'testing', 'dual.pyc', 'polynomial', '__config__.pyo', 'f2py', 'core', 'linalg', 'distutils', 'matlib.pyo', 'tests', 'matlib.pyc', 'setupscons.pyc', 'setup.pyc', 'ctypeslib.py', 'numpy', '__config__.py', 'matrixlib', 'dual.py', 'lib', 'ma', '_import_tools.pyo', 'ctypeslib.pyo', 'add_newdocs.pyc', 'fft', 'matlib.py', 'setupscons.pyo', '__init__.pyo', 'oldnumeric', 'compat']
can not import scipy
'module' object has no attribute 'core': AttributeError
Traceback (most recent call last):
File "/var/task/process_movie.py", line 91, in make_movie
import numpy
File "/var/task/numpy/__init__.py", line 122, in <module>
from numpy.__config__ import show as show_config
File "/var/task/numpy/numpy/__init__.py", line 137, in <module>
import add_newdocs
File "/var/task/numpy/numpy/add_newdocs.py", line 9, in <module>
from numpy.lib import add_newdoc
File "/var/task/numpy/lib/__init__.py", line 13, in <module>
from polynomial import *
File "/var/task/numpy/lib/polynomial.py", line 11, in <module>
import numpy.core.numeric as NX
AttributeError: 'module' object has no attribute 'core'
END RequestId: 36c62b93-b94f-11e5-9da7-83f24fc4b7ca
REPORT RequestId: 36c62b93-b94f-11e5-9da7-83f24fc4b7ca Duration: 112.49 ms Billed Duration: 200 ms Memory Size: 1536 MB Max Memory Used: 14 MB
I can't understand why Python does not find the core directory that is present in the folder structure.
EDIT:
Following @jarmod's advice, I've reduced the Lambda function to:
def make_movie(event, context):
    print('running make movie')
    import numpy
I now have the following error:
START RequestId: 6abd7ef6-b9de-11e5-8aee-918ac0a06113 Version: $LATEST
running make movie
Error importing numpy: you should not try to import numpy from
its source directory; please exit the numpy source tree, and relaunch
your python intepreter from there.: ImportError
Traceback (most recent call last):
File "/var/task/process_movie.py", line 3, in make_movie
import numpy
File "/var/task/numpy/__init__.py", line 127, in <module>
raise ImportError(msg)
ImportError: Error importing numpy: you should not try to import numpy from
its source directory; please exit the numpy source tree, and relaunch
your python intepreter from there.
END RequestId: 6abd7ef6-b9de-11e5-8aee-918ac0a06113
REPORT RequestId: 6abd7ef6-b9de-11e5-8aee-918ac0a06113 Duration: 105.95 ms Billed Duration: 200 ms Memory Size: 1536 MB Max Memory Used: 14 MB
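Both the directory listing (the numpy folder contains a numpy entry) and the first traceback (which passes through /var/task/numpy/numpy/__init__.py) suggest that a second copy of the package is nested inside the first. A small sketch for spotting this in the unpacked stack folder before zipping; find_nested_copies is a hypothetical helper, and whether the nesting is the only problem here is an assumption.

```python
import os


def find_nested_copies(root):
    """Return package dirs under root that contain a same-named sub-package."""
    nested = []
    for entry in sorted(os.listdir(root)):
        outer = os.path.join(root, entry)
        inner = os.path.join(outer, entry)
        # e.g. root/numpy/numpy/__init__.py means numpy was copied into itself
        if os.path.isfile(os.path.join(inner, "__init__.py")):
            nested.append(outer)
    return nested
```

Running it on the extracted stack directory and getting a non-empty list would be a hint that a `cp -R` landed inside an already existing package folder.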
Accepted answer by rouk1
With the help of all the posts in this thread, here is a solution for the record.
To get this to work you'll need to:
- Start an EC2 instance with at least 2 GB of RAM (to be able to compile NumPy and SciPy).
- Install the needed dependencies:
    sudo yum -y update
    sudo yum -y upgrade
    sudo yum -y groupinstall "Development Tools"
    sudo yum -y install blas --enablerepo=epel
    sudo yum -y install lapack --enablerepo=epel
    sudo yum -y install Cython --enablerepo=epel
    sudo yum install python27-devel python27-pip gcc
    virtualenv ~/env
    source ~/env/bin/activate
    pip install scipy
    pip install numpy
    pip install moviepy
- Copy to your local machine, into a stack folder, all the content of these directories (except _markerlib, pip*, pkg_resources, setuptools* and easyinstall*):
    /home/ec2-user/env/lib/python2.7/dist-packages
    /home/ec2-user/env/lib64/python2.7/dist-packages
- Get all required shared libraries from your EC2 instance:
    libatlas.so.3
    libf77blas.so.3
    liblapack.so.3
    libptf77blas.so.3
    libcblas.so.3
    libgfortran.so.3
    libptcblas.so.3
    libquadmath.so.0
- Put them in a lib subfolder of the stack folder.
- imageio is a dependency of moviepy, so you'll need to download binary versions of its dependencies, libfreeimage and ffmpeg; they can be found here. Put them at the root of your stack folder and rename libfreeimage-3.16.0-linux64.so to libfreeimage.so.
- You should now have a stack folder containing:
    - all Python dependencies at root
    - all shared libraries in a lib subfolder
    - the ffmpeg binary at root
    - libfreeimage.so at root
- Zip this folder:
    zip -r9 stack.zip . -x ".*" -x "*/.*"
- Use the following lambda_function.py as an entry point for your Lambda:
    from __future__ import print_function

    import os
    import subprocess

    SCRIPT_DIR = os.path.dirname(os.path.abspath(__file__))
    LIB_DIR = os.path.join(SCRIPT_DIR, 'lib')
    FFMPEG_BINARY = os.path.join(SCRIPT_DIR, 'ffmpeg')


    def lambda_handler(event, context):
        command = 'LD_LIBRARY_PATH={} IMAGEIO_FFMPEG_EXE={} python movie_maker.py'.format(
            LIB_DIR,
            FFMPEG_BINARY,
        )
        try:
            output = subprocess.check_output(command, shell=True)
            print(output)
        except subprocess.CalledProcessError as e:
            print(e.output)
- Write a movie_maker.py script that depends on moviepy, numpy, ...
- Add those scripts to your stack.zip file:
    zip -r9 lambda.zip *.py
- Upload the zip to S3 and use it as a source for your Lambda.
You can also download the stack.zip here.
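The entry point in this answer passes LD_LIBRARY_PATH and IMAGEIO_FFMPEG_EXE to a child process through a shell command string. A possible variant, sketched under assumptions, passes them through the env argument of subprocess instead, which avoids quoting problems if the paths contain spaces. run_with_libs is a hypothetical helper name; unlike the original, it prepends to any existing LD_LIBRARY_PATH rather than replacing it, and it uses the current interpreter rather than plain "python".

```python
import os
import subprocess
import sys


def run_with_libs(script_args, lib_dir, ffmpeg_binary):
    """Run a child Python process with the library and ffmpeg paths set."""
    env = dict(os.environ)
    existing = env.get("LD_LIBRARY_PATH")
    # Prepend lib_dir so the bundled shared libraries are found first.
    env["LD_LIBRARY_PATH"] = lib_dir if not existing else lib_dir + os.pathsep + existing
    env["IMAGEIO_FFMPEG_EXE"] = ffmpeg_binary
    # sys.executable is the interpreter running this code.
    return subprocess.check_output([sys.executable] + list(script_args), env=env)
```

In a handler this could be called as `run_with_libs(["movie_maker.py"], LIB_DIR, FFMPEG_BINARY)`, wrapped in the same try/except for subprocess.CalledProcessError as in the original entry point.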
Answer by Attila Tanyi
I was also following your first link and managed to import numpy and pandas in a Lambda function this way (on Windows):
- Started a (free-tier) t2.micro EC2 instance with 64-bit Amazon Linux AMI 2015.09.1 and used PuTTY to SSH in.
- Tried the same commands you used and the ones recommended by the Amazon article:
    sudo yum -y update
    sudo yum -y upgrade
    sudo yum -y groupinstall "Development Tools"
    sudo yum -y install blas --enablerepo=epel
    sudo yum -y install lapack --enablerepo=epel
    sudo yum -y install Cython --enablerepo=epel
    sudo yum install python27-devel python27-pip gcc
- Created the virtual environment:
    virtualenv ~/env
    source ~/env/bin/activate
- Installed the packages:
    sudo ~/env/bin/pip2.7 install numpy
    sudo ~/env/bin/pip2.7 install pandas
- Then, using WinSCP, I logged in and downloaded from the EC2 instance everything (except _markerlib, pip*, pkg_resources, setuptools* and easyinstall*) from /home/ec2-user/env/lib/python2.7/dist-packages, and everything from /home/ec2-user/env/lib64/python2.7/site-packages.
- I put all these folders and files into one zip, along with the .py file containing the Lambda function.
- Because this .zip is larger than 10 MB, I created an S3 bucket to store the file. I copied the link of the file from there and pasted it at "Upload a .ZIP from Amazon S3" at the Lambda function.
- The EC2 instance can be shut down; it's not needed any more.
With this, I could import numpy and pandas. I'm not familiar with moviepy, but scipy might already be tricky, as Lambda has a limit on unzipped deployment package size of 262 144 000 bytes. I'm afraid numpy and scipy together are already over that.
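To check an archive against that limit before uploading, the uncompressed size can be summed from the zip's own metadata. This is a minimal sketch: the function names are hypothetical, and the limit constant is simply the figure quoted in this answer, which may have changed since.

```python
import zipfile

# Limit quoted in the answer above; treat it as an assumption that it still applies.
UNZIPPED_LIMIT_BYTES = 262144000


def unzipped_size(zip_path):
    """Sum the uncompressed sizes of all entries in a zip archive."""
    with zipfile.ZipFile(zip_path) as archive:
        return sum(info.file_size for info in archive.infolist())


def fits_in_lambda(zip_path, limit=UNZIPPED_LIMIT_BYTES):
    """Return True if the archive's unzipped size is within the limit."""
    return unzipped_size(zip_path) <= limit
```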
Answer by Vito Limandibhrata
The posts here helped me find a way to statically compile NumPy with library files that can be included in the AWS Lambda deployment package. This solution does not depend on the LD_LIBRARY_PATH value, as @rouk1's solution does.
The compiled NumPy library can be downloaded from https://github.com/vitolimandibhrata/aws-lambda-numpy
Here are the steps to custom compile NumPy.
Instructions on compiling this package from scratch
Prepare a fresh AWS EC2 instance with Amazon Linux.
Install compiler dependencies
sudo yum -y install python-devel
sudo yum -y install gcc-c++
sudo yum -y install gcc-gfortran
sudo yum -y install libgfortran
Install NumPy dependencies
sudo yum -y install blas
sudo yum -y install lapack
sudo yum -y install atlas-sse3-devel
Create /var/task/lib to contain the runtime libraries
mkdir -p /var/task/lib
/var/task is the root directory where your code will reside in AWS Lambda, so we need to statically link the required library files in a well-known folder, which in this case is /var/task/lib.
Copy the following library files to /var/task/lib:
cp /usr/lib64/atlas-sse3/liblapack.so.3 /var/task/lib/.
cp /usr/lib64/atlas-sse3/libptf77blas.so.3 /var/task/lib/.
cp /usr/lib64/atlas-sse3/libf77blas.so.3 /var/task/lib/.
cp /usr/lib64/atlas-sse3/libptcblas.so.3 /var/task/lib/.
cp /usr/lib64/atlas-sse3/libcblas.so.3 /var/task/lib/.
cp /usr/lib64/atlas-sse3/libatlas.so.3 /var/task/lib/.
cp /usr/lib64/atlas-sse3/libptf77blas.so.3 /var/task/lib/.
cp /usr/lib64/libgfortran.so.3 /var/task/lib/.
cp /usr/lib64/libquadmath.so.0 /var/task/lib/.
Get the latest NumPy source code from http://sourceforge.net/projects/numpy/files/NumPy/
Go to the NumPy source code folder (e.g. numpy-1.10.4) and create a site.cfg file with the following entries:
[atlas]
libraries=lapack,f77blas,cblas,atlas
search_static_first=true
runtime_library_dirs = /var/task/lib
extra_link_args = -lgfortran -lquadmath
The -lgfortran -lquadmath flags are required to statically link the gfortran and quadmath libraries with the files defined in runtime_library_dirs.
Build NumPy
python setup.py build
Install NumPy
python setup.py install
Check whether the libraries are linked to the files in /var/task/lib
ldd $PYTHON_HOME/lib64/python2.7/site-packages/numpy/linalg/lapack_lite.so
You should see
linux-vdso.so.1 => (0x00007ffe0dd2d000)
liblapack.so.3 => /var/task/lib/liblapack.so.3 (0x00007ffad6be5000)
libptf77blas.so.3 => /var/task/lib/libptf77blas.so.3 (0x00007ffad69c7000)
libptcblas.so.3 => /var/task/lib/libptcblas.so.3 (0x00007ffad67a7000)
libatlas.so.3 => /var/task/lib/libatlas.so.3 (0x00007ffad6174000)
libf77blas.so.3 => /var/task/lib/libf77blas.so.3 (0x00007ffad5f56000)
libcblas.so.3 => /var/task/lib/libcblas.so.3 (0x00007ffad5d36000)
libpython2.7.so.1.0 => /usr/lib64/libpython2.7.so.1.0 (0x00007ffad596d000)
libgfortran.so.3 => /var/task/lib/libgfortran.so.3 (0x00007ffad5654000)
libm.so.6 => /lib64/libm.so.6 (0x00007ffad5352000)
libquadmath.so.0 => /var/task/lib/libquadmath.so.0 (0x00007ffad5117000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007ffad4f00000)
libc.so.6 => /lib64/libc.so.6 (0x00007ffad4b3e000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00007ffad4922000)
libdl.so.2 => /lib64/libdl.so.2 (0x00007ffad471d000)
libutil.so.1 => /lib64/libutil.so.1 (0x00007ffad451a000)
/lib64/ld-linux-x86-64.so.2 (0x000055cfc3ab8000)
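The ldd check above can be automated by parsing its output and listing any library that did not resolve under /var/task/lib. This is only a sketch: unresolved_under is a hypothetical helper, and it assumes the "name => path (address)" line format shown in the output above.

```python
def unresolved_under(ldd_output, names, prefix="/var/task/lib/"):
    """Return the libraries in names that ldd did not resolve under prefix."""
    resolved = {}
    for line in ldd_output.splitlines():
        if "=>" not in line:
            continue
        name, _, rest = line.partition("=>")
        parts = rest.split()
        # Lines such as "linux-vdso.so.1 => (0x...)" have no path component.
        path = parts[0] if parts and not parts[0].startswith("(") else ""
        resolved[name.strip()] = path
    return [n for n in names if not resolved.get(n, "").startswith(prefix)]
```

The ldd text could be captured with subprocess.check_output(["ldd", path_to_so]) and passed in together with the list of library names copied into /var/task/lib; an empty result means all of them resolved where expected.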
Answer by sangheestyle
I like @Vito Limandibhrata's answer, but I think setting runtime_library_dirs is not enough to build NumPy as of numpy==1.11.1. If you think site.cfg is being ignored, do the following:
cp /usr/lib64/atlas-sse3/*.a /var/task/lib/
The *.a files under atlas-sse3 are needed to build NumPy. Also, you might need to run the following:
python setup.py config
to check the NumPy configuration. If something more is required, you will see a message like the following:
atlas_threads_info:
Setting PTATLAS=ATLAS libraries ptf77blas,ptcblas,atlas not found in /root/Envs/skl/lib
libraries lapack_atlas not found in /root/Envs/skl/lib
libraries ptf77blas,ptcblas,atlas not found in /usr/local/lib64
libraries lapack_atlas not found in /usr/local/lib64
libraries ptf77blas,ptcblas,atlas not found in /usr/local/lib
libraries lapack_atlas not found in /usr/local/lib
libraries lapack_atlas not found in /usr/lib64/atlas-sse3
<class 'numpy.distutils.system_info.atlas_threads_info'>
Setting PTATLAS=ATLAS
Setting PTATLAS=ATLAS
Setting PTATLAS=ATLAS
Setting PTATLAS=ATLAS
libraries lapack not found in ['/var/task/lib']
Runtime library lapack was not found. Ignoring
libraries f77blas not found in ['/var/task/lib']
Runtime library f77blas was not found. Ignoring
libraries cblas not found in ['/var/task/lib']
Runtime library cblas was not found. Ignoring
libraries atlas not found in ['/var/task/lib']
Runtime library atlas was not found. Ignoring
FOUND:
extra_link_args = ['-lgfortran -lquadmath']
define_macros = [('NO_ATLAS_INFO', -1)]
language = f77
libraries = ['lapack', 'ptf77blas', 'ptcblas', 'atlas', 'lapack', 'f77blas', 'cblas', 'atlas']
library_dirs = ['/usr/lib64/atlas-sse3']
include_dirs = ['/usr/include']
In that case, site.cfg is going to be ignored.
Tip: if pip is used to build NumPy with runtime_library_dirs, it is better to create ~/.numpy-site.cfg and add the following:
[atlas]
libraries = lapack,f77blas,cblas,atlas
search_static_first = true
runtime_library_dirs = /var/task/lib
extra_link_args = -lgfortran -lquadmath
NumPy then recognizes the .numpy-site.cfg file. It's a quite simple and easy way.
Answer by johncip
Another very simple method that's possible these days is to build using the awesome Docker containers that LambCI made to mimic Lambda: https://github.com/lambci/docker-lambda
The lambci/lambda:build container resembles AWS Lambda with the addition of a mostly-complete build environment. To start a shell session in it:
docker run -v "$PWD":/var/task -it lambci/lambda:build bash
Inside the session:
export share=/var/task
easy_install pip
pip install -t $share numpy
Or, with virtualenv:
export share=/var/task
export PS1="[\u@\h:\w]$ " # required by virtualenv
easy_install pip
pip install virtualenv
# ... make the venv, install numpy, and copy it to $share
Later on you can use the main lambci/lambda container to test your build.
Answer by wrwrwr
As of 2017, NumPy and SciPy have wheels that work on Lambda (the packages include precompiled libgfortran and libopenblas). As far as I know, MoviePy is a pure Python module, so basically you could do:
pip2 install -t lambda moviepy scipy
Then copy your handler into the lambda directory and zip it. Except that you'll most likely exceed the 50/250 MB size limits. There are a couple of things that can help:
- remove .pycs, docs, tests and other unnecessary parts;
- leave a single copy of common libraries of NumPy and SciPy;
- strip libraries of inessential pieces, such as debugging symbols;
- compress the archive using higher settings.
Here's an example script that automates the above points.
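As a rough illustration of the first point only (removing .pyc files and tests), the sketch below walks a package directory and deletes byte-code files and test folders. It is not the linked script; prune_package_tree is a hypothetical helper, and it assumes no package needs its tests directory at import time, which should be verified for the packages you bundle.

```python
import os
import shutil


def _tree_size(path):
    """Total size in bytes of all files below path."""
    total = 0
    for dirpath, _, filenames in os.walk(path):
        for name in filenames:
            total += os.path.getsize(os.path.join(dirpath, name))
    return total


def prune_package_tree(root, drop_dirs=("tests", "__pycache__"),
                       drop_suffixes=(".pyc", ".pyo")):
    """Delete byte-code files and test directories below root; return bytes freed."""
    freed = 0
    for dirpath, dirnames, filenames in os.walk(root, topdown=True):
        for name in list(dirnames):
            if name in drop_dirs:
                target = os.path.join(dirpath, name)
                freed += _tree_size(target)
                shutil.rmtree(target)
                dirnames.remove(name)  # do not descend into the removed directory
        for name in filenames:
            if name.endswith(drop_suffixes):
                path = os.path.join(dirpath, name)
                freed += os.path.getsize(path)
                os.remove(path)
    return freed
```

Calling `prune_package_tree("lambda")` before zipping would report how many bytes were removed; stripping debug symbols and deduplicating shared libraries are separate steps not covered here.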
Answer by Jay Carroll
I can confirm that the steps posted by @attila-tanyi work correctly under Amazon Linux. I would only add that there is no need to use an EC2 instance, as there is an Amazon Linux Docker container available from the default repository.
docker pull amazonlinux && docker run -it amazonlinux
# Follow @attila-tanyi steps
# Note - sudo is not necessary here
I use the Dockerfile embedded in my application to build and deploy to Lambda.
Answer by Abhishek Gaur
As of 2018, these are the steps to install external modules in Python 3 on AWS EC2:
- Launch EC2 on Amazon Linux AMI 201709.
- SSH in with PuTTY using your private and public keys, and become superuser.
- Install Python 3 and create a virtual env, then make it the default:
    yum install python36 python36-virtualenv python36-pip
    virtualenv -p python3.6 /tmp/my_python_lib
    source /tmp/my_python_lib/bin/activate
    which python   # to check which version is installed
    pip3 install numpy
- Copy the files under site-packages and dist-packages to your local machine using WinSCP.
- To find the actual location, use grep:
    grep -r dist-packages *
These packages could be inside both lib and lib64.
The site-packages and dist-packages directories will be under:
/tmp/my_python_lib/lib64/python3.6, /tmp/my_python_lib/lib/python3.6
Zip these packages along with your script file and upload the archive to S3, where Lambda can access it. Instead of zipping the root folder, select all the files inside it and zip them (or use "Send to compressed folder").
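The point about zipping the files rather than the root folder means that the handler and packages must sit at the top level of the archive. A small sketch of doing that programmatically; zip_folder_contents is a hypothetical helper, and it assumes the output zip is written outside the source folder so the archive does not include itself.

```python
import os
import zipfile


def zip_folder_contents(src_dir, zip_path):
    """Zip everything inside src_dir so entries sit at the archive root."""
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for dirpath, _, filenames in os.walk(src_dir):
            for name in filenames:
                full = os.path.join(dirpath, name)
                # arcname is relative to src_dir, so src_dir itself is not a prefix.
                archive.write(full, os.path.relpath(full, src_dir))
```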
Additional tips:
If you want to install all packages under one directory, you can use command:
pip install --upgrade --target=/tmp/my_python_lib/lib/python3.6/dist-packages pandas
Answer by karoli
As of August 2018, probably the easiest way is to start a new AWS Cloud9 environment, then create a Lambda function inside the environment. Next, run this in the Cloud9 command line:
cd YourApplicationName
/venv/bin/pip install scipy -t .
/venv/bin/pip install numpy -t .
/venv/bin/pip install moviepy -t .
Now I am able to import the modules in the lambda_handler function.
Answer by Steinway Wu
Nov 2018. Hi friends, this post has been extremely helpful to me. However, the answers so far are not very automated. I wrote a Python script and tutorial here, https://gist.github.com/steinwaywhw/6a6a25d594cc07146c60af943f74c16f, to automate the creation of compiled Python packages using pip and virtualenv on EC2. Everything is Python (Boto3): no bash script, no web console, no awscli.
There is one other change besides automation, which I think is an improvement. I downloaded the whole Python virtual environment from EC2, preserving its folder structure, instead of merging the lib and lib64 packages together. I never understood the point of merging those two folders. What if some packages override other packages? Plus, faking an official virtual environment is nonetheless a safer way to go than rolling your own.
For the downloaded virtual environment to work, the source code of the Lambda function adds some boilerplate code to update the Python search path using sys.path. The intended sys.path of a Python virtual environment can be found as follows:
- On your own machine, create a virtual environment and activate it.
- Run a Python script in this virtual environment and do print(sys.path) after import sys. You can start from there and modify as you see fit.
A snippet of the boilerplate code to add to a Lambda function in order to load numpy and other packages from my packaged virtual environment is pasted below. In my case, I loaded pandas_datareader, which relies on numpy.
import os
import sys
# https://docs.aws.amazon.com/lambda/latest/dg/current-supported-versions.html
workdir = os.getenv('LAMBDA_TASK_ROOT')
version = f'{sys.version_info[0]}.{sys.version_info[1]}'
additionals = [f'{workdir}/venv/lib64/python{version}/site-packages',
               f'{workdir}/venv/lib64/python{version}/lib-dynload',
               f'{workdir}/venv/lib64/python{version}/dist-packages',
               f'{workdir}/venv/lib/python{version}/dist-packages',
               f'{workdir}/venv/lib/python{version}/site-packages']
sys.path = additionals + sys.path
import pandas_datareader as pdr
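A possible refinement of the snippet above, sketched under assumptions: build the candidate directories in a helper and keep only those that actually exist, so sys.path is not padded with missing folders. venv_search_paths and the venv_name parameter are hypothetical, and the candidate list is slightly broader than the original (it also probes lib/.../lib-dynload).

```python
import os
import sys


def venv_search_paths(workdir, version=None, venv_name="venv"):
    """Return the candidate package directories that actually exist."""
    if version is None:
        version = "{}.{}".format(sys.version_info[0], sys.version_info[1])
    candidates = [
        os.path.join(workdir, venv_name, lib, "python" + version, leaf)
        for lib in ("lib64", "lib")
        for leaf in ("site-packages", "lib-dynload", "dist-packages")
    ]
    return [path for path in candidates if os.path.isdir(path)]
```

In the Lambda function this would be used as `sys.path = venv_search_paths(os.getenv('LAMBDA_TASK_ROOT')) + sys.path` before importing the packaged modules.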