Using moviepy, scipy and numpy in Amazon Lambda with Python

Disclaimer: this page is a translated copy of a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. If you reuse or share it, you must do so under the same license and attribute it to the original authors (not me). Original question: http://stackoverflow.com/questions/34749806/



Tags: python, amazon-web-services, numpy, aws-lambda

Asked by rouk1

I'd like to generate video using the AWS Lambda feature.


I've followed the instructions found here and here.


And I now have the following process to build my Lambda function:


Step 1


Fire up an Amazon Linux EC2 instance and run this as root on it:


#! /usr/bin/env bash

# Install the SciPy stack on Amazon Linux and prepare it for AWS Lambda

yum -y update
yum -y groupinstall "Development Tools"
yum -y install blas --enablerepo=epel
yum -y install lapack --enablerepo=epel
yum -y install atlas-sse3-devel --enablerepo=epel
yum -y install Cython --enablerepo=epel
yum -y install python27
yum -y install python27-numpy.x86_64
yum -y install python27-numpy-f2py.x86_64
yum -y install python27-scipy.x86_64

/usr/local/bin/pip install --upgrade pip
mkdir -p /home/ec2-user/stack
/usr/local/bin/pip install moviepy -t /home/ec2-user/stack

cp -R /usr/lib64/python2.7/dist-packages/numpy /home/ec2-user/stack/numpy
cp -R /usr/lib64/python2.7/dist-packages/scipy /home/ec2-user/stack/scipy

tar -czvf stack.tgz /home/ec2-user/stack/*

Step 2


I scp the resulting tarball to my laptop and then run this script to build a zip archive.


#! /usr/bin/env bash

mkdir tmp
rm lambda.zip
tar -xzf stack.tgz -C tmp

zip -9 lambda.zip process_movie.py
zip -r9 lambda.zip *.ttf
cd tmp/home/ec2-user/stack/
zip -r9 ../../../../lambda.zip *

The process_movie.py script is at the moment only a test to see whether the stack is OK:


def make_movie(event, context):
    import os
    print(os.listdir('.'))
    print(os.listdir('numpy'))
    try:
        import scipy
    except ImportError:
        print('can not import scipy')

    try:
        import numpy
    except ImportError:
        print('can not import numpy')

    try:
        import moviepy
    except ImportError:
        print('can not import moviepy')

Step 3


Then I upload the resulting archive to S3 to be the source of my lambda function. When I test the function I get the following call stack:


START RequestId: 36c62b93-b94f-11e5-9da7-83f24fc4b7ca Version: $LATEST
['tqdm', 'imageio-1.4.egg-info', 'decorator.pyc', 'process_movie.py', 'decorator-4.0.6.dist-info', 'imageio', 'moviepy', 'tqdm-3.4.0.dist-info', 'scipy', 'numpy', 'OpenSans-Regular.ttf', 'decorator.py', 'moviepy-0.2.2.11.egg-info']
['add_newdocs.pyo', 'numarray', '__init__.py', '__config__.pyc', '_import_tools.py', 'setup.pyo', '_import_tools.pyc', 'doc', 'setupscons.py', '__init__.pyc', 'setup.py', 'version.py', 'add_newdocs.py', 'random', 'dual.pyo', 'version.pyo', 'ctypeslib.pyc', 'version.pyc', 'testing', 'dual.pyc', 'polynomial', '__config__.pyo', 'f2py', 'core', 'linalg', 'distutils', 'matlib.pyo', 'tests', 'matlib.pyc', 'setupscons.pyc', 'setup.pyc', 'ctypeslib.py', 'numpy', '__config__.py', 'matrixlib', 'dual.py', 'lib', 'ma', '_import_tools.pyo', 'ctypeslib.pyo', 'add_newdocs.pyc', 'fft', 'matlib.py', 'setupscons.pyo', '__init__.pyo', 'oldnumeric', 'compat']
can not import scipy
'module' object has no attribute 'core': AttributeError
Traceback (most recent call last):
  File "/var/task/process_movie.py", line 91, in make_movie
    import numpy
  File "/var/task/numpy/__init__.py", line 122, in <module>
    from numpy.__config__ import show as show_config
  File "/var/task/numpy/numpy/__init__.py", line 137, in <module>
    import add_newdocs
  File "/var/task/numpy/numpy/add_newdocs.py", line 9, in <module>
    from numpy.lib import add_newdoc
  File "/var/task/numpy/lib/__init__.py", line 13, in <module>
    from polynomial import *
  File "/var/task/numpy/lib/polynomial.py", line 11, in <module>
    import numpy.core.numeric as NX
AttributeError: 'module' object has no attribute 'core'

END RequestId: 36c62b93-b94f-11e5-9da7-83f24fc4b7ca
REPORT RequestId: 36c62b93-b94f-11e5-9da7-83f24fc4b7ca  Duration: 112.49 ms Billed Duration: 200 ms     Memory Size: 1536 MB    Max Memory Used: 14 MB

I can't understand why Python does not find the core directory that is present in the folder structure.


EDIT:


Following @jarmod's advice I've reduced the lambda function to:


def make_movie(event, context):
    print('running make movie')
    import numpy

I now have the following error:


START RequestId: 6abd7ef6-b9de-11e5-8aee-918ac0a06113 Version: $LATEST
running make movie
Error importing numpy: you should not try to import numpy from
        its source directory; please exit the numpy source tree, and relaunch
        your python intepreter from there.: ImportError
Traceback (most recent call last):
  File "/var/task/process_movie.py", line 3, in make_movie
    import numpy
  File "/var/task/numpy/__init__.py", line 127, in <module>
    raise ImportError(msg)
ImportError: Error importing numpy: you should not try to import numpy from
        its source directory; please exit the numpy source tree, and relaunch
        your python intepreter from there.

END RequestId: 6abd7ef6-b9de-11e5-8aee-918ac0a06113
REPORT RequestId: 6abd7ef6-b9de-11e5-8aee-918ac0a06113  Duration: 105.95 ms Billed Duration: 200 ms     Memory Size: 1536 MB    Max Memory Used: 14 MB

Accepted answer by rouk1

With the help of all the posts in this thread, here is a solution for the record:


To get this to work you'll need to:


  1. start an EC2 instance with at least 2 GB of RAM (to be able to compile NumPy & SciPy)

  2. Install the needed dependencies

    sudo yum -y update
    sudo yum -y upgrade
    sudo yum -y groupinstall "Development Tools"
    sudo yum -y install blas --enablerepo=epel
    sudo yum -y install lapack --enablerepo=epel
    sudo yum -y install Cython --enablerepo=epel
    sudo yum install python27-devel python27-pip gcc
    virtualenv ~/env
    source ~/env/bin/activate
    pip install scipy
    pip install numpy
    pip install moviepy
    
  3. Copy all the contents of the following directories (except _markerlib, pip*, pkg_resources, setuptools* and easyinstall*) to a stack folder on your local machine:

    • home/ec2-user/env/lib/python2.7/dist-packages
    • home/ec2-user/env/lib64/python2.7/dist-packages
  4. get all the required shared libraries from your EC2 instance:

    • libatlas.so.3
    • libf77blas.so.3
    • liblapack.so.3
    • libptf77blas.so.3
    • libcblas.so.3
    • libgfortran.so.3
    • libptcblas.so.3
    • libquadmath.so.0
  5. Put them in a lib subfolder of the stack folder

  6. imageio is a dependency of moviepy; you'll need to download binary versions of its own dependencies, libfreeimage and ffmpeg; they can be found here. Put them at the root of your stack folder and rename libfreeimage-3.16.0-linux64.so to libfreeimage.so

  7. You should now have a stack folder containing:

    • all Python dependencies at the root
    • all shared libraries in a lib subfolder
    • the ffmpeg binary at the root
    • libfreeimage.so at the root
  8. Zip this folder: zip -r9 stack.zip . -x ".*" -x "*/.*"

  9. Use the following lambda_function.py as an entry point for your lambda

    from __future__ import print_function
    
    import os
    import subprocess
    
    SCRIPT_DIR = os.path.dirname(os.path.abspath(__file__))
    LIB_DIR = os.path.join(SCRIPT_DIR, 'lib')
    FFMPEG_BINARY = os.path.join(SCRIPT_DIR, 'ffmpeg')
    
    
    def lambda_handler(event, context):
        command = 'LD_LIBRARY_PATH={} IMAGEIO_FFMPEG_EXE={} python movie_maker.py'.format(
            LIB_DIR,
            FFMPEG_BINARY,
        )
        try:
            output = subprocess.check_output(command, shell=True)
            print(output)
        except subprocess.CalledProcessError as e:
            print(e.output)
    
  10. write a movie_maker.py script that depends on moviepy, numpy, ... (a minimal sketch is shown after this list)

  11. add those two scripts to your stack.zip file: zip -r9 stack.zip *.py

  12. upload the zip to S3 and use it as the source for your lambda
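
A minimal movie_maker.py sketch of the kind step 10 describes (this code is not part of the original answer): it uses numpy to generate frames and moviepy to encode them, writing the result to /tmp, the only writable path on Lambda. The clip size, duration and output path are placeholders.

# movie_maker.py -- minimal sketch; assumes the stack built above is importable
# and that IMAGEIO_FFMPEG_EXE points at the bundled ffmpeg (set by lambda_function.py).
import numpy as np
from moviepy.editor import VideoClip

WIDTH, HEIGHT, DURATION = 320, 240, 2  # placeholders: keep the clip small


def make_frame(t):
    """Return an RGB frame (HEIGHT x WIDTH x 3, uint8) for time t."""
    gradient = np.tile(np.linspace(0, 255, WIDTH, dtype=np.uint8), (HEIGHT, 1))
    shifted = np.roll(gradient, int(t * 50), axis=1)  # scroll the gradient over time
    return np.dstack([shifted] * 3)                   # grayscale -> RGB


if __name__ == '__main__':
    clip = VideoClip(make_frame, duration=DURATION)
    clip.write_videofile('/tmp/out.mp4', fps=24)      # /tmp is writable on Lambda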


You can also download the stack.zip here.


Answered by Attila Tanyi

I was also following your first link and managed to import numpy and pandas in a Lambda function this way (on Windows):


  1. Started a (free-tier) t2.micro EC2 instance with 64-bit Amazon Linux AMI 2015.09.1 and used PuTTY to SSH in.
  2. Tried the same commands you used and the one recommended by the Amazon article:

    sudo yum -y update
    sudo yum -y upgrade
    sudo yum -y groupinstall "Development Tools"
    sudo yum -y install blas --enablerepo=epel
    sudo yum -y install lapack --enablerepo=epel
    sudo yum -y install Cython --enablerepo=epel
    sudo yum install python27-devel python27-pip gcc
    
  3. Created the virtual environment:

    virtualenv ~/env
    source ~/env/bin/activate
    
  4. Installed the packages:

    sudo ~/env/bin/pip2.7 install numpy
    sudo ~/env/bin/pip2.7 install pandas
    
  5. Then, using WinSCP, I logged in and downloaded everything (except _markerlib, pip*, pkg_resources, setuptools* and easyinstall*) from /home/ec2-user/env/lib/python2.7/dist-packages, and everything from /home/ec2-user/env/lib64/python2.7/site-packages on the EC2 instance.

  6. I put all these folders and files into one zip, along with the .py file containing the Lambda function. (The original post includes an illustration of all the files copied.)

  7. Because this .zip is larger than 10 MB, I created an S3 bucket to store the file. I copied the link to the file from there and pasted it at "Upload a .ZIP from Amazon S3" in the Lambda function (this step can also be scripted; see the sketch after this list).

  8. The EC2 instance can be shut down; it's not needed any more.
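
Step 7 can also be scripted instead of going through the console. Below is a hedged boto3 sketch (not part of the original answer); the bucket name, key, function name, role ARN and handler are placeholders to replace with your own values.

# Sketch only: upload the deployment zip to S3 and create the Lambda function from it.
import boto3

BUCKET = 'my-deployment-bucket'            # placeholder bucket name
KEY = 'lambda/pandas-numpy-function.zip'   # placeholder object key

s3 = boto3.client('s3')
s3.upload_file('lambda.zip', BUCKET, KEY)

lambda_client = boto3.client('lambda')
lambda_client.create_function(
    FunctionName='pandas-numpy-test',                    # placeholder
    Runtime='python2.7',                                 # matches the Python 2.7 build above
    Role='arn:aws:iam::123456789012:role/lambda-basic',  # placeholder execution role ARN
    Handler='lambda_function.lambda_handler',            # placeholder module.function
    Code={'S3Bucket': BUCKET, 'S3Key': KEY},
    MemorySize=512,
    Timeout=60,
)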


With this, I could import numpy and pandas. I'm not familiar with moviepy, but scipy might already be tricky, as Lambda has a limit for the unzipped deployment package size of 262,144,000 bytes. I'm afraid numpy and scipy together are already over that.

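Since that limit applies to the unzipped package, it can be worth measuring the folder before zipping it. A small sketch (not from the answer; the directory name is a placeholder):

# Sketch: compare the on-disk size of the package folder against Lambda's limit.
import os

LIMIT_BYTES = 262144000  # Lambda's unzipped deployment package limit


def tree_size(path):
    """Total size in bytes of all files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        total += sum(os.path.getsize(os.path.join(root, f)) for f in files)
    return total


size = tree_size('lambda')  # placeholder: the folder you are about to zip
print('{} bytes used, {} bytes to spare'.format(size, LIMIT_BYTES - size))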

Answered by Vito Limandibhrata

The posts here helped me find a way to statically compile NumPy with library files that can be included in the AWS Lambda deployment package. This solution does not depend on the LD_LIBRARY_PATH value as in @rouk1's solution.


The compiled NumPy library can be downloaded from https://github.com/vitolimandibhrata/aws-lambda-numpy


Here are the steps to custom compile NumPy


Instructions on compiling this package from scratch


Prepare a fresh AWS EC2 instance with Amazon Linux.


Install compiler dependencies


sudo yum -y install python-devel
sudo yum -y install gcc-c++
sudo yum -y install gcc-gfortran
sudo yum -y install libgfortran

Install NumPy dependencies


sudo yum -y install blas
sudo yum -y install lapack
sudo yum -y install atlas-sse3-devel

Create /var/task/lib to contain the runtime libraries


mkdir -p /var/task/lib

/var/task is the root directory where your code will reside in AWS Lambda, so we need to statically link the required library files against a well-known folder, which in this case is /var/task/lib.


Copy the following library files to /var/task/lib:


cp /usr/lib64/atlas-sse3/liblapack.so.3 /var/task/lib/.
cp /usr/lib64/atlas-sse3/libptf77blas.so.3 /var/task/lib/.
cp /usr/lib64/atlas-sse3/libf77blas.so.3 /var/task/lib/.
cp /usr/lib64/atlas-sse3/libptcblas.so.3 /var/task/lib/.
cp /usr/lib64/atlas-sse3/libcblas.so.3 /var/task/lib/.
cp /usr/lib64/atlas-sse3/libatlas.so.3 /var/task/lib/.
cp /usr/lib64/atlas-sse3/libptf77blas.so.3 /var/task/lib/.
cp /usr/lib64/libgfortran.so.3 /var/task/lib/.
cp /usr/lib64/libquadmath.so.0 /var/task/lib/.

Get the latest numpy source code from http://sourceforge.net/projects/numpy/files/NumPy/

http://sourceforge.net/projects/numpy/files/NumPy/获取最新的 numpy 源代码

Go to the numpy source code folder, e.g. numpy-1.10.4, and create a site.cfg file with the following entries:


[atlas]
libraries=lapack,f77blas,cblas,atlas
search_static_first=true
runtime_library_dirs = /var/task/lib
extra_link_args = -lgfortran -lquadmath

The -lgfortran and -lquadmath flags are required to statically link the gfortran and quadmath libraries against the files defined in runtime_library_dirs.


Build NumPy


python setup.py build

Install NumPy


python setup.py install

Check whether the libraries are linked to the files in /var/task/lib


ldd $PYTHON_HOME/lib64/python2.7/site-packages/numpy/linalg/lapack_lite.so

You should see


linux-vdso.so.1 =>  (0x00007ffe0dd2d000)
liblapack.so.3 => /var/task/lib/liblapack.so.3 (0x00007ffad6be5000)
libptf77blas.so.3 => /var/task/lib/libptf77blas.so.3 (0x00007ffad69c7000)
libptcblas.so.3 => /var/task/lib/libptcblas.so.3 (0x00007ffad67a7000)
libatlas.so.3 => /var/task/lib/libatlas.so.3 (0x00007ffad6174000)
libf77blas.so.3 => /var/task/lib/libf77blas.so.3 (0x00007ffad5f56000)
libcblas.so.3 => /var/task/lib/libcblas.so.3 (0x00007ffad5d36000)
libpython2.7.so.1.0 => /usr/lib64/libpython2.7.so.1.0 (0x00007ffad596d000)
libgfortran.so.3 => /var/task/lib/libgfortran.so.3 (0x00007ffad5654000)
libm.so.6 => /lib64/libm.so.6 (0x00007ffad5352000)
libquadmath.so.0 => /var/task/lib/libquadmath.so.0 (0x00007ffad5117000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007ffad4f00000)
libc.so.6 => /lib64/libc.so.6 (0x00007ffad4b3e000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00007ffad4922000)
libdl.so.2 => /lib64/libdl.so.2 (0x00007ffad471d000)
libutil.so.1 => /lib64/libutil.so.1 (0x00007ffad451a000)
/lib64/ld-linux-x86-64.so.2 (0x000055cfc3ab8000)

Answered by sangheestyle

I like @Vito Limandibhrata's answer, but I think it's not enough to build numpy with runtime_library_dirs in numpy==1.11.1. If anybody thinks site.cfg is being ignored, do the following:


cp /usr/lib64/atlas-sse3/*.a /var/task/lib/

The *.a files under atlas-sse3 are needed to build numpy. Also, you might need to run the following:


python setup.py config

to check the numpy configuration. If it requires something more, you will see the following message:


atlas_threads_info:
Setting PTATLAS=ATLAS   libraries ptf77blas,ptcblas,atlas not found in /root/Envs/skl/lib
    libraries lapack_atlas not found in /root/Envs/skl/lib
    libraries ptf77blas,ptcblas,atlas not found in /usr/local/lib64   
    libraries lapack_atlas not found in /usr/local/lib64
    libraries ptf77blas,ptcblas,atlas not found in /usr/local/lib         
    libraries lapack_atlas not found in /usr/local/lib
    libraries lapack_atlas not found in /usr/lib64/atlas-sse3
<class 'numpy.distutils.system_info.atlas_threads_info'>
Setting PTATLAS=ATLAS
Setting PTATLAS=ATLAS
Setting PTATLAS=ATLAS
Setting PTATLAS=ATLAS
    libraries lapack not found in ['/var/task/lib']
Runtime library lapack was not found. Ignoring
    libraries f77blas not found in ['/var/task/lib']
Runtime library f77blas was not found. Ignoring
    libraries cblas not found in ['/var/task/lib']
Runtime library cblas was not found. Ignoring
    libraries atlas not found in ['/var/task/lib']
Runtime library atlas was not found. Ignoring
    FOUND:
        extra_link_args = ['-lgfortran -lquadmath']
        define_macros = [('NO_ATLAS_INFO', -1)]
        language = f77
        libraries = ['lapack', 'ptf77blas', 'ptcblas', 'atlas', 'lapack', 'f77blas', 'cblas', 'atlas']
        library_dirs = ['/usr/lib64/atlas-sse3']
        include_dirs = ['/usr/include']

then site.cfg is going to be ignored.


Tip: If pip is used to build numpy with runtime_library_dirs, you had better create ~/.numpy-site.cfg and add the following:


[atlas]
libraries = lapack,f77blas,cblas,atlas
search_static_first = true
runtime_library_dirs = /var/task/lib
extra_link_args = -lgfortran -lquadmath

then numpy recognizes the .numpy-site.cfg file. It's quite a simple and easy way.


Answered by johncip

Another very simple method that's possible these days is to build using the awesome Docker containers that LambCI made to mimic Lambda: https://github.com/lambci/docker-lambda


The lambci/lambda:build container resembles AWS Lambda with the addition of a mostly complete build environment. To start a shell session in it:

lambci/lambda:build容器类似于 AWS Lambda,但增加了一个基本完整的构建环境。要在其中启动 shell 会话:

docker run -v "$PWD":/var/task -it lambci/lambda:build bash

Inside the session:


export share=/var/task
easy_install pip
pip install -t $share numpy

Or, with virtualenv:


export share=/var/task
export PS1="[\u@\h:\w]$ " # required by virtualenv
easy_install pip
pip install virtualenv
# ... make the venv, install numpy, and copy it to $share

Later on you can use the main lambci/lambda container to test your build.


Answered by wrwrwr

As of 2017, NumPy and SciPy have wheels that work on Lambda (the packages include precompiled libgfortran and libopenblas). As far as I know, MoviePy is a pure Python module, so basically you could do:


pip2 install -t lambda moviepy scipy

Then copy your handler into the lambda directory and zip it. Except that you'll most likely exceed the 50/250 MB size limits. There are a couple of things that can help:


  • remove .pyc files, docs, tests and other unnecessary parts (see the sketch after this list);
  • leave a single copy of common libraries of NumPy and SciPy;
  • strip libraries of inessential pieces, such as debugging symbols;
  • compress the archive using higher settings.
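
For the first bullet, here is a minimal sketch of the idea (the linked script below is more thorough; which directory names are safe to drop is an assumption on my part):

# Sketch: prune compiled files and test/doc directories before zipping.
import os
import shutil

PRUNE_DIRS = {'tests', 'test', 'doc', 'docs'}  # assumed safe to remove


def prune(package_dir):
    """Delete .pyc/.pyo files and test/doc directories under package_dir."""
    for root, dirs, files in os.walk(package_dir, topdown=True):
        for d in list(dirs):
            if d in PRUNE_DIRS:
                shutil.rmtree(os.path.join(root, d))
                dirs.remove(d)  # do not descend into what was just removed
        for f in files:
            if f.endswith(('.pyc', '.pyo')):
                os.remove(os.path.join(root, f))


prune('lambda')  # placeholder: the directory pip installed into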

Here's an example script that automates the above points.


Answered by Jay Carroll

I can confirm that the steps posted by @attila-tanyi work correctly under Amazon Linux. I would only add that there is no need to use an EC2 instance, as there is an Amazon Linux Docker container available from the default repository.


docker pull amazonlinux && docker run -it amazonlinux
# Follow @attila-tanyi steps
# Note - sudo is not necessary here

I use the Dockerfile embedded in my application to build and deploy to Lambda.


Answered by Abhishek Gaur

As of 2018, steps to install external modules for Python 3 on AWS EC2:


  1. Launch EC2 on Amazon Linux AMI 2017.09.

  2. SSH in with PuTTY using your private and public keys and become superuser.

  3. Install Python 3, create a virtual env, then make it the default:

    yum install python36 python36-virtualenv python36-pip
    
    virtualenv -p python3.6 /tmp/my_python_lib
    
    source /tmp/my_python_lib/bin/activate
    
    which python    # check which version is installed
    
    pip3 install  numpy
    
  4. Copy the files under site-packages and dist-packages to your local machine using WinSCP.

    To find the actual location, use grep:

      grep -r dist-packages *. 
    
    

These packages could be inside both lib and lib64.


  1. Site and dist packages will be under:

    /tmp/my_python_lib/lib64/python3.6,
    /tmp/my_python_lib/lib/python3.6
    
  2. Zip these packages along with your script file and upload the archive to S3, where it can be accessed by Lambda. Instead of zipping the root folder, you have to select all the files and zip them (or send them to a compressed folder); see the sketch after this list.

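To make sure the archive contains the package contents at its root rather than a wrapping folder, it can also be built programmatically. A hedged sketch (not from the answer; the source directories and handler filename are placeholders):

# Sketch: build the deployment zip with packages at the archive root.
import os
import zipfile

SOURCES = [
    '/tmp/my_python_lib/lib/python3.6/dist-packages',    # placeholder paths
    '/tmp/my_python_lib/lib64/python3.6/dist-packages',
]

with zipfile.ZipFile('deployment.zip', 'w', zipfile.ZIP_DEFLATED) as zf:
    for src in SOURCES:
        for root, _dirs, files in os.walk(src):
            for name in files:
                full = os.path.join(root, name)
                # arcname is relative to src, so packages land at the zip root
                zf.write(full, arcname=os.path.relpath(full, src))
    zf.write('lambda_function.py')  # placeholder: your handler script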

Additional tips:


  1. If you want to install all packages under one directory, you can use the command:

     pip install --upgrade --target=/tmp/my_python_lib/lib/python3.6/dist-packages pandas
    

Answered by karoli

As of August 2018, probably the easiest way is to start a new AWS Cloud9 environment. Then create a Lambda function inside the environment. Next, run this in the Cloud9 command line:


    cd YourApplicationName
    /venv/bin/pip install scipy -t .
    /venv/bin/pip install numpy -t .
    /venv/bin/pip install moviepy -t .

Now I am able to import the modules in the lambda_handler function.

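A minimal handler sketch (my assumption of what such a test looks like, not code from the original answer) to confirm that the vendored modules resolve inside the Lambda runtime:

import json

import moviepy
import numpy
import scipy


def lambda_handler(event, context):
    # Report versions to prove the bundled packages actually import.
    return {
        'statusCode': 200,
        'body': json.dumps({
            'numpy': numpy.__version__,
            'scipy': scipy.__version__,
            'moviepy': getattr(moviepy, '__version__', 'imported OK'),
        }),
    }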

Answered by Steinway Wu

Nov 2018. Hi friends, this post is extremely helpful for me. However, the answers so far are not very automated. I wrote a Python script and tutorial here https://gist.github.com/steinwaywhw/6a6a25d594cc07146c60af943f74c16f to automate the creation of compiled Python packages using pip and virtualenv on EC2. Everything is Python (Boto3): no bash script, no web console, no awscli.


There is one other change besides automation, which I think is an improvement. I downloaded the whole Python virtual environment from EC2, preserving its folder structure, instead of merging the lib and lib64 packages all together. I never understood the intended meaning of merging those two folders. What if some packages override other packages, right? Plus, faking an official virtual environment is nonetheless a safer way to go than rolling your own.


For the downloaded virtual environment to work, the source code of the Lambda function adds some boilerplate code to update the Python search path using sys.path. The intended sys.path of a Python virtual environment can be found by:


  • On your own machine, create a virtual environment and activate it.
  • Run a Python script in this virtual environment and do print(sys.path) after import sys. You can start from there and modify as you see fit.

A snippet of the boilerplate code to add to a Lambda function in order to load numpy and other packages from my packaged virtual environment is pasted below. In my case, I loaded pandas_datareader, which relies on numpy.


import os
import sys 

# https://docs.aws.amazon.com/lambda/latest/dg/current-supported-versions.html
workdir = os.getenv('LAMBDA_TASK_ROOT')
version = f'{sys.version_info[0]}.{sys.version_info[1]}'
additionals = [f'{workdir}/venv/lib64/python{version}/site-packages',
               f'{workdir}/venv/lib64/python{version}/lib-dynload',
               f'{workdir}/venv/lib64/python{version}/dist-packages',
               f'{workdir}/venv/lib/python{version}/dist-packages',
               f'{workdir}/venv/lib/python{version}/site-packages']
sys.path = additionals + sys.path

import pandas_datareader as pdr