How to avoid reinstalling packages when building Docker image for Python projects?

Notice: this page reproduces a popular Stack Overflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/25305788/

Date: 2020-08-18 20:01:48  Source: igfitidea

Tags: python, docker

Asked by satoru

My Dockerfile looks something like this:

FROM my/base

ADD . /srv
RUN pip install -r requirements.txt
RUN python setup.py install

ENTRYPOINT ["run_server"]

Every time I build a new image, the dependencies have to be reinstalled, which can be very slow in my region.

One way I can think of to cache packages that have already been installed is to overwrite the my/base image with newer images, like this:

docker build -t new_image_1 .
docker tag new_image_1 my/base

So the next time I build with this Dockerfile, my/base already has some packages installed.

But this solution has two problems:

  1. It is not always possible to override a base image.
  2. The base image grows bigger and bigger as newer images are layered on it.

So what better solution could I use to solve this problem?

EDIT:

Some information about the Docker installation on my machine:

$ docker version
Client version: 1.1.2
Client API version: 1.13
Go version (client): go1.2.1
Git commit (client): d84a070
Server version: 1.1.2
Server API version: 1.13
Go version (server): go1.2.1
Git commit (server): d84a070
$ docker info
Containers: 0
Images: 56
Storage Driver: aufs
 Root Dir: /var/lib/docker/aufs
 Dirs: 56
Execution Driver: native-0.2
Kernel Version: 3.13.0-29-generic
WARNING: No swap limit support

Accepted answer by nacyot

Try building with the Dockerfile below.

FROM my/base

WORKDIR /srv
ADD ./requirements.txt /srv/requirements.txt
RUN pip install -r requirements.txt
ADD . /srv
RUN python setup.py install

ENTRYPOINT ["run_server"]

If only files in . (your project) change, Docker skips the pip install line by using the cache.

Docker only runs pip install during the build when you edit the requirements.txt file.
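The caching behaviour described above can be illustrated with a toy model. This is only a sketch, not Docker's actual implementation: it assumes each layer's cache key is derived from the parent layer's key, the instruction text, and (for ADD) a checksum of the files being added. The names layer_key and build are hypothetical.

```python
import hashlib


def layer_key(parent_key, instruction, file_contents=()):
    """Toy cache key: parent key + instruction text + checksums of added files."""
    h = hashlib.sha256()
    h.update(parent_key.encode())
    h.update(instruction.encode())
    for content in file_contents:
        h.update(hashlib.sha256(content.encode()).digest())
    return h.hexdigest()


def build(steps, files, cache):
    """Run the steps in order; return the instructions that had to be re-executed.

    steps: list of (instruction, names of files the instruction adds)
    files: dict mapping file name -> content (the build context)
    cache: set of layer keys from previous builds (updated in place)
    """
    executed = []
    key = "base-image"
    for instruction, added in steps:
        key = layer_key(key, instruction, [files[name] for name in added])
        if key not in cache:
            executed.append(instruction)
            cache.add(key)
    return executed


steps = [
    ("ADD ./requirements.txt /srv/requirements.txt", ["requirements.txt"]),
    ("RUN pip install -r requirements.txt", []),
    ("ADD . /srv", ["requirements.txt", "run.py"]),
]
cache = set()
files = {"requirements.txt": "pytest==2.3.4\n", "run.py": 'print("Hello, World")\n'}

first = build(steps, files, cache)    # empty cache: every step runs
files["run.py"] = 'print("Hello, Python")\n'
second = build(steps, files, cache)   # only "ADD . /srv" runs again
files["requirements.txt"] += "ipython\n"
third = build(steps, files, cache)    # every step runs again
print(first, second, third, sep="\n")
```

Because each key includes the parent key, a change to requirements.txt invalidates the pip install step and everything after it, while a change to run.py only invalidates the final ADD.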



I wrote a simple Hello, World! program.

$ tree
.
├── Dockerfile
├── requirements.txt
└── run.py   

0 directories, 3 files

# Dockerfile

FROM dockerfile/python
WORKDIR /srv
ADD ./requirements.txt /srv/requirements.txt
RUN pip install -r requirements.txt
ADD . /srv
CMD python /srv/run.py

# requirements.txt
pytest==2.3.4

# run.py
print("Hello, World")

Below is the output.

Step 1 : WORKDIR /srv
---> Running in 22d725d22e10
---> 55768a00fd94
Removing intermediate container 22d725d22e10
Step 2 : ADD ./requirements.txt /srv/requirements.txt
---> 968a7c3a4483
Removing intermediate container 5f4e01f290fd
Step 3 : RUN pip install -r requirements.txt
---> Running in 08188205e92b
Downloading/unpacking pytest==2.3.4 (from -r requirements.txt (line 1))
  Running setup.py (path:/tmp/pip_build_root/pytest/setup.py) egg_info for package pytest
....
Cleaning up...
---> bf5c154b87c9
Removing intermediate container 08188205e92b
Step 4 : ADD . /srv
---> 3002a3a67e72
Removing intermediate container 83defd1851d0
Step 5 : CMD python /srv/run.py
---> Running in 11e69b887341
---> 5c0e7e3726d6
Removing intermediate container 11e69b887341
Successfully built 5c0e7e3726d6

I updated only run.py and tried to build again.

# run.py
print("Hello, Python")

Below is the output.

Sending build context to Docker daemon  5.12 kB
Sending build context to Docker daemon 
Step 0 : FROM dockerfile/python
---> f86d6993fc7b
Step 1 : WORKDIR /srv
---> Using cache
---> 55768a00fd94
Step 2 : ADD ./requirements.txt /srv/requirements.txt
---> Using cache
---> 968a7c3a4483
Step 3 : RUN pip install -r requirements.txt
---> Using cache
---> bf5c154b87c9
Step 4 : ADD . /srv
---> 9cc7508034d6
Removing intermediate container 0d7cf71eb05e
Step 5 : CMD python /srv/run.py
---> Running in f25c21135010
---> 4ffab7bc66c7
Removing intermediate container f25c21135010
Successfully built 4ffab7bc66c7

As you can see above, Docker uses the build cache. This time, I update requirements.txt.

# requirements.txt

pytest==2.3.4
ipython

Below is the output.

Sending build context to Docker daemon  5.12 kB
Sending build context to Docker daemon 
Step 0 : FROM dockerfile/python
---> f86d6993fc7b
Step 1 : WORKDIR /srv
---> Using cache
---> 55768a00fd94
Step 2 : ADD ./requirements.txt /srv/requirements.txt
---> b6c19f0643b5
Removing intermediate container a4d9cb37dff0
Step 3 : RUN pip install -r requirements.txt
---> Running in 4b7a85a64c33
Downloading/unpacking pytest==2.3.4 (from -r requirements.txt (line 1))
  Running setup.py (path:/tmp/pip_build_root/pytest/setup.py) egg_info for package pytest

Downloading/unpacking ipython (from -r requirements.txt (line 2))
Downloading/unpacking py>=1.4.12 (from pytest==2.3.4->-r requirements.txt (line 1))
  Running setup.py (path:/tmp/pip_build_root/py/setup.py) egg_info for package py

Installing collected packages: pytest, ipython, py
  Running setup.py install for pytest

Installing py.test script to /usr/local/bin
Installing py.test-2.7 script to /usr/local/bin
  Running setup.py install for py

Successfully installed pytest ipython py
Cleaning up...
---> 23a1af3df8ed
Removing intermediate container 4b7a85a64c33
Step 4 : ADD . /srv
---> d8ae270eca35
Removing intermediate container 7f003ebc3179
Step 5 : CMD python /srv/run.py
---> Running in 510359cf9e12
---> e42fc9121a77
Removing intermediate container 510359cf9e12
Successfully built e42fc9121a77

This time Docker does not use the build cache from the ADD ./requirements.txt step onward. If this doesn't work for you, check your Docker version.

Client version: 1.1.2
Client API version: 1.13
Go version (client): go1.2.1
Git commit (client): d84a070
Server version: 1.1.2
Server API version: 1.13
Go version (server): go1.2.1
Git commit (server): d84a070

Answer by jaywhy13

I found that a better way is to just add the Python site-packages directory as a volume.

services:
    web:
        build: .
        command: python manage.py runserver 0.0.0.0:8000
        volumes:
            - .:/code
            -  /usr/local/lib/python2.7/site-packages/

This way I can just pip install new libraries without having to do a full rebuild.

EDIT: Disregard this answer; jkukul's answer worked for me. My intent was to cache the site-packages folder. That would have looked more like:

volumes:
   - .:/code
   - ./cached-packages:/usr/local/lib/python2.7/site-packages/

Caching the download folder is a lot cleaner, though. That also caches the wheels, so it properly achieves the task.

Answer by Jakub Kukul

To minimise network activity, you could point pip to a cache directory on your host machine.

Run your Docker container with your host's pip cache directory bind-mounted into your container's pip cache directory. The docker run command should look like this:

docker run -v $HOME/.cache/pip-docker/:/root/.cache/pip image_1
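
If you launch the container from a script, the same command can be assembled programmatically. This is a minimal sketch, assuming the image name image_1 and the paths from the command above; the helper name pip_cache_run_command is hypothetical, and creating the host directory up front is an addition of this sketch so that the directory exists before it is mounted.

```python
import os


def pip_cache_run_command(image, host_cache=None):
    """Build the `docker run` argument list with the host pip cache bind-mounted."""
    if host_cache is None:
        # Default mirrors $HOME/.cache/pip-docker from the command above.
        host_cache = os.path.join(os.path.expanduser("~"), ".cache", "pip-docker")
    # Create the host directory first so it exists before it is mounted.
    os.makedirs(host_cache, exist_ok=True)
    return ["docker", "run", "-v", f"{host_cache}:/root/.cache/pip", image]
```

The returned list can be passed to subprocess.run, for example subprocess.run(pip_cache_run_command("image_1"), check=True).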

Then, in your Dockerfile, install your requirements as part of the ENTRYPOINT statement (or CMD statement) instead of as a RUN command. This is important because (as pointed out in the comments) the mount is not available during image building (when RUN statements are executed). The Dockerfile should look like this:

FROM my/base

ADD . /srv

ENTRYPOINT ["sh", "-c", "pip install -r requirements.txt && python setup.py install && run_server"]

Answer by Andy Shinn

I understand this question has some popular answers already. But there is a newer way to cache files for package managers. I think it could be a good answer in the future when BuildKit becomes more standard.

As of Docker 18.09 there is experimental support for BuildKit. BuildKit adds support for some new features in the Dockerfile, including experimental support for mounting external volumes into RUN steps. This allows us to create caches for things like $HOME/.cache/pip/.

We'll use the following requirements.txt file as an example:

Click==7.0
Django==2.2.3
django-appconf==1.0.3
django-compressor==2.3
django-debug-toolbar==2.0
django-filter==2.2.0
django-reversion==3.0.4
django-rq==2.1.0
pytz==2019.1
rcssmin==1.0.6
redis==3.3.4
rjsmin==1.1.0
rq==1.1.0
six==1.12.0
sqlparse==0.3.0

A typical example Python Dockerfile might look like:

FROM python:3.7
WORKDIR /usr/src/app
COPY requirements.txt /usr/src/app/
RUN pip install -r requirements.txt
COPY . /usr/src/app

With BuildKit enabled using the DOCKER_BUILDKIT environment variable, we can build the uncached pip step in about 65 seconds:

$ export DOCKER_BUILDKIT=1
$ docker build -t test .
[+] Building 65.6s (10/10) FINISHED                                                                                                                                             
 => [internal] load .dockerignore                                                                                                                                          0.0s
 => => transferring context: 2B                                                                                                                                            0.0s
 => [internal] load build definition from Dockerfile                                                                                                                       0.0s
 => => transferring dockerfile: 120B                                                                                                                                       0.0s
 => [internal] load metadata for docker.io/library/python:3.7                                                                                                              0.5s
 => CACHED [1/4] FROM docker.io/library/python:3.7@sha256:6eaf19442c358afc24834a6b17a3728a45c129de7703d8583392a138ecbdb092                                                 0.0s
 => [internal] load build context                                                                                                                                          0.6s
 => => transferring context: 899.99kB                                                                                                                                      0.6s
 => CACHED [internal] helper image for file operations                                                                                                                     0.0s
 => [2/4] COPY requirements.txt /usr/src/app/                                                                                                                              0.5s
 => [3/4] RUN pip install -r requirements.txt                                                                                                                             61.3s
 => [4/4] COPY . /usr/src/app                                                                                                                                              1.3s
 => exporting to image                                                                                                                                                     1.2s
 => => exporting layers                                                                                                                                                    1.2s
 => => writing image sha256:d66a2720e81530029bf1c2cb98fb3aee0cffc2f4ea2aa2a0760a30fb718d7f83                                                                               0.0s
 => => naming to docker.io/library/test                                                                                                                                    0.0s

Now, let us add the experimental header and modify the RUN step to cache the Python packages:

# syntax=docker/dockerfile:experimental

FROM python:3.7
WORKDIR /usr/src/app
COPY requirements.txt /usr/src/app/
RUN --mount=type=cache,target=/root/.cache/pip pip install -r requirements.txt
COPY . /usr/src/app

Go ahead and do another build now. It should take about the same amount of time. But this time it is caching the Python packages in our new cache mount:

$ docker build -t pythontest .
[+] Building 60.3s (14/14) FINISHED                                                                                                                                             
 => [internal] load build definition from Dockerfile                                                                                                                       0.0s
 => => transferring dockerfile: 120B                                                                                                                                       0.0s
 => [internal] load .dockerignore                                                                                                                                          0.0s
 => => transferring context: 2B                                                                                                                                            0.0s
 => resolve image config for docker.io/docker/dockerfile:experimental                                                                                                      0.5s
 => CACHED docker-image://docker.io/docker/dockerfile:experimental@sha256:9022e911101f01b2854c7a4b2c77f524b998891941da55208e71c0335e6e82c3                                 0.0s
 => [internal] load .dockerignore                                                                                                                                          0.0s
 => [internal] load build definition from Dockerfile                                                                                                                       0.0s
 => => transferring dockerfile: 120B                                                                                                                                       0.0s
 => [internal] load metadata for docker.io/library/python:3.7                                                                                                              0.5s
 => CACHED [1/4] FROM docker.io/library/python:3.7@sha256:6eaf19442c358afc24834a6b17a3728a45c129de7703d8583392a138ecbdb092                                                 0.0s
 => [internal] load build context                                                                                                                                          0.7s
 => => transferring context: 899.99kB                                                                                                                                      0.6s
 => CACHED [internal] helper image for file operations                                                                                                                     0.0s
 => [2/4] COPY requirements.txt /usr/src/app/                                                                                                                              0.6s
 => [3/4] RUN --mount=type=cache,target=/root/.cache/pip pip install -r requirements.txt                                                                                  53.3s
 => [4/4] COPY . /usr/src/app                                                                                                                                              2.6s
 => exporting to image                                                                                                                                                     1.2s
 => => exporting layers                                                                                                                                                    1.2s
 => => writing image sha256:0b035548712c1c9e1c80d4a86169c5c1f9e94437e124ea09e90aea82f45c2afc                                                                               0.0s
 => => naming to docker.io/library/test                                                                                                                                    0.0s

About 60 seconds. Similar to our first build.

Make a small change to requirements.txt (such as adding a new line between two packages) to force a cache invalidation and run again:

$ docker build -t pythontest .
[+] Building 15.9s (14/14) FINISHED                                                                                                                                             
 => [internal] load build definition from Dockerfile                                                                                                                       0.0s
 => => transferring dockerfile: 120B                                                                                                                                       0.0s
 => [internal] load .dockerignore                                                                                                                                          0.0s
 => => transferring context: 2B                                                                                                                                            0.0s
 => resolve image config for docker.io/docker/dockerfile:experimental                                                                                                      1.1s
 => CACHED docker-image://docker.io/docker/dockerfile:experimental@sha256:9022e911101f01b2854c7a4b2c77f524b998891941da55208e71c0335e6e82c3                                 0.0s
 => [internal] load build definition from Dockerfile                                                                                                                       0.0s
 => => transferring dockerfile: 120B                                                                                                                                       0.0s
 => [internal] load .dockerignore                                                                                                                                          0.0s
 => [internal] load metadata for docker.io/library/python:3.7                                                                                                              0.5s
 => CACHED [1/4] FROM docker.io/library/python:3.7@sha256:6eaf19442c358afc24834a6b17a3728a45c129de7703d8583392a138ecbdb092                                                 0.0s
 => CACHED [internal] helper image for file operations                                                                                                                     0.0s
 => [internal] load build context                                                                                                                                          0.7s
 => => transferring context: 899.99kB                                                                                                                                      0.7s
 => [2/4] COPY requirements.txt /usr/src/app/                                                                                                                              0.6s
 => [3/4] RUN --mount=type=cache,target=/root/.cache/pip pip install -r requirements.txt                                                                                   8.8s
 => [4/4] COPY . /usr/src/app                                                                                                                                              2.1s
 => exporting to image                                                                                                                                                     1.1s
 => => exporting layers                                                                                                                                                    1.1s
 => => writing image sha256:fc84cd45482a70e8de48bfd6489e5421532c2dd02aaa3e1e49a290a3dfb9df7c                                                                               0.0s
 => => naming to docker.io/library/test                                                                                                                                    0.0s

Only about 16 seconds!

We are getting this speedup because we are no longer downloading all the Python packages. They were cached by the package manager (pip in this case) and stored in a cache volume mount. The volume mount is provided to the RUN step so that pip can reuse our already downloaded packages. This happens outside any Docker layer caching.

The gains should be even greater with a larger requirements.txt.

Notes:

  • This is experimental Dockerfile syntax and should be treated as such. You may not want to build with this in production at the moment.
  • Originally, the BuildKit features did not work under Docker Compose or other tools that use the Docker API directly. Docker Compose supports them as of 1.25.0. See "How do you enable BuildKit with docker-compose?"
  • There isn't any direct interface for managing the cache at the moment. It is purged when you run docker system prune -a.

Hopefully, these features will make it into Docker for building, and BuildKit will become the default. If and when that happens, I will try to update this answer.