Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license and attribute it to the original authors (not me). Original StackOverflow address: http://stackoverflow.com/questions/50190676/

Date: 2020-09-14 05:32:04  Source: igfitidea

Install pandas in a Dockerfile

Tags: python, pandas, docker, dockerfile, python-3.6

Asked by ccasimiro9444

I am trying to create a Docker image. The Dockerfile is the following:

# Use the official Python 3.6.5 image
FROM python:3.6.5-alpine3.7

# Set the working directory to /app
WORKDIR /app

# Get the 
COPY requirements.txt /app
RUN pip3 install --no-cache-dir -r requirements.txt

# Configuring access to Jupyter
RUN mkdir /notebooks
RUN jupyter notebook --no-browser --ip 0.0.0.0 --port 8888 /notebooks

The requirements.txt file is:

jupyter
numpy==1.14.3
pandas==0.23.0rc2
scipy==1.0.1
scikit-learn==0.19.1
pillow==5.1.1
matplotlib==2.2.2
seaborn==0.8.1

Running the command docker build -t standard . gives me an error when Docker tries to install pandas. The error is the following:

Collecting pandas==0.23.0rc2 (from -r requirements.txt (line 3))
  Downloading https://files.pythonhosted.org/packages/46/5c/a883712dad8484ef907a2f42992b122acf2bcecbb5c2aa751d1033908502/pandas-0.23.0rc2.tar.gz (12.5MB)
    Complete output from command python setup.py egg_info:
    /bin/sh: svnversion: not found
    /bin/sh: svnversion: not found
    non-existing path in 'numpy/distutils': 'site.cfg'
    Could not locate executable gfortran
    ... (loads of other stuff)
    Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-install-xb6f6a5o/pandas/
The command '/bin/sh -c pip3 install --no-cache-dir -r requirements.txt' returned a non-zero code: 1

When I try to install an earlier version, pandas==0.22.0, I get this error:

Step 5/7 : RUN pip3 install --no-cache-dir -r requirements.txt
 ---> Running in 5810ea896689
Collecting jupyter (from -r requirements.txt (line 1))
  Downloading https://files.pythonhosted.org/packages/83/df/0f5dd132200728a86190397e1ea87cd76244e42d39ec5e88efd25b2abd7e/jupyter-1.0.0-py2.py3-none-any.whl
Collecting numpy==1.14.3 (from -r requirements.txt (line 2))
  Downloading https://files.pythonhosted.org/packages/b0/2b/497c2bb7c660b2606d4a96e2035e92554429e139c6c71cdff67af66b58d2/numpy-1.14.3.zip (4.9MB)
Collecting pandas==0.22.0 (from -r requirements.txt (line 3))
  Downloading https://files.pythonhosted.org/packages/08/01/803834bc8a4e708aedebb133095a88a4dad9f45bbaf5ad777d2bea543c7e/pandas-0.22.0.tar.gz (11.3MB)
  Could not find a version that satisfies the requirement Cython (from versions: )
No matching distribution found for Cython
The command '/bin/sh -c pip3 install --no-cache-dir -r requirements.txt' returned a non-zero code: 1

I also tried to install Cython and setuptools before pandas, but it gave the same "No matching distribution found for Cython" error at the pip3 install pandas line.

How can I get pandas installed?

Accepted answer by ccasimiro9444

I was able to build the Docker image in the end. There must have been some version incompatibility between FROM python:3.6.5-alpine3.7 and pandas. I changed the base image to FROM python:3, and then it worked fine (I also had to downgrade the pillow version to 5.1.0).
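
As a rough sketch of that change, the install portion of the question's Dockerfile might look like the following with only the base image swapped. A likely reason it helps is that the python:3 image is Debian-based (glibc), so pip can usually download prebuilt wheels instead of compiling numpy and pandas from source as it had to on Alpine. Note that python:3 is a moving tag, so the Python version it resolves to is an assumption that changes over time.

```dockerfile
# Debian-based official image instead of the Alpine variant
FROM python:3

# Set the working directory to /app
WORKDIR /app

# Install the pinned packages; requirements.txt is the file from the question,
# with pillow changed to 5.1.0 as described above
COPY requirements.txt /app
RUN pip3 install --no-cache-dir -r requirements.txt
```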

Answer by Aviv Sela

Alpine does not contain build tools by default. Install the build tools and create a symbolic link for the locale header:

$ apk add --update curl gcc g++
$ ln -s /usr/include/locale.h /usr/include/xlocale.h
$ pip install numpy

Based on https://wired-world.com/?p=100
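
The same commands could be placed in a Dockerfile roughly as follows. This is a minimal sketch that assumes the Alpine base image from the question; the package list is exactly the one given above and may need extending for other libraries.

```dockerfile
# Base image taken from the question
FROM python:3.6.5-alpine3.7

# Install the compilers, then link locale.h to the xlocale.h name
# that some build scripts look for
RUN apk add --update curl gcc g++ \
    && ln -s /usr/include/locale.h /usr/include/xlocale.h

# numpy is now compiled from source inside the image
RUN pip install numpy
```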

Answer by kevayacht

I realize this question has been answered, but I recently had a similar issue with numpy and pandas dependencies in a dockerized project. I hope this will be of benefit to someone in the future.

My solution:

As pointed out by Aviv Sela, Alpine does not contain build tools by default, so they need to be added through the Dockerfile. Below is my Dockerfile with the build packages required for numpy and pandas to be installed successfully in an Alpine container.

FROM python:3.6-alpine3.7

RUN apk add --no-cache --update \
    python3 python3-dev gcc \
    gfortran musl-dev g++ \
    libffi-dev openssl-dev \
    libxml2 libxml2-dev \
    libxslt libxslt-dev \
    libjpeg-turbo-dev zlib-dev

RUN pip install --upgrade pip

ADD requirements.txt .
RUN pip install -r requirements.txt

The requirements.txt:

numpy==1.17.1
pandas==0.25.1
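
If image size matters, one possible variation (not part of the answer above) is to install the compilers as a temporary group and delete them in the same layer once pip has finished. The sketch below assumes that only numpy and pandas are needed, so it uses a shorter package list; the group name .build-deps is arbitrary, and keeping libstdc++ as a runtime package is an assumption based on pandas shipping compiled C++ extensions.

```dockerfile
FROM python:3.6-alpine3.7

COPY requirements.txt .

# libstdc++ stays in the image because the compiled extensions link against it.
# The compilers live in a temporary group named .build-deps (any name works)
# and are removed again in the same RUN step, so they never reach the final layer.
RUN apk add --no-cache libstdc++ \
    && apk add --no-cache --virtual .build-deps gcc g++ gfortran musl-dev \
    && pip install --no-cache-dir -r requirements.txt \
    && apk del .build-deps
```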

Answer by Rebeku

You're probably going to be better off building from a pandas image instead of a base Python image. This will make iteration much faster and easier, because you won't ever have to reinstall pandas. I like amancevice/pandas (https://hub.docker.com/r/amancevice/pandas/tags). There are Alpine and Debian images available for every pandas tag, although I think they may all be Python 3.7 now.
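
A Dockerfile built on that image might look like the sketch below. PANDAS_TAG is a hypothetical build argument introduced here for illustration, and its default of latest is an assumption; check the tags page linked above for the tag that matches the pandas version you need.

```dockerfile
# PANDAS_TAG is a placeholder; choose a real tag from the image's tags page
ARG PANDAS_TAG=latest
FROM amancevice/pandas:${PANDAS_TAG}

WORKDIR /app

# pandas already ships with the base image, so requirements.txt
# only needs to list the remaining packages
COPY requirements.txt /app
RUN pip3 install --no-cache-dir -r requirements.txt
```

A specific tag can then be chosen at build time with docker build --build-arg PANDAS_TAG=<tag> -t standard .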

Answer by jersey bean

Using a new version of Python that is not yet supported by pandas will result in problems.

I found it does not work with a development version of Python:

FROM python:3.9.0a6-buster


RUN apt-get update && \
    apt-get -y install python3-pandas

COPY requirements.txt ./
RUN pip3 install --no-cache-dir -r requirements.txt

requirements.txt:

numpy==1.18
pandas

I found it DOES work with an officially released version of Python:

FROM python:3.8-buster