Speeding up the import of Python modules

Question

Asked by RAAC

The question of how to speed up importing of Python modules has been asked previously (Speeding up the python "import" loader and Python -- Speed Up Imports?) but without specific examples, and it has not yielded accepted solutions. I will therefore take up the issue again here, but this time with a specific example.

I have a Python script that loads a 3-D image stack from disk, smooths it, and displays it as a movie. I call this script from the system command prompt when I want to quickly view my data. I'm OK with the 700 ms it takes to smooth the data as this is comparable to MATLAB. However, it takes an additional 650 ms to import the modules. So from the user's perspective the Python code runs at half the speed.

This is the series of modules I'm importing:

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.animation as animation
import scipy.ndimage
import scipy.signal
import sys
import os

Of course, not all modules are equally slow to import. The chief culprits are:

matplotlib.pyplot   [300ms]
numpy               [110ms]
scipy.signal        [200ms]
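
To reproduce per-module numbers like the ones above, a minimal sketch follows. `timed_import` is a hypothetical helper name, not something from the original script, and `json` and `decimal` are only stand-ins for the real modules. The figures are only meaningful in a fresh interpreter, since a module that is already loaded comes back almost instantly.

```python
import importlib
import time


def timed_import(module_name):
    """Import a module by name and return (module, elapsed_seconds)."""
    start = time.perf_counter()
    module = importlib.import_module(module_name)
    elapsed = time.perf_counter() - start
    return module, elapsed


if __name__ == "__main__":
    # Stand-in module names; replace with e.g. "numpy" or "scipy.signal".
    for name in ["json", "decimal"]:
        _, seconds = timed_import(name)
        print(f"{name:<20}[{seconds * 1000:.0f}ms]")
```

Because imports are cached in `sys.modules`, the order matters: a package imported earlier in the loop makes later packages that depend on it look cheaper than they are.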

I have experimented with using from, but this isn't any faster. Since Matplotlib is the main culprit and it's got a reputation for slow screen updates, I looked for alternatives. One is PyQtGraph, but that takes 550 ms to import.
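
A likely reason `from` makes no difference: Python still has to execute the whole module, and everything that module imports, before it can hand back a single name. A small sketch, using the standard-library `json` package as a stand-in for the heavier libraries:

```python
import sys

# Asking for one name still runs the package's top-level code.
from json import dumps

# The full package and the submodules it pulls in are now loaded,
# just as they would be after a plain "import json".
print("json" in sys.modules)          # True
print("json.decoder" in sys.modules)  # True
```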

I am aware of one obvious solution, which is to call my function from an interactive Python session rather than the system command prompt. This is fine, but it's too MATLAB-like; I'd prefer the elegance of having my function available from the system prompt.

I'm new to Python and I'm not sure how to proceed at this point. Since I'm new, I'd appreciate links on how to implement proposed solutions. Ideally, I'm looking for a simple solution (aren't we all!) because the code needs to be portable between multiple Mac and Linux machines.

Answer 1

Accepted answer by Andrea Zonca

You could build a simple server/client: the server runs continuously, making and updating the plot, and the client just tells it which file to process next.

I wrote a simple server/client example based on the basic example from the socket module docs: http://docs.python.org/2/library/socket.html#example

Here is server.py:

# expensive imports
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.animation as animation
import scipy.ndimage
import scipy.signal
import sys
import os

# Echo server program
import socket

HOST = ''                 # Symbolic name meaning all available interfaces
PORT = 50007              # Arbitrary non-privileged port
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.bind((HOST, PORT))
s.listen(1)
while True:
    conn, addr = s.accept()
    print('Connected by', addr)
    data = conn.recv(1024)  # filename sent by the client, as bytes
    if not data:
        conn.close()
        break
    conn.sendall(b"PLOTTING:" + data)
    # update plot
    conn.close()

And client.py:

# Echo client program
import socket
import sys

HOST = 'localhost'        # The server runs on the same machine
PORT = 50007              # The same port as used by the server
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect((HOST, PORT))
s.sendall(sys.argv[1].encode())  # sockets carry bytes, so encode the filename
data = s.recv(1024)
s.close()
print('Received', repr(data))

You just run the server:

python server.py

This does the imports once. The client then just sends the filename of the new file to plot over the socket:

python client.py mytextfile.txt

Then the server updates the plot.

On my machine, running your imports takes 0.6 seconds, while running client.py takes 0.03 seconds.
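
For a quick check that the round trip works without opening two terminals, here is a sketch that serves a single request from a background thread. The names `serve_once`, `send_request` and `demo` are made up for illustration, the `handle` callback is only a placeholder for the real plotting code, and the port is picked by the OS instead of being fixed at 50007.

```python
import socket
import threading


def serve_once(server_sock, handle):
    """Accept one connection, pass the received filename to handle(), reply."""
    conn, _addr = server_sock.accept()
    with conn:
        # Assumes the filename fits in a single 1024-byte read.
        filename = conn.recv(1024).decode("utf-8")
        conn.sendall(handle(filename).encode("utf-8"))


def send_request(port, filename, host="127.0.0.1"):
    """Client side: send a filename and return the server's reply."""
    with socket.create_connection((host, port)) as sock:
        sock.sendall(filename.encode("utf-8"))
        return sock.recv(1024).decode("utf-8")


def demo():
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0 lets the OS choose a free port
    server.listen(1)
    port = server.getsockname()[1]
    worker = threading.Thread(
        target=serve_once,
        args=(server, lambda name: "PLOTTING:" + name),  # placeholder handler
    )
    worker.start()
    reply = send_request(port, "mytextfile.txt")
    worker.join()
    server.close()
    return reply
```

In a real setup the placeholder lambda would be replaced by the function that loads the file and redraws the figure, and `serve_once` would be called in a loop.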

Answer 2

Answer by ElectricWarr

1.35 seconds isn't long, but I suppose if you're used to half that for a "quick check" then perhaps it seems so.

Andrea suggests a simple client/server setup, but it seems to me that you could just as easily call a very slight modification of your script and keep its console window open while you work:

Call the script, which does the imports then waits for input
Minimize the console window, switch to your work, whatever: *Do work*
Select the console again
Provide the script with some sort of input
Receive the results with no import overhead
Switch away from the script again while it happily awaits input

I assume your script is identical every time, i.e. you don't need to give it an image stack location or any particular commands each time (but these are easy to do as well!).

Example RAAC's_Script.py:

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.animation as animation
import scipy.ndimage
import scipy.signal
import sys
import os

print('********* RAAC\'s Script Now Running *********')

while True: # Loops forever
    # Display a message and wait for user to enter text followed by enter key.
    # In this case, we're not expecting any text at all and if there is any it's ignored
    input('Press Enter to test image stack...')

    '''
    *
    *
    **RAAC's Code Goes Here** (Make sure it's indented/inside the while loop!)
    *
    *
    '''

To end the script, close the console window or press ctrl+c.

I've made this as simple as possible, but it would require very little extra to handle things like quitting nicely, doing slightly different things based on input, etc.
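
As a rough sketch of those extras, the loop below takes a file path on each prompt and exits cleanly on `q`, `quit`, Ctrl+D or Ctrl+C. `run_loop` and its `process` argument are hypothetical names; `process` stands for whatever function loads and displays one image stack.

```python
def run_loop(process, read_line=input):
    """Call process(path) for every path typed; stop on 'q', 'quit' or EOF.

    Returns the number of paths that were processed.
    """
    handled = 0
    while True:
        try:
            line = read_line("Image stack path (or 'q' to quit): ").strip()
        except (EOFError, KeyboardInterrupt):
            break  # Ctrl+D / Ctrl+C end the session without a traceback
        if line.lower() in ("q", "quit"):
            break
        if not line:
            continue  # ignore empty input and ask again
        process(line)
        handled += 1
    return handled
```

Placed after the expensive imports, a call such as `run_loop(show_stack)` (with `show_stack` being your own function) keeps the process alive between files, so the import cost is paid only once.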

Answer 3

Answer by phil294

You can import your modules manually instead, using imp. See the imp module documentation.

For example, import numpy as np could probably be written as

import imp
np = imp.load_module("numpy",None,"/usr/lib/python2.7/dist-packages/numpy",('','',5))

This will spare Python from browsing your entire sys.path to find the desired packages.
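
Note that `imp` has been deprecated since Python 3.4 and was removed in Python 3.12. On current versions the same idea can be sketched with `importlib`, as below. `import_from_path` is a hypothetical helper name, and the sketch targets a single-file module. It only skips the search through sys.path; the module's own code still has to run, so how much time it saves is something to measure rather than assume.

```python
import importlib.util
import sys


def import_from_path(module_name, file_path):
    """Load a module straight from file_path, without searching sys.path."""
    spec = importlib.util.spec_from_file_location(module_name, file_path)
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot load {module_name!r} from {file_path!r}")
    module = importlib.util.module_from_spec(spec)
    sys.modules[module_name] = module  # register first, as a normal import does
    spec.loader.exec_module(module)    # run the module's top-level code
    return module
```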

Answer 4

Answer by Nico Schlömer

Not an actual answer to the question, but a hint on how to profile the import speed with Python 3.7 and tuna (a small project of mine):

python3.7 -X importtime -c "import scipy" 2> scipy.log
tuna scipy.log
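
If you only want a plain-text ranking from the same log, a small parser is enough. This is a sketch that assumes the three-column layout CPython writes for `-X importtime` (self time and cumulative time in microseconds, then the module name); `parse_importtime` and `slowest` are made-up helper names, and the sample numbers are illustrative, not measurements.

```python
import re

# Matches data lines such as "import time:       120 |        120 |   json.decoder".
# The header line has no numbers in the first column, so it never matches.
_ROW = re.compile(r"^import time:\s+(\d+)\s*\|\s*(\d+)\s*\|\s*(.+?)\s*$")


def parse_importtime(log_text):
    """Return a list of (module_name, self_us, cumulative_us) tuples."""
    rows = []
    for line in log_text.splitlines():
        match = _ROW.match(line)
        if match:
            self_us, cumulative_us, name = match.groups()
            rows.append((name, int(self_us), int(cumulative_us)))
    return rows


def slowest(rows, count=5):
    """Return the rows with the largest cumulative time, slowest first."""
    return sorted(rows, key=lambda row: row[2], reverse=True)[:count]


# Illustrative sample in the assumed format (numbers are invented).
SAMPLE = """import time: self [us] | cumulative | imported package
import time:       120 |        120 |   json.decoder
import time:        80 |         80 |   json.encoder
import time:       100 |        300 | json
"""
```

For the log produced above, something like `slowest(parse_importtime(open("scipy.log").read()))` would list the five most expensive imports.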
