Original question: http://stackoverflow.com/questions/487224/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Reducing Django Memory Usage. Low hanging fruit?
Asked by Andy Baker
My memory usage increases over time and restarting Django is not kind to users.
I am unsure how to go about profiling the memory usage but some tips on how to start measuring would be useful.
I have a feeling that there are some simple steps that could produce big gains. Ensuring 'debug' is set to 'False' is an obvious biggie.
Can anyone suggest others? How much improvement would caching give on low-traffic sites?
In this case I'm running under Apache 2.x with mod_python. I've heard mod_wsgi is a bit leaner but it would be tricky to switch at this stage unless I know the gains would be significant.
Edit: Thanks for the tips so far. Any suggestions how to discover what's using up the memory? Are there any guides to Python memory profiling?
Also as mentioned there's a few things that will make it tricky to switch to mod_wsgi so I'd like to have some idea of the gains I could expect before ploughing forwards in that direction.
Edit: Carl posted a slightly more detailed reply here that is worth reading: Django Deployment: Cutting Apache's Overhead
Edit: Graham Dumpleton's article is the best I've found on the MPM and mod_wsgi related stuff. I am rather disappointed, though, that no one could provide any info on debugging the memory usage in the app itself.
Final Edit: Well, I have been discussing this with Webfaction to see if they could assist with recompiling Apache, and this is their word on the matter:
"I really don't think that you will get much of a benefit by switching to an MPM Worker + mod_wsgi setup. I estimate that you might be able to save around 20MB, but probably not much more than that."
So! This brings me back to my original question (which I am still none the wiser about). How does one go about identifying where the problem lies? It's a well-known maxim that you don't optimize without first measuring to see where you need to optimize, but there is very little in the way of tutorials on measuring Python memory usage and none at all specific to Django.
Thanks for everyone's assistance but I think this question is still open!
Another final edit ;-)
I asked this on the django-users list and got some very helpful replies.
Honestly the last update ever!
This was just released. Could be the best solution yet: Profiling Django object size and memory usage with Pympler
Accepted answer by nosklo
Make sure you are not keeping global references to data. That prevents the python garbage collector from releasing the memory.
Don't use mod_python. It loads an interpreter inside Apache. If you need to use Apache, use mod_wsgi instead. It is not tricky to switch. It is very easy. mod_wsgi is way easier to configure for Django than brain-dead mod_python.
If you can remove Apache from your requirements, that would be even better for your memory usage. Spawning seems to be the new fast, scalable way to run Python web applications.
EDIT: I don't see how switching to mod_wsgi could be "tricky". It should be a very easy task. Please elaborate on the problem you are having with the switch.
Answer by Van Gale
If you are running under mod_wsgi (and presumably under Spawning too, since it is WSGI compliant), you can use Dozer to look at your memory usage.
Under mod_wsgi just add this at the bottom of your WSGI script:
from dozer import Dozer
application = Dozer(application)
Then point your browser at http://domain/_dozer/index to see a list of all your memory allocations.
I'll also just add my voice of support for mod_wsgi. It makes a world of difference in terms of performance and memory usage over mod_python. Graham Dumpleton's support for mod_wsgi is outstanding, both in terms of active development and in helping people on the mailing list to optimize their installations. David Cramer at curse.com has posted some charts (which I can't seem to find now, unfortunately) showing the drastic reduction in CPU and memory usage after they switched to mod_wsgi on that high-traffic site. Several of the Django devs have switched. Seriously, it's a no-brainer :)
Answer by Pankrat
These are the Python memory profiler solutions I'm aware of (not Django related):
Disclaimer: I have a stake in the latter.
Each project's documentation should give you an idea of how to use these tools to analyze the memory behavior of Python applications.
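As one concrete starting point for measurement, here is a minimal sketch using the standard-library `tracemalloc` module (Python 3.4 and later). It is not one of the tools this answer refers to, and `build_big_list` is a hypothetical stand-in for the application code you want to inspect.

```python
import tracemalloc


def build_big_list(n):
    # Hypothetical stand-in for application code that allocates memory.
    return [str(i) * 10 for i in range(n)]


tracemalloc.start()
before = tracemalloc.take_snapshot()

data = build_big_list(50_000)

after = tracemalloc.take_snapshot()
tracemalloc.stop()

# Rank source lines by how much more memory they hold after the call.
top_stats = after.compare_to(before, "lineno")
for stat in top_stats[:5]:
    print(stat)

growth = sum(stat.size_diff for stat in top_stats)
print(f"net growth: {growth / 1024:.1f} KiB")
```

Taking one snapshot before and one after a suspect code path, then comparing them, tends to be more informative than a single snapshot, because it filters out memory that was already allocated at startup.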
The following is a nice "war story" that also gives some helpful pointers:
Answer by zgoda
Additionally, check that you are not using any known leakers. MySQLdb is known to leak enormous amounts of memory with Django due to a bug in Unicode handling. Other than that, Django Debug Toolbar might help you to track down the hogs.
Answer by Carl Meyer
In addition to not keeping around global references to large data objects, try to avoid loading large datasets into memory at all wherever possible.
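One way to avoid loading a large dataset at once is to process it in fixed-size chunks. The sketch below is plain Python; `fetch_rows`, `in_chunks`, and `total_value` are hypothetical helpers made up for this example. In Django itself, `QuerySet.iterator()` serves a similar purpose for database results.

```python
from itertools import islice


def fetch_rows(total):
    """Hypothetical data source that yields rows one at a time."""
    for i in range(total):
        yield {"id": i, "value": i * 2}


def in_chunks(iterable, size):
    """Yield lists of at most `size` items without materialising the input."""
    iterator = iter(iterable)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def total_value(total, chunk_size=1000):
    running = 0
    for chunk in in_chunks(fetch_rows(total), chunk_size):
        # Only `chunk_size` rows are alive at any moment.
        running += sum(row["value"] for row in chunk)
    return running


print(total_value(10_000))
```

The trade-off is that each chunk is discarded after use, so anything needed later has to be reduced to a small summary (here, a running sum) rather than kept as full rows.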
Switch to mod_wsgi in daemon mode, and use Apache's worker MPM instead of prefork. The latter step can allow you to serve many more concurrent users with much less memory overhead.
Answer by Jason Baker
Webfaction actually has some tips for keeping Django memory usage down.
The major points:
- Make sure debug is set to false (you already know that).
- Use "ServerLimit" in your apache config
- Check that no big objects are being loaded in memory
- Consider serving static content in a separate process or server.
- Use "MaxRequestsPerChild" in your apache config
- Find out and understand how much memory you're using
Answer by AdamKG
Another plus for mod_wsgi: set a maximum-requests parameter in your WSGIDaemonProcess directive and mod_wsgi will restart the daemon process every so often. There should be no visible effect for the user, other than a slow page load the first time a fresh process is hit, as it'll be loading Django and your application code into memory.
But even if you do have memory leaks, that should keep the process size from getting too large, without having to interrupt service to your users.
Answer by Staale
Here is the script I use for mod_wsgi (called wsgi.py, and put in the root of my Django project):
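import os
import sys
from os import path

import django.core.handlers.wsgi

# Discard anything written to stdout/stderr; use logging instead.
sys.stdout = open('/dev/null', 'a+')
sys.stderr = open('/dev/null', 'a+')

# Make the directory above this file importable so the project package resolves.
sys.path.append(path.join(path.dirname(__file__), '..'))
os.environ['DJANGO_SETTINGS_MODULE'] = 'myproject.settings'

# WSGI entry point; mod_wsgi looks for a callable named "application".
application = django.core.handlers.wsgi.WSGIHandler()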
Adjust myproject.settings and the path as needed. I redirect all output to /dev/null since mod_wsgi by default prevents printing. Use logging instead.
For apache:
<VirtualHost *>
ServerName myhost.com
ErrorLog /var/log/apache2/error-myhost.log
CustomLog /var/log/apache2/access-myhost.log common
DocumentRoot "/var/www"
WSGIScriptAlias / /path/to/my/wsgi.py
</VirtualHost>
Hopefully this should at least help you set up mod_wsgi so you can see if it makes a difference.
Answer by Richard Levasseur
Caches: make sure they're being flushed. It's easy for something to land in a cache but never be GC'd because the cache still holds a reference to it.
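One common way to keep an in-process cache from growing without limit is to cap its size and evict old entries. The sketch below is a small least-recently-used cache built on `collections.OrderedDict`; the `BoundedCache` class is a hypothetical example, not part of Django's cache framework.

```python
from collections import OrderedDict


class BoundedCache:
    """A small LRU cache that evicts the least recently used entry."""

    def __init__(self, max_entries):
        self.max_entries = max_entries
        self._data = OrderedDict()

    def get(self, key, default=None):
        if key not in self._data:
            return default
        self._data.move_to_end(key)  # mark as most recently used
        return self._data[key]

    def set(self, key, value):
        self._data[key] = value
        self._data.move_to_end(key)
        while len(self._data) > self.max_entries:
            self._data.popitem(last=False)  # drop the oldest entry

    def __len__(self):
        return len(self._data)


cache = BoundedCache(max_entries=3)
for n in range(10):
    cache.set(n, "x" * 100)

print(len(cache))
print(cache.get(0))
print(cache.get(9) is not None)
```

Once an entry is evicted, the cache no longer references it, so the garbage collector can reclaim it as soon as nothing else does. For caching function results, the standard library's `functools.lru_cache` with a `maxsize` argument offers the same bounded behavior.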
Swig'd code: Make sure any memory management is being done correctly. It's really easy to miss these issues in Python, especially with third-party libraries.
Monitoring: If you can, get data about memory usage and hits. Usually you'll see a correlation between a certain type of request and memory usage.
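A rough way to gather that kind of data is a small WSGI middleware that records how much the process's peak memory grew while each request path was handled. This is a sketch under several assumptions: `MemoryLogMiddleware` and `demo_app` are hypothetical names, the `resource` module is Unix-only, and `ru_maxrss` is reported in kilobytes on Linux but bytes on macOS. It also only measures the application call itself, not the time spent streaming the response body.

```python
import resource
from collections import defaultdict


class MemoryLogMiddleware:
    """WSGI middleware that records peak-RSS growth per request path."""

    def __init__(self, app):
        self.app = app
        self.growth_by_path = defaultdict(int)

    @staticmethod
    def _peak_rss():
        # Peak resident set size of this process so far.
        return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss

    def __call__(self, environ, start_response):
        before = self._peak_rss()
        try:
            return self.app(environ, start_response)
        finally:
            path = environ.get("PATH_INFO", "")
            self.growth_by_path[path] += self._peak_rss() - before


def demo_app(environ, start_response):
    # Hypothetical application used only to exercise the middleware.
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello"]


wrapped = MemoryLogMiddleware(demo_app)
body = wrapped({"PATH_INFO": "/demo"}, lambda status, headers: None)
print(body, dict(wrapped.growth_by_path))
```

Because peak RSS never decreases, this only shows which paths push the high-water mark up. That is usually enough to spot the request type that correlates with growth, but it will not show memory being released.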
Answer by Emil Stenström
We stumbled over a bug in Django with big sitemaps (10,000 items). It seems Django tries to load them all into memory when generating the sitemap: http://code.djangoproject.com/ticket/11572. This effectively kills the Apache process when Google pays a visit to the site.