Python Django + FastCGI - randomly raising OperationalError

Question

Asked by ibz

I'm running a Django application. Had it under Apache + mod_python before, and it was all OK. Switched to Lighttpd + FastCGI. Now I randomly get the following exception (neither the place nor the time where it appears seem to be predictable). Since it's random, and it appears only after switching to FastCGI, I assume it has something to do with some settings.

Found a few results when googling, but they seem to be related to setting maxrequests=1. However, I use the default, which is 0.

Any ideas where to look?

PS. I'm using PostgreSQL. Might be related to that as well, since the exception appears when making a database query.

 File "/usr/lib/python2.6/site-packages/django/core/handlers/base.py", line 86, in get_response
   response = callback(request, *callback_args, **callback_kwargs)

 File "/usr/lib/python2.6/site-packages/django/contrib/admin/sites.py", line 140, in root
   if not self.has_permission(request):

 File "/usr/lib/python2.6/site-packages/django/contrib/admin/sites.py", line 99, in has_permission
   return request.user.is_authenticated() and request.user.is_staff

 File "/usr/lib/python2.6/site-packages/django/contrib/auth/middleware.py", line 5, in __get__
   request._cached_user = get_user(request)

 File "/usr/lib/python2.6/site-packages/django/contrib/auth/__init__.py", line 83, in get_user
   user_id = request.session[SESSION_KEY]

 File "/usr/lib/python2.6/site-packages/django/contrib/sessions/backends/base.py", line 46, in __getitem__
   return self._session[key]

 File "/usr/lib/python2.6/site-packages/django/contrib/sessions/backends/base.py", line 172, in _get_session
   self._session_cache = self.load()

 File "/usr/lib/python2.6/site-packages/django/contrib/sessions/backends/db.py", line 16, in load
   expire_date__gt=datetime.datetime.now()

 File "/usr/lib/python2.6/site-packages/django/db/models/manager.py", line 93, in get
   return self.get_query_set().get(*args, **kwargs)

 File "/usr/lib/python2.6/site-packages/django/db/models/query.py", line 304, in get
   num = len(clone)

 File "/usr/lib/python2.6/site-packages/django/db/models/query.py", line 160, in __len__
   self._result_cache = list(self.iterator())

 File "/usr/lib/python2.6/site-packages/django/db/models/query.py", line 275, in iterator
   for row in self.query.results_iter():

 File "/usr/lib/python2.6/site-packages/django/db/models/sql/query.py", line 206, in results_iter
   for rows in self.execute_sql(MULTI):

 File "/usr/lib/python2.6/site-packages/django/db/models/sql/query.py", line 1734, in execute_sql
   cursor.execute(sql, params)

OperationalError: server closed the connection unexpectedly
       This probably means the server terminated abnormally
       before or while processing the request.

Answer 1

Accepted answer by ibz

In the end I switched back to Apache + mod_python (I was having other random errors with fcgi, besides this one) and everything is good and stable now.

The question still remains open. In case anybody has this problem in the future and solves it they can record the solution here for future reference. :)

Answer 2

Answered by hcalves

Possible solution: http://groups.google.com/group/django-users/browse_thread/thread/2c7421cdb9b99e48

Until recently I was curious to test this on Django 1.1.1. Would this exception be thrown again? Surprise, there it was again. It took me some time to debug this; the helpful hint was that it only shows up when (pre)forking. So for those who randomly get these exceptions, I can say... fix your code :)

OK, seriously, there are always a few ways of doing this, so let me first explain where the problem is. If you access the database while any of your modules is being imported (e.g. reading configuration from the database), then you will get this error. When your fastcgi-prefork application starts, it first imports all modules, and only after that forks the children. If you have established a DB connection during import, all child processes will have an exact copy of that object. This connection is closed at the end of the request phase (request_finished signal). So the first child that is called to process a request will close this connection. But what happens to the rest of the child processes? They will believe that they have an open and presumably working connection to the DB, so any DB operation will cause an exception.

Why does this not show up in the threaded execution model? I suppose because threads use the same object and know when any other thread closes the connection.

How to fix this? The best way is to fix your code... but this can be difficult sometimes. Another option, in my opinion quite clean, is to put a small piece of code somewhere in your application:

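from django.db import connection
from django.core import signals

# Drop any connection inherited from the parent process before each request,
# so the first query in the request opens a fresh one.
def close_connection(**kwargs):
    connection.close()

signals.request_started.connect(close_connection)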

Not ideal, though: connecting twice to the DB is a workaround at best.
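For illustration only, here is a rough sketch of a different way to avoid reusing a connection inherited across fork(): remember which process opened it and reconnect when the PID differs. ForkAwareConnection and its connect argument are hypothetical names, not part of Django or psycopg2, and the sketch assumes connect is any zero-argument callable that returns a new connection.

```python
import os


class ForkAwareConnection:
    """Hypothetical holder that reopens its connection after a fork()."""

    def __init__(self, connect):
        # `connect` is assumed to be a zero-argument callable returning a
        # new connection object (for example a wrapper around the driver).
        self._connect = connect
        self._conn = None
        self._owner_pid = None

    def get(self):
        pid = os.getpid()
        if self._conn is None or self._owner_pid != pid:
            # Either nothing is open yet, or the handle was opened by another
            # process (the parent, before forking). The inherited handle is
            # deliberately not closed here, since closing it could end the
            # server session that other processes still share.
            self._conn = self._connect()
            self._owner_pid = pid
        return self._conn
```

Each child then pays for one extra connect on its first query rather than on every request, at the cost of leaving the inherited handle unused.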

Possible solution: using connection pooling (pgpool, pgbouncer), so you have DB connections pooled and stable, and handed fast to your FCGI daemons.

The problem is that this triggers another bug: psycopg2 raises an InterfaceError because it is trying to disconnect twice (pgbouncer already handled this).

Now the culprit is the Django signal request_finished triggering connection.close(), which fails loudly even if the connection was already closed. I don't think this behavior is desired: if the request has already finished, we don't care about the DB connection anymore. A patch to correct this should be simple.

The relevant traceback:

 /usr/local/lib/python2.6/dist-packages/Django-1.1.1-py2.6.egg/django/core/handlers/wsgi.py in __call__(self=<django.core.handlers.wsgi.WSGIHandler object at 0x24fb210>, environ={'AUTH_TYPE': 'Basic', 'DOCUMENT_ROOT': '/storage/test', 'GATEWAY_INTERFACE': 'CGI/1.1', 'HTTPS': 'off', 'HTTP_ACCEPT': 'application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5', 'HTTP_ACCEPT_ENCODING': 'gzip, deflate', 'HTTP_AUTHORIZATION': 'Basic dGVzdGU6c3VjZXNzbw==', 'HTTP_CONNECTION': 'keep-alive', 'HTTP_COOKIE': '__utma=175602209.1371964931.1269354495.126938948...none); sessionid=a1990f0d8d32c78a285489586c510e8c', 'HTTP_HOST': 'www.rede-colibri.com', ...}, start_response=<function start_response at 0x24f87d0>)
  246                 response = self.apply_response_fixes(request, response)
  247         finally:
  248             signals.request_finished.send(sender=self.__class__)
  249 
  250         try:
global signals = <module 'django.core.signals' from '/usr/local/l.../Django-1.1.1-py2.6.egg/django/core/signals.pyc'>, signals.request_finished = <django.dispatch.dispatcher.Signal object at 0x1975710>, signals.request_finished.send = <bound method Signal.send of <django.dispatch.dispatcher.Signal object at 0x1975710>>, sender undefined, self = <django.core.handlers.wsgi.WSGIHandler object at 0x24fb210>, self.__class__ = <class 'django.core.handlers.wsgi.WSGIHandler'>
 /usr/local/lib/python2.6/dist-packages/Django-1.1.1-py2.6.egg/django/dispatch/dispatcher.py in send(self=<django.dispatch.dispatcher.Signal object at 0x1975710>, sender=<class 'django.core.handlers.wsgi.WSGIHandler'>, **named={})
  164 
  165         for receiver in self._live_receivers(_make_id(sender)):
  166             response = receiver(signal=self, sender=sender, **named)
  167             responses.append((receiver, response))
  168         return responses
response undefined, receiver = <function close_connection at 0x197b050>, signal undefined, self = <django.dispatch.dispatcher.Signal object at 0x1975710>, sender = <class 'django.core.handlers.wsgi.WSGIHandler'>, named = {}
 /usr/local/lib/python2.6/dist-packages/Django-1.1.1-py2.6.egg/django/db/__init__.py in close_connection(**kwargs={'sender': <class 'django.core.handlers.wsgi.WSGIHandler'>, 'signal': <django.dispatch.dispatcher.Signal object at 0x1975710>})
   63 # when a Django request is finished.
   64 def close_connection(**kwargs):
   65     connection.close()
   66 signals.request_finished.connect(close_connection)
   67 
global connection = <django.db.backends.postgresql_psycopg2.base.DatabaseWrapper object at 0x17b14c8>, connection.close = <bound method DatabaseWrapper.close of <django.d...ycopg2.base.DatabaseWrapper object at 0x17b14c8>>
 /usr/local/lib/python2.6/dist-packages/Django-1.1.1-py2.6.egg/django/db/backends/__init__.py in close(self=<django.db.backends.postgresql_psycopg2.base.DatabaseWrapper object at 0x17b14c8>)
   74     def close(self):
   75         if self.connection is not None:
   76             self.connection.close()
   77             self.connection = None
   78 
self = <django.db.backends.postgresql_psycopg2.base.DatabaseWrapper object at 0x17b14c8>, self.connection = <connection object at 0x1f80870; dsn: 'dbname=co...st=127.0.0.1 port=6432 user=postgres', closed: 2>, self.connection.close = <built-in method close of psycopg2._psycopg.connection object at 0x1f80870>

Exception handling here could add more leniency:

/usr/local/lib/python2.6/dist-packages/Django-1.1.1-py2.6.egg/django/db/__init__.py

   63 # when a Django request is finished.
   64 def close_connection(**kwargs):
   65     connection.close()
   66 signals.request_finished.connect(close_connection)
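As a sketch of what "more leniency" might look like, the helper below swallows only the error that means "already closed" and lets everything else propagate. close_quietly is a hypothetical name, and the InterfaceError class here is a local stand-in for the driver's own error class (psycopg2.InterfaceError in the traceback above), so that the example runs without a database.

```python
class InterfaceError(Exception):
    """Stand-in for the driver's error class; an assumption for this sketch."""


def close_quietly(connection, ignored=(InterfaceError,)):
    """Close `connection`, ignoring errors listed in `ignored`.

    Returns True if close() succeeded and False if an ignored error was
    raised, which here is taken to mean the connection was already closed.
    """
    try:
        connection.close()
        return True
    except ignored:
        return False
```

In a real project the ignored tuple would name the driver's exception, and the helper would be called from the request_finished receiver instead of calling connection.close() directly.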

Or it could be handled better in psycopg2, so that it does not raise a fatal error when all we are trying to do is disconnect and the connection is already closed:

/usr/local/lib/python2.6/dist-packages/Django-1.1.1-py2.6.egg/django/db/backends/__init__.py

   74     def close(self):
   75         if self.connection is not None:
   76             self.connection.close()
   77             self.connection = None

Other than that, I'm short on ideas.

Answer 3

Answered by Kalle

I'll try to answer this even though I'm not using Django but Pyramid as the framework. I had been running into this problem for a long time. The trouble was that it was really difficult to reproduce this error in tests. Anyway, I finally solved it by digging through the whole business of sessions, scoped sessions, session instances, engines, connections, etc. I found this:

http://docs.sqlalchemy.org/en/rel_0_7/core/pooling.html#disconnect-handling-pessimistic

This approach simply adds a listener to the engine's connection pool. In the listener, a static SELECT is issued against the database. If it fails, the pool tries to establish a new connection to the database before giving up entirely. Important: this happens before anything else is sent to the database, so the connection can be checked up front, which prevents the rest of your code from failing.
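The idea can be shown without SQLAlchemy. The following is only a minimal sketch using the standard-library sqlite3 module as a stand-in database: PingingPool is a hypothetical one-connection "pool" that runs SELECT 1 on checkout and reconnects if the ping fails. It is not the SQLAlchemy recipe itself, and a real pool would catch the error class of its own driver.

```python
import sqlite3


class PingingPool:
    """Hypothetical one-connection pool that pings before handing out."""

    def __init__(self, connect):
        self._connect = connect  # zero-argument callable returning a connection
        self._conn = None
        self.reconnects = 0      # how many times a new connection was opened

    def checkout(self):
        if self._conn is not None:
            try:
                # Cheap static query: succeeds only on a usable connection.
                self._conn.execute("SELECT 1")
                return self._conn
            except sqlite3.Error:
                # The connection is dead; discard it and open a new one.
                self._conn = None
        self._conn = self._connect()
        self.reconnects += 1
        return self._conn


pool = PingingPool(lambda: sqlite3.connect(":memory:"))
conn = pool.checkout()   # opens the first connection
conn.close()             # simulate the connection going away
conn = pool.checkout()   # ping fails, so a fresh connection is returned
```

The cost is one extra round trip per checkout, which is the trade-off the "pessimistic" label refers to.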

This is not a clean solution, since it doesn't fix the underlying error, but it works like a charm. Hope this helps someone.

Answer 4

Answered by robinsax

An applicable quote:

"2019 anyone?" - half of YouTube comments, circa 2019

If anyone is still dealing with this, make sure your app is "eagerly forking" such that your Python DB driver (psycopg2 for me) isn't sharing resources between processes.

I solved this issue on uWSGI by adding the lazy-apps = true option, which causes it to fork app processes right out of the gate, rather than waiting for copy-on-write. I imagine other WSGI / FastCGI hosts have similar options.

Answer 5

Answered by diclophis

In the switch, did you change PostgreSQL client/server versions?

I have seen similar problems with PHP + MySQL, and the culprit was an incompatibility between the client and server versions (even though they had the same major version!).

Answer 6

Answered by Peter Rowell

Smells like a possible threading problem. Django is not guaranteed thread-safe, although the in-file docs seem to indicate that Django/FCGI can be run that way. Try running with prefork and then beat the crap out of the server. If the problem goes away ...

Answer 7

Answered by cheeming

Maybe the PYTHONPATH and PATH environment variables are different between the two setups (Apache + mod_python and lighttpd + FastCGI).

Answer 8

Answered by Matt

I fixed a similar issue when using a geodjango model that was not using the default ORM for one of its functions. When I added a line to manually close the connection the error went away.

http://code.djangoproject.com/ticket/9437

However, I still see the error randomly (~50% of requests) when doing things with user login/sessions.

Answer 9

Answered by shanyu

I went through the same problem recently (lighttpd, FastCGI & PostgreSQL). I searched for a solution for days without success, and as a last resort switched to MySQL. The problem is gone.

Answer 10

Answered by slav0nic

Why not store sessions in the cache? Set

SESSION_ENGINE = "django.contrib.sessions.backends.cache"

You can also try using postgres with pgbouncer (postgres is a prefork server and doesn't like many connects/disconnects in a short time), but first check your postgresql.log.

Another possibility: you have many records in the session table, and django-admin.py cleanup can help.
