Python: best way to iterate through all rows of a database table
Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same license, cite the original URL, and attribute it to the original authors (not me): StackOverflow
Original question: http://stackoverflow.com/questions/3785294/
Best way to iterate through all rows in a DB-table
Asked by OemerA
I often write little Python scripts to iterate through all rows of a DB table, for example to send an email to all subscribers.
I do it like this:
import MySQLdb

conn = MySQLdb.connect(host=hst, user=usr, passwd=pw, db=db)
cursor = conn.cursor()
# execute() returns the number of rows, not the rows, so iterate the cursor itself
cursor.execute("SELECT * FROM tbl_subscriber;")
for subscriber in cursor:
    ...
conn.close()
I wonder if there is a better way to do this, because my code may load thousands of rows into memory.
I thought it could be done better with LIMIT, maybe something like this:
"SELECT * FROM tbl_subscriber LIMIT %d,%d;" % (actualLimit,steps)
What's the best way to do it? How would you do it?
Accepted answer by aaronasterling
Unless you have BLOBs in there, thousands of rows shouldn't be a problem. Do you know that it is?
Also, why bring shame on yourself and your entire family by doing something like
"SELECT * FROM tbl_subscriber LIMIT %d,%d;" % (actualLimit,steps)
when the cursor will make the substitution for you in a manner that avoids SQL injection?
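# MySQLdb expects %s for every parameter, numbers included, because values are converted to SQL literals first
c.execute("SELECT * FROM tbl_subscriber LIMIT %s,%s;", (actualLimit, steps))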
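As a runnable illustration of the same idea, here is a minimal sketch of LIMIT paging with bound parameters. It makes several assumptions that are not in the thread: it uses the standard library's sqlite3 as a stand-in database (so the placeholder is `?` rather than MySQLdb's `%s`), and the helper `iter_pages`, the column names, and the sample data are hypothetical.

```python
import sqlite3

# Stand-in database: sqlite3 is used only so the sketch runs anywhere.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl_subscriber (id INTEGER PRIMARY KEY, email TEXT)")
conn.executemany(
    "INSERT INTO tbl_subscriber (email) VALUES (?)",
    [("user%d@example.com" % i,) for i in range(25)],
)


def iter_pages(conn, page_size):
    """Yield lists of rows, one LIMIT/OFFSET page at a time (hypothetical helper)."""
    offset = 0
    while True:
        # The values travel separately from the SQL text; nothing is
        # formatted into the query string by hand.
        rows = conn.execute(
            "SELECT id, email FROM tbl_subscriber ORDER BY id LIMIT ? OFFSET ?",
            (page_size, offset),
        ).fetchall()
        if not rows:
            break
        yield rows
        offset += page_size


page_sizes = [len(page) for page in iter_pages(conn, 10)]
print(page_sizes)
```

The ORDER BY matters here: without a stable ordering, consecutive pages are not guaranteed to be disjoint.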
Answer by Katalonis
First of all, maybe you don't need SELECT * FROM ...
Maybe it's enough to fetch only what you need, something like: "SELECT email FROM ..."
That would reduce memory usage anyway :)
Answer by Björn Pollex
Do you have actual memory problems? When iterating over a cursor, results are fetched one at a time (your DB-API implementation might decide to prefetch results, but then it might offer a function to set the number of prefetched results).
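To make this concrete, here is a small sketch of iterating a cursor and of the DB-API `arraysize` attribute. It assumes the standard library's sqlite3 as a stand-in driver, and the table contents are made up. Note that `arraysize` is only guaranteed to set the default batch size of `fetchmany()`; whether a given driver also uses it for prefetching is driver-specific.

```python
import sqlite3

# Stand-in database with a handful of hypothetical subscribers.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl_subscriber (email TEXT)")
conn.executemany(
    "INSERT INTO tbl_subscriber VALUES (?)",
    [("user%d@example.com" % i,) for i in range(5)],
)

cursor = conn.cursor()
cursor.execute("SELECT email FROM tbl_subscriber")

# DB-API attribute: the number of rows fetchmany() returns by default.
cursor.arraysize = 2
first_batch = cursor.fetchmany()

# Iterating the cursor continues from where fetchmany() stopped,
# handing over one row per loop step.
remaining = [row[0] for row in cursor]

print(len(first_batch), len(remaining))
conn.close()
```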
Answer by Andrew
Most MySQL connectors based on libmysqlclient will buffer all the results in client memory by default for performance reasons (with the assumption you won't be reading large resultsets).
When you do need to read a large result in MySQLdb, you can use an SSCursor to avoid buffering the entire resultset.
http://mysql-python.sourceforge.net/MySQLdb.html#using-and-extending
SSCursor - A "server-side" cursor. Like Cursor but uses CursorUseResultMixIn. Use only if you are dealing with potentially large result sets.
This does introduce complications that you must be careful about. If you don't read all the results from the cursor, a second query will raise a ProgrammingError:
>>> import MySQLdb
>>> import MySQLdb.cursors
>>> conn = MySQLdb.connect(read_default_file='~/.my.cnf')
>>> curs = conn.cursor(MySQLdb.cursors.SSCursor)
>>> curs.execute('SELECT * FROM big_table')
18446744073709551615L
>>> curs.fetchone()
(1L, '2c57b425f0de896fcf5b2e2f28c93f66')
>>> curs.execute('SELECT NOW()')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib64/python2.6/site-packages/MySQLdb/cursors.py", line 173, in execute
    self.errorhandler(self, exc, value)
  File "/usr/lib64/python2.6/site-packages/MySQLdb/connections.py", line 36, in defaulterrorhandler
    raise errorclass, errorvalue
_mysql_exceptions.ProgrammingError: (2014, "Commands out of sync; you can't run this command now")
This means you always have to read everything from the cursor (and potentially multiple resultsets) before issuing another query. MySQLdb won't do this for you.
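One way to follow that rule is a small helper that discards whatever is still pending before the next query. The sketch below is only an illustration: `drain` is a hypothetical name, and the demo runs against the standard library's sqlite3 (which never raises the out-of-sync error) purely so that it is runnable. It relies only on the DB-API methods `fetchmany()` and the optional `nextset()`.

```python
import sqlite3


def drain(cursor):
    """Read and discard everything still pending on a DB-API cursor.

    Returns the number of rows thrown away. Hypothetical helper.
    """
    discarded = 0
    while True:
        # Discard the rest of the current result set in modest batches.
        while True:
            rows = cursor.fetchmany(1000)
            if not rows:
                break
            discarded += len(rows)
        # nextset() is optional in the DB-API; stop if the driver lacks it
        # or if it reports that no further result set exists.
        nextset = getattr(cursor, "nextset", None)
        if nextset is None or not nextset():
            break
    return discarded


# Demo with a stand-in table of ten rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE big_table (id INTEGER PRIMARY KEY, digest TEXT)")
conn.executemany("INSERT INTO big_table (digest) VALUES (?)", [("x",)] * 10)

curs = conn.cursor()
curs.execute("SELECT * FROM big_table")
first = curs.fetchone()      # read only one row ...
discarded = drain(curs)      # ... then throw away the other nine
curs.execute("SELECT COUNT(*) FROM big_table")
total = curs.fetchone()[0]
print(first, discarded, total)
```

With an unbuffered cursor the discarded rows still cross the network, so stopping early is not free; a WHERE or LIMIT clause is cheaper when you know in advance that you need fewer rows.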
Answer by dugres
You don't have to modify the query; you can use the fetchmany method of cursors. Here is how I do it:
def fetchsome(cursor, some=1000):
    fetch = cursor.fetchmany
    while True:
        rows = fetch(some)
        if not rows: break
        for row in rows:
            yield row
This way you can "SELECT * FROM tbl_subscriber;" but you will only fetch some rows at a time.
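For completeness, here is how the generator might be used end to end. This is a sketch under stated assumptions: sqlite3 from the standard library stands in for MySQLdb, the 2500 sample addresses are invented, and the counter stands in for the actual email sending.

```python
import sqlite3


def fetchsome(cursor, some=1000):
    """Yield rows one by one, pulling them from the cursor `some` at a time."""
    while True:
        rows = cursor.fetchmany(some)
        if not rows:
            break
        for row in rows:
            yield row


# Stand-in database with hypothetical subscribers.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl_subscriber (email TEXT)")
conn.executemany(
    "INSERT INTO tbl_subscriber VALUES (?)",
    [("user%d@example.com" % i,) for i in range(2500)],
)

cursor = conn.cursor()
cursor.execute("SELECT email FROM tbl_subscriber;")

sent = 0
for (email,) in fetchsome(cursor, some=1000):
    sent += 1  # a real script would send the email here

conn.close()
print(sent)
```

The calling loop still sees one row at a time, so the batching stays an implementation detail of the generator. With a client-side buffering driver this limits only how many row objects Python holds at once, not what the driver itself has already buffered.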

