
Note: this page is a Chinese-English translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me). Original: http://stackoverflow.com/questions/29582736/

Date: 2020-08-19  Source: igfitidea

Python3 - Is there a way to iterate row by row over a very large SQlite table without loading the entire table into local memory?

Tags: python, sqlite

Asked by Marlon Dyck

I have a very large table with 250,000+ rows, many of which contain a large text block in one of the columns. Right now it's 2.7GB and expected to grow at least tenfold. I need to perform Python-specific operations on every row of the table, but I only need to access one row at a time.


Right now my code looks something like this:


c.execute('SELECT * FROM big_table')
table = c.fetchall()  # loads every row into memory at once
for row in table:
    do_stuff_with_row(row)

This worked fine when the table was smaller, but the table is now larger than my available RAM and Python hangs when I try to run it. Is there a better (more RAM-efficient) way to iterate row by row over the entire table?


Accepted answer by Martijn Pieters

cursor.fetchall() fetches all results into a list first.


Instead, you can iterate over the cursor itself:


c.execute('SELECT * FROM big_table')
for row in c:
    do_stuff_with_row(row)

This produces rows as needed, rather than loading them all first.

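A minimal runnable sketch of the accepted answer, using a small in-memory database to stand in for the real table (the table contents and the `do_stuff_with_row` placeholder are hypothetical, only for illustration):

```python
import sqlite3

# Build a small in-memory table to stand in for big_table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE big_table (id INTEGER PRIMARY KEY, body TEXT)")
conn.executemany(
    "INSERT INTO big_table (id, body) VALUES (?, ?)",
    [(i, f"text block {i}") for i in range(5)],
)

c = conn.cursor()
c.execute("SELECT * FROM big_table")

# Iterating over the cursor fetches rows lazily instead of
# materializing the whole result set, so memory use stays flat
# regardless of the table size.
processed = []
for row in c:
    processed.append(row[1].upper())  # stands in for do_stuff_with_row(row)

conn.close()
```

If you want more control over how many rows are pulled from the database at a time, `cursor.fetchmany(size)` in a loop is an equivalent batched alternative; iterating the cursor directly is usually the simplest option.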