Python: Aggregating save()s in Django?

Notice: this page is adapted from a popular Stack Overflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original Stack Overflow URL: http://stackoverflow.com/questions/3395236/

Time: 2020-08-18 10:49:49  Source: igfitidea

Aggregating save()s in Django?

Tags: python, sql, django, sqlite

Asked by kdt

I'm using Django with an sqlite backend, and write performance is a problem. I may graduate to a "proper" db at some stage, but for the moment I'm stuck with sqlite. I think that my write performance problems are probably related to the fact that I'm creating a large number of rows, and presumably each time I save() one it's locking, unlocking and syncing the DB on disk.


How can I aggregate a large number of save() calls into a single database operation?


Accepted answer by JudoWill

EDITED: commit_on_success is deprecated and was removed in Django 1.8. Use transaction.atomic instead. See Fraser Harris's answer.


Actually, this is easier to do than you think. You can use transactions in Django. These batch database operations (specifically save, insert, and delete) into one operation. I've found the easiest one to use is commit_on_success. Essentially, you wrap your database save operations in a function and then apply the commit_on_success decorator.


from django.db.transaction import commit_on_success

@commit_on_success
def lot_of_saves(queryset):
    for item in queryset:
        modify_item(item)
        item.save()

This will give a huge speed increase. You'll also get the benefit of rollbacks if any of the items fail. If you have millions of save operations, you may have to commit them in blocks using commit_manually and transaction.commit(), but I've rarely needed that.
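Since commit_manually was removed in Django 1.8, the "commit in blocks" idea can be approximated with one transaction.atomic block per chunk. The following is only a sketch under assumptions: the names chunked and save_in_blocks and the atomic parameter are illustrative and do not come from the original answer. The parameter defaults to Django's transaction.atomic and exists mainly so the helper can be exercised without a database.

```python
from itertools import islice


def chunked(iterable, size):
    """Yield successive lists of at most `size` items from `iterable`."""
    it = iter(iterable)
    while True:
        block = list(islice(it, size))
        if not block:
            return
        yield block


def save_in_blocks(items, block_size=1000, atomic=None):
    """Call save() on every item, committing once per block.

    `atomic` is a hypothetical hook: any callable returning a context
    manager. When omitted, Django's transaction.atomic is used.
    """
    if atomic is None:
        # Imported lazily so the helper itself has no hard Django dependency.
        from django.db import transaction
        atomic = transaction.atomic

    saved = 0
    for block in chunked(items, block_size):
        # One transaction (and therefore one commit) per block.
        with atomic():
            for item in block:
                item.save()
                saved += 1
    return saved
```

With Django configured, something like save_in_blocks(queryset.iterator(), block_size=5000) would commit once per 5000 saves, and a failure would roll back only the block in progress rather than everything saved so far.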


Hope that helps,


Will


Answer by S.Lott

"How can I aggregate a large number of save() calls into a single database operation?"


You don't need to. Django already manages a cache for you. You can't improve its DB caching by fussing around with saves.


"write performance problems are probably related to the fact that I'm creating a large number of rows"


Correct.


SQLite is pretty slow. That's the way it is. Queries are faster than in most other DBs. Writes are pretty slow.
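Much of SQLite's write cost is paid per commit rather than per row, which can be observed with the standard sqlite3 module alone, outside Django. The sketch below is an illustration under assumptions: the item table and the insert_rows helper are hypothetical, and the timings it prints depend heavily on the disk and filesystem.

```python
import os
import sqlite3
import tempfile
import time


def insert_rows(path, n, one_transaction):
    """Insert n rows, committing per row or once for all rows."""
    # isolation_level=None puts the connection in autocommit mode, so each
    # INSERT is its own transaction unless we issue BEGIN explicitly.
    conn = sqlite3.connect(path, isolation_level=None)
    conn.execute("CREATE TABLE IF NOT EXISTS item (name TEXT)")

    start = time.perf_counter()
    if one_transaction:
        conn.execute("BEGIN")
    for i in range(n):
        conn.execute("INSERT INTO item (name) VALUES (?)", (f"row {i}",))
    if one_transaction:
        conn.execute("COMMIT")
    elapsed = time.perf_counter() - start

    count = conn.execute("SELECT COUNT(*) FROM item").fetchone()[0]
    conn.close()
    return count, elapsed


with tempfile.TemporaryDirectory() as tmp:
    per_row_count, per_row_time = insert_rows(os.path.join(tmp, "a.db"), 200, False)
    batched_count, batched_time = insert_rows(os.path.join(tmp, "b.db"), 200, True)

print(f"one commit per row: {per_row_count} rows in {per_row_time:.3f}s")
print(f"single transaction: {batched_count} rows in {batched_time:.3f}s")
```

Both runs store the same rows; only the number of commits differs, which is the same difference a transaction wrapper makes around a loop of save() calls.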


Consider a more serious architecture change. Are you loading rows during a web transaction (i.e., bulk uploading files and loading the DB from those files)?


If you're doing bulk loading inside a web transaction, stop. You need to do something smarter. Use celery or some other "batch" facility to do your loads in the background.


We try to limit ourselves to file validation in a web transaction and do the loads when the user isn't waiting for their page of HTML.


Answer by Fraser Harris

New as of Django 1.6 is atomic, a simple API to control DB transactions. Copied verbatim from the docs:


atomic is usable both as a decorator:


from django.db import transaction

@transaction.atomic
def viewfunc(request):
    # This code executes inside a transaction.
    do_stuff()

and as a context manager:


from django.db import transaction

def viewfunc(request):
    # This code executes in autocommit mode (Django's default).
    do_stuff()

    with transaction.atomic():
        # This code executes inside a transaction.
        do_more_stuff()

Legacy django.db.transaction functions autocommit(), commit_on_success(), and commit_manually() have been deprecated and will be removed in Django 1.8.


Answer by Chris Conlan

I think this is the method you are looking for: https://docs.djangoproject.com/en/dev/ref/models/querysets/#bulk-create


Code copied from the docs:


Entry.objects.bulk_create([
    Entry(headline='This is a test'),
    Entry(headline='This is only a test'),
])

In practice, this would look like:


my_entries = list()
for i in range(100):
    my_entries.append(Entry(headline='Headline #' + str(i)))

Entry.objects.bulk_create(my_entries)

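According to the docs, this normally executes a single query regardless of the size of the list (on SQLite the default batch size is chosen so that at most 999 variables are used per query, so a large list is split into several queries), which can't be said for the atomic decorator.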
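To illustrate what a batched multi-row insert under a 999-variable cap looks like at the SQL level, here is a rough stdlib-only sketch. It is not Django's implementation: the bulk_insert helper, the MAX_VARS constant, and the entry table are assumptions made for illustration, and 999 is used as a conservative per-statement parameter limit (newer SQLite builds may allow more).

```python
import sqlite3

MAX_VARS = 999  # assumed conservative cap on bound parameters per statement


def bulk_insert(conn, table, columns, rows):
    """Insert rows with multi-row INSERT statements; return statements used.

    `table` and `columns` are interpolated into the SQL text, so they must be
    trusted identifiers; only the row values are passed as parameters.
    """
    per_row = len(columns)
    batch = max(MAX_VARS // per_row, 1)  # rows that fit under the cap
    row_placeholder = "(" + ", ".join("?" * per_row) + ")"

    statements = 0
    for start in range(0, len(rows), batch):
        chunk = rows[start:start + batch]
        sql = (
            f"INSERT INTO {table} ({', '.join(columns)}) VALUES "
            + ", ".join([row_placeholder] * len(chunk))
        )
        # Flatten the chunk into one parameter list for the statement.
        params = [value for row in chunk for value in row]
        conn.execute(sql, params)
        statements += 1
    conn.commit()  # a single commit covers every batch
    return statements


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entry (headline TEXT)")
rows = [(f"Headline #{i}",) for i in range(2500)]
statements_used = bulk_insert(conn, "entry", ["headline"], rows)
row_count = conn.execute("SELECT COUNT(*) FROM entry").fetchone()[0]
```

With one column per row, 2500 rows are split into batches of 999, 999, and 502, so far fewer statements are issued than with one INSERT per row. In Django itself, the batch size can also be set explicitly through the batch_size argument of bulk_create.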


There is an important distinction to make. From the OP's question, it sounds like he is attempting to bulk create rather than bulk save. The atomic decorator is the fastest solution for saving, but not for creating.
