Java: Spring Data save vs saveAll performance

Notice: this page reproduces a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/49869277/

Time: 2020-08-12 03:09:58  Source: igfitidea

Spring data save vs saveAll performance

Tags: java, spring, hibernate, spring-data, spring-data-jpa

Asked by Yottabyte

I'm trying to understand why saveAll has better performance than save in the Spring Data repositories. I'm using CrudRepository, which can be seen here.

To test, I created 10k entities, each with just an id and a random string (for the benchmark I kept the string constant), and added them to a list. Iterating over my list and calling .save on each element took 40 seconds. Calling .saveAll on the same entire list completed in 2 seconds. Calling .saveAll with even 30k elements took 4 seconds. I made sure to truncate my table before performing each test. Even batching the .saveAll calls into sublists of 50 took 10 seconds with 30k.

The simple .saveAll with the entire list seems to be the fastest.

I tried to browse the Spring Data source code, but this is the only thing I found of value. Here it seems .saveAll simply iterates over the entire Iterable and calls .save on each one, like I was doing. So how is it that much faster? Is it doing some transactional batching internally?

Accepted answer by Yazan Jaber

Without your code I have to guess, but I believe it has to do with the overhead of creating a new transaction for each object saved in the case of save, versus opening a single transaction in the case of saveAll.

Notice the definitions of save and saveAll: both are annotated with @Transactional. If your project is configured properly, which seems to be the case since entities are being saved to the database, a transaction is created whenever one of these methods is called. If you call save in a loop, a new transaction is created each time you call save, but with saveAll there is one call and therefore one transaction, regardless of the number of entities being saved.
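
To make the counting argument concrete, here is a minimal plain-Java sketch. It is a toy model and not Spring code: FakeTxManager, FakeRepository, and the helper methods are hypothetical names invented for illustration. The model assumes that save and saveAll each start a transaction only when none is already open, which is how I understand the default behaviour.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

/** Toy model (not Spring code): counts how many transactions get opened. */
class TransactionCountSketch {

    /** Hypothetical stand-in for a transaction manager. */
    static class FakeTxManager {
        int begun = 0;          // transactions started so far
        boolean active = false; // is a transaction currently open?

        /** Join the open transaction if there is one, otherwise begin a new one. */
        <R> R required(Supplier<R> work) {
            if (active) {
                return work.get();   // already inside a transaction
            }
            active = true;
            begun++;
            try {
                return work.get();
            } finally {
                active = false;      // stands in for commit or rollback
            }
        }
    }

    /** Hypothetical repository whose save and saveAll are both "transactional". */
    static class FakeRepository {
        final FakeTxManager tx;
        final List<String> table = new ArrayList<>();

        FakeRepository(FakeTxManager tx) { this.tx = tx; }

        String save(String entity) {
            return tx.required(() -> {
                table.add(entity);
                return entity;
            });
        }

        List<String> saveAll(List<String> entities) {
            return tx.required(() -> {
                List<String> result = new ArrayList<>();
                for (String e : entities) {
                    result.add(save(e)); // runs inside the transaction saveAll opened
                }
                return result;
            });
        }
    }

    static List<String> entities(int n) {
        List<String> list = new ArrayList<>();
        for (int i = 0; i < n; i++) list.add("entity-" + i);
        return list;
    }

    /** Calls save once per entity, as in the question's loop. */
    static int transactionsForLoop(int n) {
        FakeTxManager tx = new FakeTxManager();
        FakeRepository repo = new FakeRepository(tx);
        for (String e : entities(n)) repo.save(e);
        return tx.begun;
    }

    /** Calls saveAll once with the whole list. */
    static int transactionsForSaveAll(int n) {
        FakeTxManager tx = new FakeTxManager();
        FakeRepository repo = new FakeRepository(tx);
        repo.saveAll(entities(n));
        return tx.begun;
    }

    public static void main(String[] args) {
        System.out.println("loop of save: " + transactionsForLoop(10_000));
        System.out.println("single saveAll: " + transactionsForSaveAll(10_000));
    }
}
```

In this model the loop opens one transaction per entity while saveAll opens exactly one. The sketch only counts transactions; it says nothing about how expensive each commit is in a real database, which is where the measured time would actually go.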

I'm assuming that the test itself is not being run within a transaction. If it were, all calls to save would run within that transaction, since the default transaction propagation is Propagation.REQUIRED: if a transaction is already open, the calls run within it. If you're planning to use Spring Data, I strongly recommend that you read about transaction management in Spring.
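
The propagation point can be sketched the same way. Again this is a toy model in plain Java, not Spring code; required, save, and the two loop methods are hypothetical names. It only illustrates the join-or-begin rule of Propagation.REQUIRED as described above.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model (not Spring code): REQUIRED propagation with and without an outer transaction. */
class PropagationRequiredSketch {

    static int begun = 0;          // number of transactions started
    static boolean active = false; // whether a transaction is currently open
    static final List<Integer> table = new ArrayList<>();

    /** Mimics Propagation.REQUIRED: join an open transaction, otherwise begin one. */
    static void required(Runnable work) {
        if (active) {              // already inside a transaction: just run
            work.run();
            return;
        }
        active = true;
        begun++;
        try {
            work.run();
        } finally {
            active = false;        // stands in for commit or rollback
        }
    }

    /** Hypothetical stand-in for a repository's transactional save. */
    static void save(int entity) {
        required(() -> table.add(entity));
    }

    /** Loop with no surrounding transaction: each save begins its own. */
    static int loopWithoutOuterTransaction(int n) {
        begun = 0;
        table.clear();
        for (int i = 0; i < n; i++) save(i);
        return begun;
    }

    /** Same loop inside one outer transaction: every save joins it. */
    static int loopInsideOuterTransaction(int n) {
        begun = 0;
        table.clear();
        required(() -> {
            for (int i = 0; i < n; i++) save(i);
        });
        return begun;
    }

    public static void main(String[] args) {
        System.out.println("without outer transaction: " + loopWithoutOuterTransaction(100));
        System.out.println("inside outer transaction: " + loopInsideOuterTransaction(100));
    }
}
```

Under this model the first loop starts n transactions and the second starts one. In a real Spring application the outer transaction would typically come from a @Transactional method on the calling bean or from a TransactionTemplate, so if your benchmark loop ran inside one of those I would expect the gap between save and saveAll to shrink.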
