Original source: http://stackoverflow.com/questions/288256/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me):
StackOverFlow
Fastest way for doing INSERTS using IBATIS
Asked by muriloq
I need to insert 20,000 rows into a single table (SQL Server 2005) using iBatis. What's the fastest way to do it? I'm already using batch mode, but it didn't help much:
try {
    sqlMap.startTransaction();
    sqlMap.startBatch();
    // ... execute statements in between
    sqlMap.commitTransaction();
} finally {
    sqlMap.endTransaction();
}
Answered by Will Hartung
Barring the bulk loaders others are referring to, let's consider how to best do it through SQL. (And the bulk loaders don't work well if you're sending intermixed data to different tables.)
First, you shouldn't be using whatever abstraction layer you're using, in this case iBatis. It offers you little value for a bulk load like this, and it carries some (not necessarily much, but some) CPU cost. You should really just use a raw database connection.
Next, you'll be sending in a mess of INSERT statements. The question is whether you should use a simple string for the statement (i.e. INSERT INTO TABLE1 VALUES('x','y', 12)) or a prepared statement (INSERT INTO TABLE1 VALUES(?, ?, ?)).
That will depend on your database and DB drivers.
The issue with using a simple string is basically the conversion cost from an internal format (assuming you're inserting Java data) to the string. Converting a number or date to a String is actually a reasonably expensive CPU operation. Some databases and drivers will work with the binary data directly, rather than simply the string data. So, in that case a PreparedStatement could net some CPU savings by potentially not having to convert the data.
The downside is that this factor will vary by DB vendor, and potentially even the JDBC vendor. For example, Postgres (I believe) only works with SQL strings, rather than binary, so using a PreparedStatement is a waste over simply building the string yourself.
Next, once you have your statement type, you want to use the addBatch() method of the JDBC Statement class. What addBatch does is group the SQL statements into, well, a batch. The benefit is that instead of sending several requests to the DB, you send a single LARGE request. This cuts down on network traffic, and will give some noticeable gains in throughput.
The catch is that not all drivers/databases support addBatch (at least not well), and the size of your batch is limited. You most likely can't addBatch all 20,000 rows and expect it to work, though that would be ideal. This limit can also vary by database.
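As a rough illustration of the batching described above, here is a minimal JDBC sketch that flushes the batch every N rows rather than queueing everything at once. The class name BatchInserter, the method insertAll, and the way rows are represented (one Object[] per row) are made up for this example. The right batch size depends on your driver and database, so treat it as something to measure, not a recommendation.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

/** Hypothetical helper: batched inserts over a raw JDBC connection. */
final class BatchInserter {
    private BatchInserter() {}

    /**
     * Inserts every row with a single PreparedStatement, sending the
     * accumulated batch to the database every batchSize rows.
     * Returns the number of executeBatch() calls that were made.
     */
    static int insertAll(Connection conn, String sql, List<Object[]> rows, int batchSize)
            throws SQLException {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int flushes = 0;
        boolean oldAutoCommit = conn.getAutoCommit();
        conn.setAutoCommit(false); // one transaction for the whole load
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            int pending = 0;
            for (Object[] row : rows) {
                for (int i = 0; i < row.length; i++) {
                    // JDBC parameters are 1-based; type mapping is left to the driver
                    ps.setObject(i + 1, row[i]);
                }
                ps.addBatch();
                if (++pending == batchSize) {
                    ps.executeBatch();
                    flushes++;
                    pending = 0;
                }
            }
            if (pending > 0) { // send the final, partially filled batch
                ps.executeBatch();
                flushes++;
            }
            conn.commit();
        } catch (SQLException e) {
            conn.rollback();
            throw e;
        } finally {
            conn.setAutoCommit(oldAutoCommit);
        }
        return flushes;
    }
}
```

With 20,000 rows and a batch size of 1,000 (an arbitrary starting point), this makes 20 executeBatch() calls. Whether each call really becomes a single round trip is up to the driver.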
For Oracle, in the past, I used a buffer of 64K. Basically I wrote a wrapper function that would take a literal INSERT statement, and accumulate them in 64K batches.
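The 64K wrapper idea might look roughly like the sketch below. This is not the original wrapper, only a guess at its shape: the class name InsertBuffer and the sink callback are hypothetical, and the limit is counted in characters rather than bytes to keep the example simple.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical buffer that groups literal INSERT statements into size-limited batches. */
final class InsertBuffer {
    private final int maxChars;
    private final Consumer<List<String>> sink;
    private final List<String> pending = new ArrayList<>();
    private int pendingChars = 0;

    /**
     * maxChars: approximate size limit per batch (65536 for a "64K" buffer).
     * sink: receives each completed batch, e.g. code that sends it over JDBC.
     */
    InsertBuffer(int maxChars, Consumer<List<String>> sink) {
        if (maxChars <= 0) {
            throw new IllegalArgumentException("maxChars must be positive");
        }
        this.maxChars = maxChars;
        this.sink = sink;
    }

    /** Queues one statement, flushing first if it would push the batch over the limit. */
    void add(String insertSql) {
        if (!pending.isEmpty() && pendingChars + insertSql.length() > maxChars) {
            flush();
        }
        pending.add(insertSql);
        pendingChars += insertSql.length();
    }

    /** Hands the current batch to the sink; does nothing if the buffer is empty. */
    void flush() {
        if (pending.isEmpty()) {
            return;
        }
        sink.accept(new ArrayList<>(pending)); // pass a copy so the sink may keep it
        pending.clear();
        pendingChars = 0;
    }
}
```

In real use the sink would typically call Statement.addBatch(sql) for each statement and then executeBatch(), wrapping SQLException in an unchecked exception because Consumer cannot throw checked ones. Remember to call flush() once more after the last row. Since these are literal SQL strings, only build them from trusted, properly escaped values.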
So, if you want to bulk insert data through SQL via JDBC, those are the ways to do it. The big improvement comes from batch mode. Statement vs PreparedStatement is more about potentially conserving some CPU, and maybe network traffic if your driver supports a binary protocol.
Test, rinse, and repeat until you're happy enough.
Answered by benPearce
Although this is not specific to your db server, I have previously had success writing the rows out to a local file in csv format and then having the database import the file. This was considerably faster than insert statements or even a batch insert.
Answered by Joel Coehoorn
In SQL Server, the fastest way to insert records in batch is using BULK INSERT. However, this method loads the records from a text file rather than directly from your application.
That also doesn't take into account the time spent creating the file. You will have to weigh whether that offsets any speed gains from the actual insert. Keep in mind that even if this is a little slower overall, you'll end up tying up your database server for less time.
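To make the file-based route a bit more concrete, here is a small sketch of the application side: write the rows to a delimited text file, then build the BULK INSERT statement that points at it. The names BulkFileWriter, writeRows, and bulkInsertSql are invented for this example. It assumes tab-separated fields with CRLF line endings (which, as far as I know, match BULK INSERT's default terminators), values that contain no tabs or line breaks, and a file path that the SQL Server machine itself can read.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

/** Hypothetical helper for preparing a file that BULK INSERT can load. */
final class BulkFileWriter {
    private BulkFileWriter() {}

    /** Writes one line per row: fields joined by tabs, rows ended by CRLF. */
    static void writeRows(Path file, List<String[]> rows) throws IOException {
        // Assumption: values are plain ASCII; otherwise use the charset your server expects.
        try (BufferedWriter out = Files.newBufferedWriter(file, StandardCharsets.US_ASCII)) {
            for (String[] row : rows) {
                out.write(String.join("\t", row));
                out.write("\r\n");
            }
        }
    }

    /**
     * Builds the T-SQL statement, relying on the default field and row terminators.
     * The table name must be a trusted identifier; the path is as seen by the server.
     */
    static String bulkInsertSql(String table, String serverVisiblePath) {
        // Single quotes inside a T-SQL string literal are escaped by doubling them.
        return "BULK INSERT " + table + " FROM '" + serverVisiblePath.replace("'", "''") + "'";
    }
}
```

The resulting statement can be run through an ordinary JDBC Statement. Timing both writeRows and the BULK INSERT itself gives the total cost that needs to be compared against the batched inserts.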
The only other thing you might try is inserting (staging) the batch into a completely different table (with no indexes or anything). Then move the records from that staging table to your target table and drop the staging table. This would move the data to the server first, so that the final insert could all happen within SQL Server itself. But again: it's a two-step process, so you'll have to count the time for both steps.
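The second step of the staging approach could be sketched like this, assuming the staging table has already been loaded and has exactly the same column layout as the target. The class and method names are hypothetical, and both table names must be trusted identifiers because they are concatenated into the SQL.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

/** Hypothetical helper for the "move from staging, then drop" step. */
final class StagingMove {
    private StagingMove() {}

    /**
     * Copies all rows from the staging table into the target table on the
     * server side, then drops the staging table.
     * Returns the row count reported by the INSERT ... SELECT.
     */
    static int moveAndDrop(Connection conn, String stagingTable, String targetTable)
            throws SQLException {
        try (Statement st = conn.createStatement()) {
            // Assumption: identical column order in both tables, so SELECT * lines up.
            int moved = st.executeUpdate(
                    "INSERT INTO " + targetTable + " SELECT * FROM " + stagingTable);
            st.executeUpdate("DROP TABLE " + stagingTable);
            return moved;
        }
    }
}
```

Transaction handling is left to the caller here. As the answer says, time the load into the staging table and this move together when comparing against a direct insert.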
Answered by S.Lott
Bulk inserts are best done using the database's own bulk loader tools. For Oracle, it's SQL*Loader, for example. Often these are faster than anything you could ever write.

