Efficient methods for batch inserts in Java using JDBC

Notice: this page reproduces a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license and attribute it to the original authors (not me), citing the original address and author information. Original StackOverflow question: http://stackoverflow.com/questions/3784197/

Time: 2020-08-14 04:57:24  Source: igfitidea

Efficient way to do batch INSERTS with JDBC

java, sql, performance, jdbc

Asked by Aayush Puri

In my app I need to do a lot of INSERTs. It's a Java app and I am using plain JDBC to execute the queries. The database is Oracle. I have enabled batching, so it saves me the network latency of executing each query separately. But the queries still execute serially as separate INSERTs:

insert into some_table (col1, col2) values (val1, val2)
insert into some_table (col1, col2) values (val3, val4)
insert into some_table (col1, col2) values (val5, val6)

I was wondering if the following form of INSERT might be more efficient:

insert into some_table (col1, col2) values (val1, val2), (val3, val4), (val5, val6)

i.e. collapsing multiple INSERTs into one.

Any other tips for making batch INSERTs faster?

Answered by Burleigh Bear

You'll have to benchmark, obviously, but over JDBC, issuing multiple inserts will be much faster if you use a PreparedStatement rather than a Statement.

Answered by Bozho

Statement gives you the following option:

Statement stmt = con.createStatement();

stmt.addBatch("INSERT INTO employees VALUES (1000, 'Joe Jones')");
stmt.addBatch("INSERT INTO departments VALUES (260, 'Shoe')");
stmt.addBatch("INSERT INTO emp_dept VALUES (1000, 260)");

// submit a batch of update commands for execution
int[] updateCounts = stmt.executeBatch();
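
The array returned by executeBatch() (or by BatchUpdateException.getUpdateCounts() when a command fails) holds one entry per batched command. An entry is either a row count or one of the constants Statement.SUCCESS_NO_INFO and Statement.EXECUTE_FAILED. A minimal sketch of reading that array follows; the class and method names are hypothetical and not part of the original answer.

```java
import java.sql.Statement;

final class BatchResultSummary {
    /** Returns {rows reported, commands with SUCCESS_NO_INFO, commands with EXECUTE_FAILED}. */
    static int[] summarize(int[] updateCounts) {
        int rows = 0;
        int noInfo = 0;
        int failed = 0;
        for (int count : updateCounts) {
            if (count == Statement.SUCCESS_NO_INFO) {
                noInfo++;            // command succeeded, row count unknown
            } else if (count == Statement.EXECUTE_FAILED) {
                failed++;            // command failed, driver kept processing
            } else {
                rows += count;       // ordinary update count
            }
        }
        return new int[] {rows, noInfo, failed};
    }

    public static void main(String[] args) {
        // Hand-written sample array, not output from a real database.
        int[] sample = {1, 1, Statement.SUCCESS_NO_INFO, Statement.EXECUTE_FAILED};
        int[] s = summarize(sample);
        System.out.println("rows=" + s[0] + " noInfo=" + s[1] + " failed=" + s[2]);
    }
}
```

Because a driver may report SUCCESS_NO_INFO instead of a count, summing the array is not always a reliable way to count inserted rows.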

Answered by Tusc

This is a mix of the two previous answers:

  PreparedStatement ps = c.prepareStatement("INSERT INTO employees VALUES (?, ?)");

  ps.setString(1, "John");
  ps.setString(2,"Doe");
  ps.addBatch();

  ps.clearParameters();
  ps.setString(1, "Dave");
  ps.setString(2,"Smith");
  ps.addBatch();

  ps.clearParameters();
  int[] results = ps.executeBatch();
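
For more than a handful of rows, the same pattern is usually written as a loop that sends the batch every so often instead of queuing everything before one executeBatch() call. The sketch below assumes each row is a two-element String[] matching the two placeholders above; the class name, method name and batchSize parameter are hypothetical.

```java
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

final class ChunkedBatchInsert {
    /**
     * Queues every row on the prepared statement and sends the batch each time
     * batchSize rows are pending. Returns how many times executeBatch() was called.
     */
    static int insertInChunks(PreparedStatement ps, List<String[]> rows, int batchSize)
            throws SQLException {
        int flushes = 0;
        int pending = 0;
        for (String[] row : rows) {
            ps.setString(1, row[0]);
            ps.setString(2, row[1]);
            ps.addBatch();
            pending++;
            if (pending == batchSize) {
                ps.executeBatch();   // send the queued rows to the database
                flushes++;
                pending = 0;
            }
        }
        if (pending > 0) {
            ps.executeBatch();       // send the final, partially filled batch
            flushes++;
        }
        return flushes;
    }
}
```

A caller would pass the statement prepared above, for example insertInChunks(ps, rows, 1000). The value 1000 is only an illustrative starting point; a suitable batch size depends on the driver and row width and is worth benchmarking.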

Answered by Farid

How about using the INSERT ALL statement?

INSERT ALL
    INTO table_name VALUES ()
    INTO table_name VALUES ()
    ...
SELECT Statement;

I remember that the final SELECT statement is mandatory for this request to succeed. I don't remember why, though. You might also consider using PreparedStatement instead. Lots of advantages!

Farid
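
On the mandatory SELECT: Oracle's INSERT ALL syntax requires a subquery, and the INTO clauses run once for each row that subquery returns, so a one-row query such as SELECT 1 FROM DUAL is the usual way to insert each listed row exactly once. The sketch below only builds the statement text with ? placeholders for use with a PreparedStatement; the class and method names are hypothetical, and the table and column names are taken from the question.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class InsertAllBuilder {
    /** Builds an Oracle-style INSERT ALL statement with rowCount INTO clauses. */
    static String build(String table, List<String> columns, int rowCount) {
        if (rowCount < 1) {
            throw new IllegalArgumentException("rowCount must be at least 1");
        }
        String columnList = String.join(", ", columns);
        String placeholders = String.join(", ", Collections.nCopies(columns.size(), "?"));
        StringBuilder sql = new StringBuilder("INSERT ALL");
        for (int i = 0; i < rowCount; i++) {
            sql.append(" INTO ").append(table)
               .append(" (").append(columnList).append(")")
               .append(" VALUES (").append(placeholders).append(")");
        }
        // The trailing one-row subquery makes each INTO clause run exactly once.
        sql.append(" SELECT 1 FROM DUAL");
        return sql.toString();
    }

    public static void main(String[] args) {
        System.out.println(build("some_table", Arrays.asList("col1", "col2"), 2));
    }
}
```

The table and column names are concatenated into the SQL text, so they must come from trusted code, never from user input; only the values go through placeholders.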

Answered by Mickey

Using PreparedStatements will be MUCH slower than Statements if you have few iterations. To gain a performance benefit from using a PreparedStatement over a Statement, you need to use it in a loop with at least 50 iterations.

Answered by PD Shah 5382

Batch insert using Statement

int a = 100;
try {
    for (int i = 0; i < 10; i++) {
        // No trailing semicolon inside the SQL text: some drivers reject it.
        String insert = "insert into usermaster (userid) values ('" + a + "')";
        statement.addBatch(insert);
        System.out.println(insert);
        a++;
    }
    // Send the queued statements to the database; without this call nothing is inserted.
    statement.executeBatch();
    dbConnection.commit();
} catch (SQLException e) {
    System.out.println(" Insert Failed");
    System.out.println(e.getMessage());
} finally {
    if (statement != null) {
        statement.close();
    }
    if (dbConnection != null) {
        dbConnection.close();
    }
}

Answered by user1454294

You can use addBatch and executeBatch for batch inserts in Java. See the example: Batch Insert In Java

Answered by prayagupd

Though the question asks about inserting efficiently into Oracle using JDBC, I'm currently playing with DB2 (on an IBM mainframe). Conceptually, inserting would be similar, so I thought it might be helpful to see my metrics for

  • inserting one record at a time

  • inserting a batch of records (very efficient)

Here are the metrics.

1) Inserting one record at a time

public void writeWithCompileQuery(int records) {
    PreparedStatement statement;

    try {
        Connection connection = getDatabaseConnection();
        connection.setAutoCommit(true);

        String compiledQuery = "INSERT INTO TESTDB.EMPLOYEE(EMPNO, EMPNM, DEPT, RANK, USERNAME)" +
                " VALUES" + "(?, ?, ?, ?, ?)";
        statement = connection.prepareStatement(compiledQuery);

        long start = System.currentTimeMillis();

        // Use <= so that exactly `records` rows are inserted, matching the average computed below.
        for(int index = 1; index <= records; index++) {
            statement.setInt(1, index);
            statement.setString(2, "emp number-"+index);
            statement.setInt(3, index);
            statement.setInt(4, index);
            statement.setString(5, "username");

            long startInternal = System.currentTimeMillis();
            statement.executeUpdate();
            System.out.println("each transaction time taken = " + (System.currentTimeMillis() - startInternal) + " ms");
        }

        long end = System.currentTimeMillis();
        System.out.println("total time taken = " + (end - start) + " ms");
        System.out.println("avg total time taken = " + (end - start)/ records + " ms");

        statement.close();
        connection.close();

    } catch (SQLException ex) {
        System.err.println("SQLException information");
        while (ex != null) {
            System.err.println("Error msg: " + ex.getMessage());
            ex = ex.getNextException();
        }
    }
}

The metrics for 100 transactions:

each transaction time taken = 123 ms
each transaction time taken = 53 ms
each transaction time taken = 48 ms
each transaction time taken = 48 ms
each transaction time taken = 49 ms
each transaction time taken = 49 ms
...
..
.
each transaction time taken = 49 ms
each transaction time taken = 49 ms
total time taken = 4935 ms
avg total time taken = 49 ms

The first transaction takes around 120-150 ms, which covers parsing the query and then executing it; subsequent transactions take only around 50 ms each. (That is still high, but my database is on a different server, and I need to troubleshoot the network.)

2) Inserting in a batch (the efficient way), achieved by preparedStatement.executeBatch()

public int[] writeInABatchWithCompiledQuery(int records) {
    PreparedStatement preparedStatement;

    try {
        Connection connection = getDatabaseConnection();
        connection.setAutoCommit(true);

        String compiledQuery = "INSERT INTO TESTDB.EMPLOYEE(EMPNO, EMPNM, DEPT, RANK, USERNAME)" +
                " VALUES" + "(?, ?, ?, ?, ?)";
        preparedStatement = connection.prepareStatement(compiledQuery);

        for(int index = 1; index <= records; index++) {
            preparedStatement.setInt(1, index);
            preparedStatement.setString(2, "empo number-"+index);
            preparedStatement.setInt(3, index+100);
            preparedStatement.setInt(4, index+200);
            preparedStatement.setString(5, "usernames");
            preparedStatement.addBatch();
        }

        long start = System.currentTimeMillis();
        int[] inserted = preparedStatement.executeBatch();
        long end = System.currentTimeMillis();

        System.out.println("total time taken to insert the batch = " + (end - start) + " ms");
        // Integer division: this is the average time per record, in whole milliseconds.
        System.out.println("avg time taken per record = " + (end - start)/records + " ms");

        preparedStatement.close();
        connection.close();

        return inserted;

    } catch (SQLException ex) {
        System.err.println("SQLException information");
        while (ex != null) {
            System.err.println("Error msg: " + ex.getMessage());
            ex = ex.getNextException();
        }
        throw new RuntimeException("Error");
    }
}

The metrics for a batch of 100 transactions are

total time taken to insert the batch = 127 ms

and for 1000 transactions

total time taken to insert the batch = 341 ms

So, inserting 100 records, which took ~5000 ms with one transaction at a time, drops to ~150 ms with a single batch of 100 records.
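
Working the reported numbers through: 4935 ms / 100 is about 49 ms per record one at a time, 127 ms / 100 is about 1.3 ms per record in a batch of 100, and 341 ms / 1000 is about 0.34 ms per record in a batch of 1000, so the batch of 100 is roughly 39 times faster than 100 single inserts. Note that the batch timing above covers only the executeBatch() call, not the loop that fills the batch. The small sketch below just repeats this arithmetic; the class and method names are hypothetical.

```java
import java.util.Locale;

final class BatchSpeedup {
    /** Average time per record, using floating-point division to keep the fraction. */
    static double perRecordMillis(long totalMillis, int records) {
        return (double) totalMillis / records;
    }

    /** How many times faster the second timing is than the first. */
    static double speedup(long slowMillis, long fastMillis) {
        return (double) slowMillis / fastMillis;
    }

    public static void main(String[] args) {
        // Timings copied from the measurements reported in this answer.
        System.out.printf(Locale.ROOT, "one at a time: %.2f ms per record%n", perRecordMillis(4935, 100));
        System.out.printf(Locale.ROOT, "batch of 100:  %.2f ms per record%n", perRecordMillis(127, 100));
        System.out.printf(Locale.ROOT, "batch of 1000: %.2f ms per record%n", perRecordMillis(341, 1000));
        System.out.printf(Locale.ROOT, "speedup for 100 records: %.1fx%n", speedup(4935, 127));
    }
}
```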

NOTE: My network is very slow, so ignore the absolute values; the relative difference between the two approaches is what matters.

Answered by user3211098

In my code I have no direct access to the 'preparedStatement', so I cannot use batching; I just pass it the query and a list of parameters. The trick, however, is to create a variable-length insert statement and a LinkedList of parameters. The effect is the same as in the top example, with a variable number of input parameters. See below (error checking omitted). Assume 'myTable' has 3 updatable fields: f1, f2 and f3.

String[] args = {"A", "B", "C", "X", "Y", "Z"}; // etc, input list of triplets
final String QUERY = "INSERT INTO [myTable] (f1,f2,f3) values ";
LinkedList<String> params = new LinkedList<>();
String comma = "";
// A StringBuilder must be constructed from the String; it cannot be assigned directly.
StringBuilder q = new StringBuilder(QUERY);
for (int nl = 0; nl < args.length; nl += 3) { // args is a list of triplet values
    params.add(args[nl]);
    params.add(args[nl + 1]);
    params.add(args[nl + 2]);
    q.append(comma).append("(?,?,?)"); // one placeholder group per row
    comma = ",";
}
int nr = insertIntoDB(q.toString(), params);

In my DBInterface class I have:

int insertIntoDB(String query, LinkedList<String> params) throws SQLException {
    preparedUPDStmt = connectionSQL.prepareStatement(query);
    int n=1;
    for(String x:params) {
        preparedUPDStmt.setString(n++, x);
    }
    int updates=preparedUPDStmt.executeUpdate();
    return updates;
}
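
For the two triplets in the sample input, the generated statement ends in (?,?,?),(?,?,?). The sketch below builds the same text for any column and row count so the result can be inspected without a database; the class and method names are hypothetical. Not every database accepts a multi-row VALUES list, and databases and drivers limit how many parameters one statement may carry, so both are worth checking for the target system.

```java
import java.util.Collections;

final class MultiRowInsert {
    /** Appends rowCount groups of columnsPerRow placeholders to the given prefix. */
    static String buildQuery(String prefix, int columnsPerRow, int rowCount) {
        String group = "(" + String.join(",", Collections.nCopies(columnsPerRow, "?")) + ")";
        return prefix + String.join(",", Collections.nCopies(rowCount, group));
    }

    /** Number of complete rows in a flat value array; rejects a ragged input. */
    static int rowCount(String[] values, int columnsPerRow) {
        if (values.length % columnsPerRow != 0) {
            throw new IllegalArgumentException("value count is not a multiple of " + columnsPerRow);
        }
        return values.length / columnsPerRow;
    }

    public static void main(String[] args) {
        String[] values = {"A", "B", "C", "X", "Y", "Z"};
        String sql = buildQuery("INSERT INTO [myTable] (f1,f2,f3) values ", 3, rowCount(values, 3));
        System.out.println(sql);
    }
}
```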

Answered by Alex Stanovsky

You can use the rewriteBatchedStatements parameter to make batch inserts even faster.

You can read about the parameter here: MySQL and JDBC with rewriteBatchedStatements=true
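
rewriteBatchedStatements is a connection property of the MySQL Connector/J driver, so it applies to MySQL rather than to the Oracle database in the question. With it enabled, the driver may rewrite a batch of INSERTs into multi-row statements before sending them. It can be set in the JDBC URL or in the connection Properties, as sketched below; the host, port, database name and credentials are placeholders, and the class and method names are hypothetical.

```java
import java.util.Properties;

final class RewriteBatchedConfig {
    /** Builds a MySQL JDBC URL with the property set in the query string. */
    static String url(String host, int port, String database) {
        return "jdbc:mysql://" + host + ":" + port + "/" + database
                + "?rewriteBatchedStatements=true";
    }

    /** Alternative: pass the property through the connection Properties. */
    static Properties properties(String user, String password) {
        Properties props = new Properties();
        props.setProperty("user", user);
        props.setProperty("password", password);
        props.setProperty("rewriteBatchedStatements", "true");
        return props;
    }

    public static void main(String[] args) {
        // Placeholder connection details for illustration only.
        System.out.println(url("localhost", 3306, "testdb"));
    }
}
```

Either form is then handed to DriverManager.getConnection, for example DriverManager.getConnection(url("localhost", 3306, "testdb"), "user", "password"). The batching code itself (addBatch and executeBatch) stays unchanged.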