What is the max JDBC batch size?

Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use/share it, but you must attribute it to the original authors (not me): StackOverflow
Original question: http://stackoverflow.com/questions/20883964/

Asked by vlyalcin
I have a list, and that list is growing continuously. I add to the batch depending on the list size, and I forgot to put a limit on how much gets added before calling executeBatch.

The program has been running for hours, and I don't want to stop it, fix it, and start it again right now.
My questions: what decides the size of the batch I'm adding to? What is the maximum number of statements that executeBatch() can send in one call? How many times can I call addBatch() without calling executeBatch()?
Accepted answer by Craig Ringer
PgJDBC has some limitations regarding batches:
- All request values, and all results, must be accumulated in memory. This includes large blob/clob results. So free memory is the main limiting factor for batch size.
- Until PgJDBC 9.4 (not yet released at the time of writing), batches that return generated keys always do a round trip for every entry, so they're no better than individual statement executions.
- Even in 9.4, batches that return generated keys only offer a benefit if the generated values are size-limited. A single text, bytea, or unconstrained varchar field in the requested result will force the driver to do a round trip for every execution (see the sketch after this list).
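For reference, a minimal sketch of the generated-keys batch pattern those caveats apply to; the items table and its generated id column are assumptions for illustration:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.Statement;

public class GeneratedKeysBatch {
    public static void insertReturningKeys(Connection conn) throws Exception {
        // Hypothetical table; asking for generated keys is what triggers
        // the per-entry round trip described above (always before 9.4,
        // and in 9.4 whenever a returned column is unbounded in size).
        try (PreparedStatement ps = conn.prepareStatement(
                "INSERT INTO items (name) VALUES (?)",
                Statement.RETURN_GENERATED_KEYS)) {
            for (String name : new String[] { "a", "b", "c" }) {
                ps.setString(1, name);
                ps.addBatch();
            }
            ps.executeBatch();
            try (ResultSet keys = ps.getGeneratedKeys()) {
                while (keys.next()) {
                    System.out.println("generated id: " + keys.getLong(1));
                }
            }
        }
    }
}
```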
The benefit of batching is a reduction in network round trips, so there's much less point if your DB is local to your app server. There are diminishing returns with increasing batch size, because the total time spent in network waits falls off quickly, so it's often not worth stressing about making batches as big as possible.
If you're bulk-loading data, seriously consider using the COPY API instead, via PgJDBC's CopyManager, obtained via the PGConnection interface. It lets you stream CSV-like data to the server for rapid bulk-loading with very few client/server round trips. Unfortunately, it's remarkably under-documented - it doesn't appear in the main PgJDBC docs at all, only in the API docs.
Answered by asafm
AFAIK there is no limit besides the memory issue. Regarding your question: the statements are sent to the DB only on executeBatch, so until you execute the batch the memory will keep growing, until either you get a Java heap space error or the batch is sent to the DB.
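A minimal sketch of the usual workaround, flushing every N rows so the pending batch can never grow without bound; the items(name) table and the 1000-row threshold are assumptions for illustration:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.util.List;

public class BatchInsertExample {
    private static final int BATCH_SIZE = 1000; // assumed flush threshold

    public static void insertAll(Connection conn, List<String> names) throws Exception {
        // Table and column names are hypothetical placeholders.
        try (PreparedStatement ps =
                conn.prepareStatement("INSERT INTO items (name) VALUES (?)")) {
            int count = 0;
            for (String name : names) {
                ps.setString(1, name);
                ps.addBatch();
                // Flush every BATCH_SIZE rows so pending statements
                // never accumulate until the heap is exhausted.
                if (++count % BATCH_SIZE == 0) {
                    ps.executeBatch();
                }
            }
            ps.executeBatch(); // flush the remainder
        }
    }
}
```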
Answered by Martin
There may be a maximum number of parameter markers, depending on the JDBC implementation.
For instance, the PostgreSQL driver represents the number of parameters as a 2-byte signed integer, which in Java means at most 32767.
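The marker limit matters most when many placeholders end up in a single statement, e.g. a multi-row INSERT. A worked sketch of the arithmetic, assuming a hypothetical 10-column table:

```java
public class ParameterMarkerLimit {
    // PgJDBC encodes the parameter count as a 2-byte signed integer,
    // so a single statement can carry at most 32767 placeholders.
    private static final int MAX_PARAMS = 32767;

    public static void main(String[] args) {
        int columns = 10; // hypothetical table width
        // Largest multi-row INSERT that stays under the marker limit:
        int maxRowsPerStatement = MAX_PARAMS / columns;
        System.out.println(maxRowsPerStatement); // prints 3276
    }
}
```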