database: What does "bulk load" mean?

Note: this page is a Chinese-English parallel translation of a popular StackOverFlow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverFlow original: http://stackoverflow.com/questions/4462074/

Time: 2020-09-08 07:58:11  Source: igfitidea

What does "bulk load" mean?

database terminology data-warehouse bulk-load

Asked by Spredzy

Jumping from article to article, I see the expression "bulk loading" everywhere.

What does it really (technically) mean?

What does it imply?

Explanations based on use cases are welcome.

Answered by SingleNegationElimination

Indexes are usually optimized for inserting rows one at a time. When you are adding a great deal of data at once, inserting rows one at a time may be inefficient. For instance, with a B-Tree, the optimal way to insert a single key is a very poor way of adding a large batch of data to an empty index.

Instead, you pursue a different strategy with B-Trees: you presort all of the data and group it into blocks. You can then build a new B-Tree by transforming the blocks into tree nodes. Although both techniques have the same asymptotic performance, O(n log(n)), the bulk-load operation has a much smaller constant factor.
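
The presort, group, and build steps above can be sketched roughly as follows. This is a simplified illustration rather than a real database index: the names `Node`, `bulk_load`, `contains`, and `height` are hypothetical, each internal node stores the smallest key under each child (a B+-tree-like shortcut), and the last node on a level may be under-filled, which a production B-Tree would rebalance.

```python
from bisect import bisect_left, bisect_right
from dataclasses import dataclass, field
from typing import Iterable, List


@dataclass
class Node:
    # Leaf: the stored keys. Internal node: the smallest key under each child.
    keys: List[int]
    children: List["Node"] = field(default_factory=list)  # empty for a leaf


def bulk_load(keys: Iterable[int], block_size: int = 4) -> Node:
    """Build the tree bottom-up instead of inserting keys one at a time."""
    if block_size < 2:
        raise ValueError("block_size must be at least 2")
    ordered = sorted(keys)  # step 1: presort all of the data
    # Step 2: group the sorted keys into blocks; each block becomes a leaf.
    level = [Node(ordered[i:i + block_size])
             for i in range(0, len(ordered), block_size)]
    if not level:
        return Node([])  # empty input yields an empty leaf as the root
    # Step 3: repeatedly group nodes under parents until one root remains.
    while len(level) > 1:
        parents = []
        for i in range(0, len(level), block_size):
            group = level[i:i + block_size]
            parents.append(Node([child.keys[0] for child in group], group))
        level = parents
    return level[0]


def contains(node: Node, key: int) -> bool:
    """Descend from the root to a leaf, then search inside the leaf."""
    while node.children:
        idx = bisect_right(node.keys, key) - 1  # last child whose minimum <= key
        if idx < 0:
            return False  # key is smaller than everything in the tree
        node = node.children[idx]
    i = bisect_left(node.keys, key)
    return i < len(node.keys) and node.keys[i] == key


def height(node: Node) -> int:
    """Number of levels from the root down to the leaves."""
    levels = 1
    while node.children:
        node = node.children[0]
        levels += 1
    return levels
```

After the sort, every node is written exactly once and no node is ever split, which is one way to see where the smaller constant factor comes from.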

Answered by KevinDTimm

Bulk loading is a way to load data (typically into a database) in 'large chunks'. Where you might otherwise enter a customer, a purchase order, or information about inventory items into your system one at a time, bulk loading takes a file of this same sort of information and loads hundreds, thousands, or millions of records in a short period of time.

If you convert from one kind of DBMS to another, you would not want to re-enter all the information from the old DB into the new DB record by record. Instead, you would dump the information from the old DB to a file in a format that can be easily read by the new DB and then import that data into the new DB.

That's what bulk loading entails (at the 35,000-foot level, anyway).
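
The dump-then-import flow described above might look something like the sketch below. It uses Python's built-in `sqlite3` and `csv` modules purely as stand-ins for the "old" and "new" DBMS. The `customers` table and the helper names `dump_customers`, `load_customers`, and `migrate` are assumptions made for illustration. Production systems usually ship a dedicated bulk loader (for example `COPY` in PostgreSQL or `LOAD DATA INFILE` in MySQL), which this sketch does not attempt to reproduce.

```python
import csv
import os
import sqlite3
import tempfile


def dump_customers(old_db: sqlite3.Connection, path: str) -> None:
    """Dump the old DB's rows to a CSV file the new DB can read easily."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        csv.writer(f).writerows(
            old_db.execute("SELECT id, name FROM customers ORDER BY id"))


def load_customers(new_db: sqlite3.Connection, path: str) -> int:
    """Import the whole file in a single transaction and one batched call."""
    with open(path, newline="", encoding="utf-8") as f:
        rows = [(int(rec[0]), rec[1]) for rec in csv.reader(f)]
    with new_db:  # commits once at the end, or rolls back on error
        new_db.executemany(
            "INSERT INTO customers (id, name) VALUES (?, ?)", rows)
    return len(rows)


def migrate(n: int) -> int:
    """Create n hypothetical customers in an old DB and move them over."""
    old_db = sqlite3.connect(":memory:")
    new_db = sqlite3.connect(":memory:")
    for db in (old_db, new_db):
        db.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
    with old_db:
        old_db.executemany(
            "INSERT INTO customers (id, name) VALUES (?, ?)",
            ((i, f"customer-{i}") for i in range(n)))
    fd, path = tempfile.mkstemp(suffix=".csv")
    os.close(fd)  # only the path is needed; the file is reopened by name
    try:
        dump_customers(old_db, path)
        load_customers(new_db, path)
    finally:
        os.remove(path)
    count = new_db.execute("SELECT COUNT(*) FROM customers").fetchone()[0]
    old_db.close()
    new_db.close()
    return count
```

Calling `migrate(1000)` moves a thousand rows through the intermediate file and returns the row count found in the new database.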

Answered by vonPryz

Bulk loading is used to import/export large amounts of data. Bulk operations are usually not logged, and transactional integrity might not work as expected. They often bypass triggers and integrity checks such as constraints. For large amounts of data, this improves performance quite significantly.
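
To make the trade-off around bypassed integrity checks concrete, here is a small sketch using SQLite through Python's `sqlite3` module. It is only an illustration of the general idea, not a description of how any particular bulk loader behaves: the `customers` and `orders` tables and the `load_orders` helper are hypothetical, and SQLite's `PRAGMA foreign_keys` switch stands in for whatever constraint-skipping option a given DBMS offers.

```python
import sqlite3
from typing import Iterable, Tuple


def load_orders(rows: Iterable[Tuple[int, int]], enforce: bool) -> Tuple[bool, int]:
    """Load (order_id, customer_id) rows with foreign keys on or off.

    Returns (loaded, violations): whether the batch was committed, and how
    many rows a post-load foreign key check reports as broken.
    """
    conn = sqlite3.connect(":memory:")
    # Must be set outside a transaction; SQLite ignores it otherwise.
    conn.execute(f"PRAGMA foreign_keys = {'ON' if enforce else 'OFF'}")
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE orders ("
        " id INTEGER PRIMARY KEY,"
        " customer_id INTEGER NOT NULL REFERENCES customers(id))")
    conn.execute("INSERT INTO customers (id) VALUES (1)")
    conn.commit()
    try:
        with conn:  # the whole batch commits or rolls back together
            conn.executemany(
                "INSERT INTO orders (id, customer_id) VALUES (?, ?)", rows)
        loaded = True
    except sqlite3.IntegrityError:
        loaded = False  # a row referenced a customer that does not exist
    # Validate after the fact, as you would following an unchecked load.
    violations = conn.execute("PRAGMA foreign_key_check").fetchall()
    conn.close()
    return loaded, len(violations)
```

With the rows `[(1, 1), (2, 99)]`, the enforced load is rejected as a whole because customer 99 does not exist, while the unchecked load succeeds and leaves one orphaned order for the post-load check to find. The speed gained by skipping checks comes with the obligation to validate the data some other way.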

Answered by BSharp

One thing to remember is that bulk loading implies that the data content is the same at the source and the target, but this is only true if the source system is quiesced. For any data source, and especially for large data sets, the source data can change after it has been read, while the data transfer is still happening. Traditionally, online systems have to either go offline or suspend updates if an exact point-in-time capture that matches the source is required.
