Disclaimer: this page is a Chinese-English bilingual translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/4471161/

Date: 2020-09-08 07:58:23  Source: igfitidea

What does "incremental load" mean?

Tags: database, terminology, data-warehouse

Asked by Spredzy

I regularly see the expression 'incremental loading' when reading articles

What does it really (technically) mean? What does it imply?

Explanations using use-cases are welcome.

Answered by Kris C

It generally means only loading into the warehouse the records that have changed (inserts, updates, and deletes if applicable) since the last load; as opposed to doing a full load of all the data (all records, including those that haven't changed since the last load) into the warehouse.
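One common way to find "the records that have changed since the last load" is to compare a last-modified timestamp against the time of the previous load. The sketch below assumes that approach; the field names (`updated_at`, `deleted`), the in-memory dictionaries, and both helper functions are illustrative assumptions, not part of the original answer.

```python
from datetime import datetime


def extract_changes(source_rows, last_load_time):
    """Return only the rows modified after the previous load."""
    return [row for row in source_rows if row["updated_at"] > last_load_time]


def apply_changes(warehouse, changes):
    """Apply inserts, updates and deletes to the warehouse, keyed by id."""
    for row in changes:
        if row["deleted"]:
            # The source flagged the record as deleted: drop it from the warehouse.
            warehouse.pop(row["id"], None)
        else:
            # Insert a new record or overwrite an existing one.
            warehouse[row["id"]] = {"name": row["name"]}


# Hypothetical source table: each row carries a last-modified timestamp.
source_rows = [
    {"id": 1, "name": "Alice", "updated_at": datetime(2020, 1, 1), "deleted": False},
    {"id": 2, "name": "Bob", "updated_at": datetime(2020, 1, 5), "deleted": False},
    {"id": 3, "name": "Carol", "updated_at": datetime(2020, 1, 6), "deleted": True},
]

# State of the warehouse after the previous load on 2020-01-02.
warehouse = {1: {"name": "Alice"}, 3: {"name": "Carol"}}
last_load_time = datetime(2020, 1, 2)

changes = extract_changes(source_rows, last_load_time)
apply_changes(warehouse, changes)
print(warehouse)  # {1: {'name': 'Alice'}, 2: {'name': 'Bob'}}
```

Only rows 2 and 3 are transferred here; row 1 is untouched because it has not changed since the last load. A full load would have sent all three rows.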

The advantage is that it reduces the amount of data being transferred from system to system, as a full load may take hours / days to complete depending on volume of data.

The main disadvantage is around maintainability. With a full load, if there's an error you can re-run the entire load without having to do much else in the way of cleanup / preparation. With an incremental load, the files generally need to be loaded in order. So if you have a problem with one batch, others queue up behind it until you correct it. Alternatively, you may find an error in a batch from a few days ago, and need to re-load that batch once corrected, followed by every subsequent batch, to ensure that the data in the warehouse is consistent.
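The ordering constraint can be illustrated with a small sketch. It assumes batches are numbered sequentially and that a broken batch is marked with a `corrupt` flag; both the flag and the `load_batches` helper are hypothetical and only serve to show why later batches wait behind a failed one.

```python
def load_batches(warehouse, batches, start_from=1):
    """Apply batches in ascending sequence order, stopping at the first failure.

    Returns the sequence number of the last batch applied successfully.
    """
    last_ok = start_from - 1
    for seq in sorted(batches):
        if seq < start_from:
            continue  # already applied in an earlier run
        batch = batches[seq]
        if batch.get("corrupt"):
            break  # later batches queue up behind the broken one
        warehouse.update(batch["rows"])
        last_ok = seq
    return last_ok


# Three incremental batches; the second one is broken.
batches = {
    1: {"rows": {"A": 10}},
    2: {"rows": {"A": 20}, "corrupt": True},
    3: {"rows": {"A": 30, "B": 5}},
}

warehouse = {}
last_ok = load_batches(warehouse, batches)
print(last_ok, warehouse)  # 1 {'A': 10}

# Once batch 2 is corrected, resume from it and replay everything after it.
batches[2] = {"rows": {"A": 20}}
last_ok = load_batches(warehouse, batches, start_from=last_ok + 1)
print(last_ok, warehouse)  # 3 {'A': 30, 'B': 5}
```

Skipping batch 2 and applying batch 3 first would have worked in this tiny example, but in general a later batch may depend on changes made by an earlier one, which is why the replay runs strictly in order.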

Answered by Martin

Incremental loading is used when moving data from one repository (Database) to another.

Non-incremental loading is when the entire data set from the source is pushed to the destination.

Incremental loading passes across only the new and amended data.

A concrete example:

A company may have two platforms, one that processes orders, and a separate accounting system. The accounts department enters new customer details into the accounting system but has to ensure these customers appear in the order processing system.

To do this it runs a nightly batch job that sends data from the accounting system to the order system.

If they were deleting all customer details in the order system and refilling with all the customers in the accounting system then they would be performing a non-incremental load.

If they only sent across the new customers and the customers that had been changed they would be performing an incremental load.
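The two variants of the nightly job can be sketched side by side. This is a minimal illustration that models each system as a dictionary of customer id to name; the customer data and both function names are made up for the example, and detecting changes by comparing against the target is just one possible approach.

```python
def full_load(order_system, accounting_customers):
    """Non-incremental: wipe the target and refill it with every customer."""
    order_system.clear()
    order_system.update(accounting_customers)
    return len(accounting_customers)  # number of records sent


def incremental_load(order_system, accounting_customers):
    """Incremental: send only customers that are new or whose details changed."""
    changed = {
        cid: details
        for cid, details in accounting_customers.items()
        if order_system.get(cid) != details
    }
    order_system.update(changed)
    return len(changed)  # number of records sent


# Hypothetical state before the nightly job runs.
accounting = {101: "Acme Ltd", 102: "Globex Corp", 103: "Initech"}
orders = {101: "Acme Ltd", 102: "Globex"}

sent_incremental = incremental_load(orders, accounting)
print(sent_incremental)  # 2 (customer 102 changed, customer 103 is new)

orders_full = {101: "Acme Ltd", 102: "Globex"}
sent_full = full_load(orders_full, accounting)
print(sent_full)  # 3 (every customer is resent)
```

Both runs leave the order system with the same three customers; the difference is how many records had to be transferred. Customers removed from the accounting system are not handled by this incremental sketch, whereas the full load drops them automatically.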
