Note: this page is a Chinese-English side-by-side translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original: http://stackoverflow.com/questions/19043747/

Date: 2020-09-08 09:05:36  Source: igfitidea

What is the actual difference between Data Warehouse & Big Data?

Tags: database, bigdata, data-warehouse

Asked by Aditya

I know what a Data Warehouse is and what Big Data is, but I am confused about Data Warehouse vs. Big Data. Are they the same thing under different names, or are they different (conceptually and physically)?

Answered by Uli Bethke

I know that this is an older thread, but there have been some developments in the last year or so. Comparing the data warehouse to Hadoop is like comparing apples to oranges. The data warehouse is a concept: clean, integrated data of high quality. I don't think the need for a data warehouse will go away anytime soon. Hadoop, on the other hand, is a technology: a distributed compute framework for processing large volumes of data. In the past, data warehouses were typically built on relational databases and data warehouse appliances. However, over the last couple of years various limitations of the RDBMS have emerged (exploding license costs in the face of growing data volumes, and a poor fit for querying graphs and hierarchies or ingesting unstructured data types, etc.). At the same time, MPP SQL query engines for Hadoop, such as Apache Drill, have appeared that make it possible to query data that sits on Hadoop.

If you are interested in all of the details, I have written a whole series of posts on the subject: "Data Warehousing in the age of big data. The end of an era?"

Answered by Alireza Fattahi

I found this article, http://www.b-eye-network.com/view/17017, which describes the difference between big data and a data warehouse:

When we compare a big data solution to a data warehouse, what do we find? We find that a big data solution is a technology and that data warehousing is an architecture. They are two very different things. A technology is just that – a means to store and manage large amounts of data. A data warehouse is a way of organizing data so that there is corporate credibility and integrity. When someone takes data from a data warehouse, that person knows that other people are using the same data for other purposes. There is a basis for reconcilability of data when there is a data warehouse.

Answered by gazgas

I think you will find the following article very useful.

It's important to divide the techniques of data warehousing from the implementation. Hadoop (and the advent of NoSQL databases) will augur the demise of data warehousing appliances and the "traditional" single database implementation of a data warehouse.
It is safe to say that traditional, single server relational databases or database appliances are not the future of big data or data warehouses.
On the other hand, the techniques of data warehousing, including Extract-Transform-and-Load (ETL), dimensional modeling and business intelligence, will be adapted to the new Hadoop/NoSQL environments.

From: http://gcn.com/blogs/reality-check/2014/01/hadoop-vs-data-warehousing.aspx
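To make the distinction between warehousing techniques and their implementation more concrete, here is a minimal sketch of ETL into a small dimensional (star schema) model, followed by a BI-style report. It is only an illustration: the two source systems, all table and column names, and the sample records are made up, and SQLite merely stands in for whatever engine (relational or Hadoop/NoSQL-based) would actually host the warehouse.

```python
import sqlite3

# Hypothetical raw records from two source systems (names and fields are made up).
CRM_ORDERS = [
    {"customer": "Alice ", "product": "Widget", "amount": "19.90", "date": "2014-01-03"},
    {"customer": "bob", "product": "Gadget", "amount": "5.00", "date": "2014-01-03"},
]
SHOP_ORDERS = [
    ("ALICE", "Gadget", 5.0, "2014-01-04"),
]


def extract():
    """Extract: pull rows from both sources into one common tuple shape."""
    for r in CRM_ORDERS:
        yield r["customer"], r["product"], r["amount"], r["date"]
    yield from SHOP_ORDERS


def transform(row):
    """Transform: clean customer names and convert amounts to integer cents."""
    customer, product, amount, date = row
    return customer.strip().title(), product, round(float(amount) * 100), date


def load(conn, rows):
    """Load: fill two dimension tables and one fact table (a tiny star schema)."""
    conn.executescript("""
        CREATE TABLE dim_customer (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
        CREATE TABLE dim_product  (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
        CREATE TABLE fact_sales (
            customer_id  INTEGER REFERENCES dim_customer(id),
            product_id   INTEGER REFERENCES dim_product(id),
            sale_date    TEXT,
            amount_cents INTEGER
        );
    """)
    for customer, product, cents, date in rows:
        # Dimension rows are inserted once; duplicates are ignored via UNIQUE.
        conn.execute("INSERT OR IGNORE INTO dim_customer (name) VALUES (?)", (customer,))
        conn.execute("INSERT OR IGNORE INTO dim_product (name) VALUES (?)", (product,))
        # The fact row references the dimension keys looked up by name.
        conn.execute(
            """INSERT INTO fact_sales
               SELECT c.id, p.id, ?, ?
               FROM dim_customer c, dim_product p
               WHERE c.name = ? AND p.name = ?""",
            (date, cents, customer, product),
        )


def revenue_by_customer(conn):
    """A typical BI-style report: total revenue (in cents) per customer."""
    return conn.execute("""
        SELECT c.name, SUM(f.amount_cents)
        FROM fact_sales f JOIN dim_customer c ON c.id = f.customer_id
        GROUP BY c.name ORDER BY c.name
    """).fetchall()


conn = sqlite3.connect(":memory:")
load(conn, (transform(r) for r in extract()))
report = revenue_by_customer(conn)
print(report)  # [('Alice', 2490), ('Bob', 500)]
```

The same three steps and the same fact/dimension layout could, in principle, be carried over to a different storage or query engine, which is the point the quoted article makes about techniques outliving a particular implementation.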

Answered by Kai Wähner

Answered by Alex Phell

Maybe this viewpoint can help you: basically, a Data Warehouse is an architecture, while Big Data is a technology. The former has been a well-known trend for the last 20 years, while the latter gained popularity only in the last decade.

Big Data and Data Warehouse are both used for reporting and can be called subject-oriented technologies. This means that they aim to provide information about a certain subject (e.g. a customer, supplier, employee or even a product). A Data Warehouse is more advanced when it comes to holistic data analysis, while the main advantage of Big Data is that you can gather and process information from almost all well-known sources (e.g. social media or even specific machine data).

More here: gbksoft.com/blog/big-data-and-data-warehouse/

Answered by Que14

The warehouse stores the actual data; it holds part of the data of the entire cluster. A Data Warehouse is a system used for reporting and data analysis. It is a central repository of integrated data from one or more disparate sources. It stores current and historical data in one single place, where the data is used for creating analytical reports.

vs.

Big data refers to large-scale data that is generated in a digital environment. It is generally large in size and has a short generation cycle. It includes not only numeric data but also text and image data. The big data environment is more diverse than previous ones. Because the data types are diverse and the volume is huge, it is even possible to analyze and predict people's opinions and behaviors. In addition, Machbase database will launch an enterprise edition which has a warehouse concept.
