Oracle: Is it really better to use normalized tables?

Notice: this page is a Chinese-English translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/540136/

Time: 2020-09-18 17:48:37  Source: igfitidea

Tags: sql-server, database, oracle, normalization

Asked by Invisible Coder

I heard my team leader say that in some past projects they had to do away with normalization to make the queries faster.

I think it may have something to do with table unions.

Is having many lean tables really less efficient than having a few fat tables?

Answered by MrTelly

It depends ... joining tables is inherently slower than having one big table that is 'pre-joined', i.e. de-normalised. However, by de-normalising you're going to create data duplication and your tables are going to be larger. Normalisation is seen as a good thing because it creates databases that can answer 'any' question: if it is properly done, you can build a select to get to your data. This is not the case in some other forms of DB, and those are now (mostly) historic irrelevancies; the normalised/relational DB won that battle.
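
To make the trade-off concrete, here is a minimal sketch using Python's built-in sqlite3 module. The tables customers, orders and orders_flat and their contents are made up for illustration; they are not from the question. It only shows that the joined query and the 'pre-joined' table return the same rows, and that the flat table repeats the customer name on every order row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Normalised: each customer name is stored exactly once.
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY,
                         customer_id INTEGER REFERENCES customers(id),
                         amount REAL);
    INSERT INTO customers VALUES (1, 'Alice'), (2, 'Bob');
    INSERT INTO orders VALUES (1, 1, 10.0), (2, 1, 25.0), (3, 2, 7.5);

    -- De-normalised ("pre-joined"): the name is copied onto every order row.
    CREATE TABLE orders_flat AS
        SELECT o.id AS order_id, c.name AS customer_name, o.amount
        FROM orders o JOIN customers c ON c.id = o.customer_id;
""")

# Same answer, once via a join and once from the flat table.
joined = conn.execute("""
    SELECT o.id, c.name, o.amount
    FROM orders o JOIN customers c ON c.id = o.customer_id
    ORDER BY o.id
""").fetchall()
flat = conn.execute(
    "SELECT order_id, customer_name, amount FROM orders_flat ORDER BY order_id"
).fetchall()

# The duplication: 'Alice' is stored once per order in the flat table.
name_copies = conn.execute(
    "SELECT COUNT(*) FROM orders_flat WHERE customer_name = 'Alice'"
).fetchone()[0]
```

Whether the flat table is actually faster for a given workload is a separate question that this sketch does not answer.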

Back to your question: using de-normalisation to make things go faster is a well-accepted technique. It's normally best to run your DB for a while so you know what to de-normalise and what to leave alone. It's also common to leave the data in its 'correct' normalised form and pull data into a set of de-normalised reporting tables on a regular basis. If that process is done as part of the report run itself, then the data is always up to date too.
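
The reporting-table approach could look roughly like the sketch below, again with Python's sqlite3. The schema, the table name sales_report and the helper refresh_report are hypothetical names chosen for this example. The normalised tables stay the source of truth, and the report table is simply rebuilt whenever a refresh runs.

```python
import sqlite3

def refresh_report(conn):
    """Rebuild the de-normalised reporting table from the normalised tables."""
    with conn:  # commit the rebuild as a single transaction
        conn.execute("DROP TABLE IF EXISTS sales_report")
        conn.execute("""
            CREATE TABLE sales_report AS
            SELECT c.name AS customer_name,
                   COUNT(o.id) AS order_count,
                   SUM(o.amount) AS total
            FROM customers c JOIN orders o ON o.customer_id = c.id
            GROUP BY c.id, c.name
        """)

def read_report(conn):
    return conn.execute(
        "SELECT customer_name, order_count, total "
        "FROM sales_report ORDER BY customer_name"
    ).fetchall()

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'Alice'), (2, 'Bob');
    INSERT INTO orders VALUES (1, 1, 10.0), (2, 1, 25.0), (3, 2, 7.5);
""")

refresh_report(conn)
first = read_report(conn)

# A new order lands in the normalised tables only ...
conn.execute("INSERT INTO orders VALUES (4, 2, 2.5)")
stale = read_report(conn)      # ... so the report is stale until the next refresh.

refresh_report(conn)           # refreshing as part of the report run
fresh = read_report(conn)      # brings it up to date again
```

Calling refresh_report at the start of each report run corresponds to the "always up to date" variant; running it on a schedule trades freshness for cheaper reads.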

As an example of over-normalisation, I've seen DBs in the past where the days of the week and the months of the year were pulled out into separate tables, so that dates themselves were normalised. You can go too far.

Answered by Eric King

You should do some research on the differences between OLTP (Online Transaction Processing) and OLAP (Online Analytical Processing) databases.

In short, database systems which are concerned primarily with recording transactions (OLTP) are usually structured in a more normalized fashion, reducing data duplication and easing the creation and updating of records at the expense of optimized data retrieval.

Database systems which are more concerned with data retrieval and analysis (OLAP) are usually structured in a less normalized fashion, sacrificing data storage optimization so as to maximize querying and analysis speed.

Database normalization and denormalization are at the heart of this trade-off.

Answered by gimel

Jeff wrote about this, followed by a heated discussion. It is also the subject of much discussion on SO, e.g. "What's the better database design: more tables or more columns?". As others have pointed out, use common sense and do not over-normalize.

Answered by Tony Andrews

In my long experience with Oracle OLTP databases, some of them very large and busy, I can honestly say I can't remember ever having come across a case where "denormalisation for performance" was truly required. I have, however, seen many cases where someone has decided in advance that denormalisation should be applied because of their fear, uncertainty and doubt about potential performance issues. This has usually been done without any benchmarking, and invariably I find that no performance improvement has in fact been achieved, but the data maintenance code has become far more complex than it would have been.
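
One way to picture the extra maintenance burden is the small sqlite3 sketch below. The schema is invented for this example and is not taken from any of the Oracle systems mentioned. Renaming a customer touches one row in the normalised design, but every copied row in the denormalised table, and the application code has to remember to do both.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    -- Hypothetical denormalised copy that repeats the name on each order row.
    CREATE TABLE orders_flat (order_id INTEGER PRIMARY KEY,
                              customer_name TEXT, amount REAL);
    INSERT INTO customers VALUES (1, 'Alice'), (2, 'Bob');
    INSERT INTO orders VALUES (1, 1, 10.0), (2, 1, 25.0), (3, 2, 7.5);
    INSERT INTO orders_flat VALUES (1, 'Alice', 10.0), (2, 'Alice', 25.0),
                                   (3, 'Bob', 7.5);
""")

# Normalised design: the rename is a single-row update.
normalised_rows_touched = conn.execute(
    "UPDATE customers SET name = 'Alicia' WHERE id = 1"
).rowcount

# Denormalised design: every copy of the name must be found and rewritten,
# otherwise the two tables silently disagree.
flat_rows_touched = conn.execute(
    "UPDATE orders_flat SET customer_name = 'Alicia' WHERE customer_name = 'Alice'"
).rowcount

leftover_old_names = conn.execute(
    "SELECT COUNT(*) FROM orders_flat WHERE customer_name = 'Alice'"
).fetchone()[0]
```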

OLAP is a very different animal, and I'm not in a position to comment about that.

Answered by Doug McClean

This question recurs altogether too often. The primary reason is that SQL, the most popular database language by a huge margin, and all of its most popular implementations, conflate logical table design with physical table design.

The eternal answer is that you should always normalize your logical tables, but the pragmatic answer is complicated by the fact that the only way to implement certain optimizations under existing SQL implementations is to denormalize your physical table design (itself not a bad thing) which, in those implementations, requires denormalizing your logical table design.

In short, it depends. Sometimes denormalization is important for performance, but like everything else performance-related you should measure, measure, measure before you even consider going down this route.
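
As a rough illustration of what "measure" can mean in practice, the sketch below times the same aggregate against a joined pair of tables and against a pre-joined copy, using Python's sqlite3 and timeit. The schema, row counts and query are arbitrary assumptions, and timings from an in-memory SQLite database say nothing about Oracle or SQL Server; the point is only the habit of checking that both designs return the same answer and then comparing real numbers on your own system and data.

```python
import sqlite3
import timeit

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
""")
conn.executemany("INSERT INTO customers VALUES (?, ?)",
                 [(i, f"region-{i % 5}") for i in range(200)])
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                 [(i, i % 200, float(i % 50)) for i in range(5000)])
# Hypothetical pre-joined copy of the same data.
conn.execute("""
    CREATE TABLE orders_flat AS
    SELECT o.id, c.region, o.amount
    FROM orders o JOIN customers c ON c.id = o.customer_id
""")

JOIN_SQL = """SELECT c.region, SUM(o.amount)
              FROM orders o JOIN customers c ON c.id = o.customer_id
              GROUP BY c.region ORDER BY c.region"""
FLAT_SQL = """SELECT region, SUM(amount)
              FROM orders_flat GROUP BY region ORDER BY region"""

def run(sql):
    return conn.execute(sql).fetchall()

# First confirm both designs give the same answer, then time them.
same_answer = run(JOIN_SQL) == run(FLAT_SQL)
join_seconds = timeit.timeit(lambda: run(JOIN_SQL), number=20)
flat_seconds = timeit.timeit(lambda: run(FLAT_SQL), number=20)
print(f"join: {join_seconds:.4f}s  flat: {flat_seconds:.4f}s")
```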

Answered by WolfmanDragon

Performance is inversely related to the amount of normalization done on an RDBMS. That being said, the more normalized the tables are, the lower the likelihood of errors. There is also a point at which denormalization itself hurts RDBMS performance: the point where all of the data is held in one table.

Answered by Kyle Cronin

Normalization has been known to hurt performance because joins are fairly expensive. If there are N records in table X and M records in table Y, then a join of X and Y creates a temporary table with as many as N*M records. Though there are optimization tricks that the database uses to avoid generating the entire table if it's not needed, it nevertheless has to process all the records.
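
The N*M upper bound can be checked with a small sqlite3 sketch. The tables x and y and the sizes N = 6 and M = 4 are assumptions for illustration. A join with no condition pairs every row of X with every row of Y, while a join on a key that matches one row of Y per row of X returns only N rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE x (id INTEGER PRIMARY KEY, y_id INTEGER);
    CREATE TABLE y (id INTEGER PRIMARY KEY, label TEXT);
""")

N, M = 6, 4
conn.executemany("INSERT INTO y VALUES (?, ?)",
                 [(j, f"label-{j}") for j in range(M)])
conn.executemany("INSERT INTO x VALUES (?, ?)",
                 [(i, i % M) for i in range(N)])

# Worst case: no join condition, so every X row is paired with every Y row.
cross_rows = conn.execute(
    "SELECT COUNT(*) FROM x CROSS JOIN y"
).fetchone()[0]

# Typical case: each X row matches exactly one Y row through its key.
keyed_rows = conn.execute(
    "SELECT COUNT(*) FROM x JOIN y ON y.id = x.y_id"
).fetchone()[0]
```

How much work the database really does for the keyed join depends on its optimizer and the available indexes, which this sketch does not measure.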

Denormalization is the process whereby you put data that is often used together in a single table to increase performance, at the expense of some database purity. Most find it to be an acceptable trade, even going so far as to design the schema intentionally denormalized to skip the intermediary step.
