oracle 性能:子查询或连接

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/2871576/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-09-18 20:39:54  来源:igfitidea点击:

Performance: Subquery or Joining

sqloracle

提问by domiSchenk

I got a little question about performance of a subquery / joining another table

我有一个关于子查询/加入另一个表的性能的小问题

INSERT
INTO Original.Person
  (
    PID, Name, Surname, SID
  )
  (
    SELECT ma.PID_new , TBL.Name , ma.Surname, TBL.SID 
    FROM Copy.Person TBL , original.MATabelle MA
    WHERE TBL.PID         = p_PID_old
      AND TBL.PID         = MA.PID_old
  );

This is my SQL, now this thing runs around 1 million times or more. My question is what would be faster?

这是我的 SQL,现在这个东西运行了大约 100 万次或更多。我的问题是什么会更快?

  • If I change TBL.SIDto (Select new from helptable where old = tbl.sid)
  • 如果我TBL.SID改为 ( Select new from helptable where old = tbl.sid)

OR

或者

  • If I add the 'HelpTable' to the fromand do the joining in the where?
  • 如果我加入“HelpTable”的from和做的加盟where

edit1
Well, this script runs only as much as there r persons.

edit1
嗯,这个脚本只运行多少人。

My program has 2 modules one that populates MaTabelleand one that transfers data. This program does merge 2 databases together and coz of this, sometimes the same Key is used.
Now I'm working on a solution that no duplicate Keys exists.

我的程序有 2 个模块,一个用于填充MaTabelle,另一个用于传输数据。该程序确实将 2 个数据库合并在一起,因此有时使用相同的 Key。
现在我正在研究不存在重复密钥的解决方案。

My solution is to make a 'HelpTable'. The owner of the key(SID) generates a new key and writes it into a 'HelpTable'. All other tables that use this key can read it from the 'HelpTable'.

我的解决方案是制作一个“HelpTable”。key( SID)的所有者生成一个新密钥并将其写入“HelpTable”。使用此键的所有其他表都可以从“HelpTable”中读取它。

edit2
Just got something in my mind:
if a table as a Key that can be null(foreignkey that is not linked) then this won't work with the from or?

edit2
我想到了一些东西:
如果一个表作为可以为空的键(未链接的外键),那么这将不适用于 from 或?

回答by Marcus Adams

Modern RDBMs, including Oracle, optimize most joins and sub queries down to the same execution plan.

现代 RDBM,包括 Oracle,将大多数连接和子查询优化到相同的执行计划。

Therefore, I would go ahead and write your query in the way that is simplest for you and focus on ensuring that you've fully optimized your indexes.

因此,我会继续以对您来说最简单的方式编写您的查询,并专注于确保您已经完全优化了索引。

If you provide your final query and your database schema, we might be able to offer detailed suggestions, including information regarding potential locking issues.

如果您提供最终查询和数据库架构,我们可能会提供详细的建议,包括有关潜在锁定问题的信息。

Edit

编辑

Here are some general tips that apply to your query:

以下是一些适用于您的查询的一般提示:

  • For joins, ensure that you have an index on the columns that you are joining on. Be sure to apply an index to the joined columns in both tables. You might think you only need the index in one direction, but you should index both, since sometimes the database determines that it's better to join in the opposite direction.
  • For WHERE clauses, ensure that you have indexes on the columns mentioned in the WHERE.
  • For inserting many rows, it's best if you can insert them all in a single query.
  • For inserting on a table with a clustered index, it's best if you insert with incremental values for the clustered index so that the new rows are appended to the end of the data. This avoids rebuilding the index and often avoids locks on the existing records, which would slow down SELECT queries against existing rows. Basically, inserts become less painful to other users of the system.
  • 对于连接,请确保您在要连接的列上有一个索引。确保对两个表中的连接列应用索引。您可能认为只需要一个方向的索引,但您应该同时索引两个方向,因为有时数据库会确定最好在相反方向加入。
  • 对于 WHERE 子句,请确保您在 WHERE 中提到的列上有索引。
  • 要插入多行,最好将它们全部插入到单个查询中。
  • 要在具有聚集索引的表上插入,最好为聚集索引插入增量值,以便将新行附加到数据的末尾。这避免了重建索引并经常避免对现有记录的锁定,这会减慢对现有行的 SELECT 查询。基本上,插入对系统的其他用户来说变得不那么痛苦了。

回答by ovais.tariq

Joining would be much faster than a subquery

加入会比子查询快得多

回答by Space

The main difference betwen subquery and join is subquery is faster when we have to retrieve data from large number of tables.Because it becomes tedious to join more tables. join is faster to retrieve data from database when we have less number of tables.

子查询和连接之间的主要区别是当我们必须从大量表中检索数据时子查询更快。因为连接更多表变得乏味。当我们的表数量较少时,join 可以更快地从数据库中检索数据。

Also, this joins vs subquerycan give you some more info

此外,此连接与子查询可以为您提供更多信息

回答by Rob van Wijk

Instead of focussing on whether to use join or subquery, I would focus on the necessity of doing 1,000,000 executions of that particular insert statement. Especially as Oracle's optimizer -as Marcus Adams already pointed out- will optimize and rewrite your statements under the covers to its most optimal form.

我不会关注是使用连接还是子查询,而是关注执行 1,000,000 次特定插入语句的必要性。特别是因为 Oracle 的优化器 - 正如 Marcus Adams 已经指出的那样 - 将在幕后优化和重写您的语句,使其成为最佳形式。

Are you populating MaTabelle 1,000,000 times with only a few rows and issue that statement? If yes, then the answer is to do it in one shot. Can you provide some more information on your process that is executing this statement so many times?

您是否仅用几行填充 MaTabelle 1,000,000 次并发出该语句?如果是,那么答案是一次性完成。您能否提供有关多次执行此语句的进程的更多信息?

EDIT: You indicate that this insert statement is executed for every person. In that case the advice is to populate MATabelle first and then execute once:

编辑:您指出此插入语句是为每个人执行的。在这种情况下,建议先填充 MATabelle,然后执行一次:

INSERT 
INTO Original.Person 
  ( 
    PID, Name, Surname, SID 
  ) 
  ( 
    SELECT ma.PID_new , TBL.Name , ma.Surname, TBL.SID  
    FROM Copy.Person TBL , original.MATabelle MA 
    WHERE TBL.PID         = MA.PID_old 
  );

Regards, Rob.

问候,罗伯。