sql 查询连接多个表 - 太慢(8 个表)

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/806015/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-09-01 01:53:54  来源:igfitidea点击:

sql query joins multiple tables - too slow (8 tables)

sqlperformancejoin

提问by

i'm trying to join 8 tables into one in order to create index used by other application, my query is like : (my mysql skill's very amateur)

我正在尝试将 8 个表合并为一个以创建其他应用程序使用的索引,我的查询如下:(我的 mysql 技能非常业余)

SELECT t1_id, t2_name, t3_name, t4_name, t5_name, 
       t6_name, t7_name, t8_name, t9_name 
FROM t1 
  LEFT JOIN t2 ON (t1_id = t2_id) 
  LEFT JOIN t3 ON (t3_id = t1_id) 
  LEFT JOIN t4 ON (t4_id = t1_id)
  LEFT JOIN t5 ON (t5_id = t1_id)
  LEFT JOIN t6 ON (t6_id = t1_id) 
  LEFT JOIN t7 ON (t7_id = t1_id)
  LEFT JOIN t8 ON (t8_id = t1_id)
  LEFT JOIN t9 ON (t9_id = t1_id)

i can't even see the query results when i executing it, any ways to speed it up? :) any kinds of help is appreciated, but it's better be only one query (outside application rules)

当我执行它时,我什至看不到查询结果,有什么方法可以加快速度吗?:) 任何类型的帮助表示赞赏,但最好只有一个查询(应用程序规则之外)

thanks in advance

提前致谢

回答by

I had a similar problem with several lookup tables joining to a large table with all id fields indexed. To monitor the effect of the joins on query time execution, I ran my query several times (limiting to first 100 rows), adding a Join to an additional table each time. After joining 12 tables, there was no significant change in query execution time. By the time I had joined the 13th table the execution time jumped to a 1 second; 14th table 4 seconds, 15th table 20 s, 16th 90 seconds.

我有几个查找表连接到一个所有 id 字段都已索引的大表的类似问题。为了监控连接对查询时间执行的影响,我多次运行我的查询(限制为前 100 行),每次都向附加表添加一个连接。加入12张表后,查询执行时间没有明显变化。当我加入第 13 个表时,执行时间跃升至 1 秒;第14桌4秒,第15桌20秒,第16桌90秒。

Keijro's suggestion to use a correlated subqueries instead of joins e.g.

Keijro 建议使用相关子查询而不是连接,例如

SELECT t1_id, 
        (select t2_name from t2 where t1_id = t2_id), 
        (select t3_name from t3 where t1_id = t3_id), 
        (select t4_name from t4 where t1_id = t4_id), 
        (select t5_name from t5 where t1_id = t5_id), 
        (select t6_name from t6 where t1_id = t6_id), 
        (select t7_name from t7 where t1_id = t7_id), 
        (select t8_name from t8 where t1_id = t8_id), 
        (select t9_name from t9 where t1_id = t9_id)  FROM t1

improved query performance dramatically. In fact the subqueries did not seem to lengthen the time to execute the query (the query was almost instanteous).

显着提高了查询性能。事实上,子查询似乎并没有延长执行查询的时间(查询几乎是即时的)。

I am a little suprised as I thought correlated subqueries perform worse than joins.

我有点惊讶,因为我认为相关子查询的性能比连接差。

回答by Nathan Ridley

Depending on how much data is in the tables, you may need to place indexes on the columns that are being joined against. Often slow querying speed comes down to lack of an index in the right place.

根据表中有多少数据,您可能需要在要连接的列上放置索引。缓慢的查询速度通常归结为在正确的位置缺少索引。

Also:

还:

LEFT JOINs are slower than INNER JOINs (though this is dependent on what you're doing exactly) - can you accomplish what you're looking for with inner joins?

LEFT JOINs 比 INNER JOINs 慢(尽管这取决于你正在做什么) - 你能用内部联接完成你正在寻找的东西吗?

回答by Jimmy Stenke

It would help a bit if you could post the explain plan of the query.

如果您可以发布查询的解释计划,那会有所帮助。

But, first of all, you have indexes on all the fields used in the join? something like CREATE INDEX ix_t2_id on t2 (t2_id, t2_name);

但是,首先,您对连接中使用的所有字段都有索引吗?就像是CREATE INDEX ix_t2_id on t2 (t2_id, t2_name);

Instead of the joins you could do something like

而不是连接,你可以做类似的事情

SELECT t1_id, 
    (select t2_name from t2 where t1_id = t2_id), 
    (select t3_name from t3 where t1_id = t3_id), 
    (select t4_name from t4 where t1_id = t4_id), 
    (select t5_name from t5 where t1_id = t5_id), 
    (select t6_name from t6 where t1_id = t6_id), 
    (select t7_name from t7 where t1_id = t7_id), 
    (select t8_name from t8 where t1_id = t8_id), 
    (select t9_name from t9 where t1_id = t9_id) 
FROM t1 

But, with a good query planner, that shouldn't differ from the joins.

但是,有了一个好的查询规划器,这应该与连接没有区别。

回答by u07ch

How much data are we talking about ? It might be you have a lot of data and as the where clause is being run at the end of the query process you are joining huge volumes of data before filtering it.

我们在谈论多少数据?可能是您有大量数据,并且在查询过程结束时运行 where 子句时,您在过滤之前加入了大量数据。

In that case its better to filter the data as soon as possible so if you can restrict the data from T1 in the first inner select all the other joins will join to a more limited set of data.

在这种情况下,最好尽快过滤数据,因此如果您可以限制第一个内部选择中来自 T1 的数据,则所有其他连接将连接到更有限的数据集。

Select <your fields> from
(
Select * from t1 where t1_id = t1_value
) t1

Inner join t2
on t1.ID = t2.ID
...

if its not masses of data; check your indexes are correct then check server type things; index fragmentation; disk queues etc.

如果不是大量数据;检查您的索引是否正确,然后检查服务器类型;索引碎片化;磁盘队列等

回答by Dries Van Hansewijck

If you need all the rows of t1, and you left join on the primary key (I guess it's also the clustered index) of the other tables, there is no way to improve the speed of the query.

如果你需要t1的所有行,并且你在其他表的主键(我猜它也是聚集索引)上留下join,则没有办法提高查询速度。

To improve performance you either need to reduce the result set or perform a nasty trick (eg make a denormalized copy of the data).

为了提高性能,您要么需要减少结果集,要么执行一个讨厌的技巧(例如,制作数据的非规范化副本)。

回答by Quassnoi

From your query plan I can conclude that the tables referred to as s, nand qdo not have an index on the field they are being joined on.

从您的查询计划中,我可以得出结论,这些表被称为sn并且q在它们所连接的字段上没有索引。

Since there are lot of rows in these tables (about 400,000rows in their cartesian product) and MySQL's only way to do JOIN's is using NESTED LOOPS, it will really take forever.

由于这些表中有很多行(关于400,000笛卡尔积中的行)并且MySQL唯一的方法JOIN是使用NESTED LOOPS,它真的需要永远。

Create an index on these tables or define the joined field as a PRIMARY KEY.

在这些表上创建索引或将连接的字段定义为PRIMARY KEY.

回答by Vikram

As i can see, t1 table is the one which is being joined with all the tables, instead of putting them in a single query with so many joins, you can possibly try a Union of different queries something like this.

正如我所看到的,t1 表是与所有表连接的表,而不是将它们放在具有如此多连接的单个查询中,您可以尝试像这样的不同查询的联合。

SELECT  t1_id, t2_name 
FROM    t1 LEFT JOIN t2 ON (t1_id = t2_id)
union 
SELECT  t1_id, t3_name 
FROM    t1 LEFT JOIN t3 ON (t1_id = t3_id)

however, in that case the result you will get will not have 8 columns but just 1 column. not sure if that is an option available with you.

但是,在这种情况下,您得到的结果将不会有 8 列,而只有 1 列。不确定这是否是您可用的选项。

one more thing, which you must in whatever solution you implement is - create appropriate index on all your tables. the best practice of index columns is to create it on the column which is most frequently used for joins or where clause.

还有一件事,无论您实施什么解决方案,您都必须这样做 - 在所有表上创建适当的索引。索引列的最佳实践是在最常用于连接或 where 子句的列上创建它。

回答by Doug L.

Depending on your version of SQL server, simply putting your query into a stored procedure may make a big difference. Try this after you have tried the other optimizations first.(Yes, I know there are cached execution plans and other internal server optimizations, but in my practical real-world experience, stored procedures can execute faster.)

根据您的 SQL Server 版本,简单地将您的查询放入存储过程可能会产生很大的不同。先尝试其他优化后再试试这个。(是的,我知道有缓存执行计划和其他内部服务器优化,但在我实际的实际经验中,存储过程可以执行得更快。)