MySQL SQL:“NOT IN”子查询优化或替代方案

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/3645794/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-08-31 17:04:24  来源:igfitidea点击:

SQL: "NOT IN" subquery optimization or alternatives

mysqlsubquery

提问by krn

I have two database tables: "places" and "translations". The translations of places names are made by selecting records from "places", which don't have the translations to the specified language yet:

我有两个数据库表:“地点”和“翻译”。地名的翻译是通过从“地方”中选择记录来进行的,这些记录还没有指定语言的翻译:

SELECT `id`, `name`
FROM `places`
WHERE `id` NOT IN (SELECT `place_id` FROM `translations` WHERE `lang` = 'en')

This worked fine with 7 000 records of places, but crashed when the number of translations reached 5 000. Since then, the query takes about 10 seconds and returns the error:

这对 7 000 条地点记录运行良好,但在翻译数量达到 5 000 时崩溃。从那时起,查询大约需要 10 秒并返回错误:

2006 - MySQL server has gone away

2006 - MySQL 服务器消失了

As I understand, the main problem here is the subquery returning to many results, bu how could I solve it, if I need to select all the places which are not translated yet?

据我了解,这里的主要问题是子查询返回许多结果,但是如果我需要选择所有尚未翻译的地方,我该如何解决?

My plan B is to create a new boolean field in "places" table, called "translated", and reset it to "false", each time I change language - that would prevent for having subquery. However, maybe I could just modify my current SQL statement and prevent from adding additional field?

我的计划 B 是在“places”表中创建一个新的布尔字段,称为“translated”,并在每次更改语言时将其重置为“false”——这将阻止子查询。但是,也许我可以修改我当前的 SQL 语句并防止添加其他字段?

回答by Tomalak

The obvious alternative:

明显的替代方案:

SELECT
  `id`, `name`
FROM
  `places`
WHERE 
  NOT EXISTS (
    SELECT 1 FROM `translations` WHERE `id` = `places`.`id` AND `lang` = 'en'
  )

There should be a clustered composite index over (translations.id, translations.lang)(composite means: a single index over multiple fields, clustered means: the index governs how the table is sorted).

应该有一个聚集的复合索引(translations.id, translations.lang)(复合意味着:多个字段上的单个索引,聚集意味着:索引控制表的排序方式)。

回答by Diogo Alves

In this case, I believe that the best alternative is to do 2 separate queries. Storing the result of the first in a variable and using the second.

在这种情况下,我认为最好的选择是进行 2 个单独的查询。将第一个的结果存储在变量中并使用第二个。