Performance of "IN" in MySQL

Question

Asked by Roy

I'm running a MySQL query in two steps. First, I get a list of ids with one query, and then I retrieve the data for those ids using a second query along the lines of SELECT * FROM data WHERE id IN (id1, id2 ...). I know it sounds hacky, but I've done it this way because the queries are very complicated: the first involves lots of geometry and trigonometry, the second lots of different joins. I'm sure they could be written as a single query, but my MySQL isn't good enough to pull it off.
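For illustration, the two-step pattern described above could be sketched as follows. This is a minimal sketch, not the asker's actual code: SQLite (via Python's built-in sqlite3 module) stands in for MySQL so that it runs without a server, and the table layout, the `payload` column, the helper names `fetch_ids` and `fetch_rows`, and the `id % 25 = 0` filter are all hypothetical placeholders for the real, more complicated queries.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL so the sketch runs without a server.
# The table layout and the "payload" column are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (id INTEGER PRIMARY KEY, payload TEXT)")
conn.executemany(
    "INSERT INTO data (id, payload) VALUES (?, ?)",
    [(i, f"row-{i}") for i in range(1, 10001)],
)


def fetch_ids(conn):
    """Step 1: placeholder for the complicated geometry query; returns ids."""
    rows = conn.execute("SELECT id FROM data WHERE id % 25 = 0").fetchall()
    return [row[0] for row in rows]


def fetch_rows(conn, ids):
    """Step 2: fetch the full rows for those ids with a parameterized IN list."""
    if not ids:
        # MySQL rejects an empty "IN ()" list, so skip the query entirely.
        return []
    placeholders = ", ".join("?" for _ in ids)
    sql = (
        "SELECT id, payload FROM data "
        f"WHERE id IN ({placeholders}) ORDER BY id"
    )
    return conn.execute(sql, ids).fetchall()


ids = fetch_ids(conn)          # 400 ids out of 10,000 records
rows = fetch_rows(conn, ids)   # one row per id
```

Building the placeholder list from the number of ids keeps the values out of the SQL string itself, which avoids quoting problems as the list grows.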

This approach works, but it doesn't feel right; plus, I'm concerned it won't scale. At the moment I am testing on a database of 10,000 records, with 400 ids in the "IN" clause (i.e. IN (id1, id2 ... id400)), and performance is fine. But what if there are, say, 1,000,000 records?

Where are the performance bottlenecks (speed, memory, etc.) for this kind of query? Any ideas for how to refactor this kind of query would be awesome too (for example, whether it is worth swotting up on stored procedures).

Answer 1

Accepted answer by Quassnoi

Starting from a certain number of records, the IN predicate over a SELECT becomes faster than the IN predicate over a list of constants.

See this article in my blog for a performance comparison:

Passing parameters in MySQL: IN list vs. temporary table

If the column used in the query in the IN clause is indexed, like this:

SELECT  *
FROM    table1
WHERE   unindexed_column IN
        (
        SELECT  indexed_column
        FROM    table2
        )

, then this query is simply optimized to an EXISTS (which uses just one entry for each record from table1).
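To make the rewrite concrete, here is a small sketch that runs the IN form and a hand-written EXISTS form side by side and shows that they select the same rows. It is an illustration only: SQLite (via Python's sqlite3 module) stands in for MySQL, so it says nothing about how MySQL's optimizer actually executes either form, and the sample rows, the `id` column, and the index name are hypothetical.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; the sample data, the "id" column,
# and the index name are made up for the illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER PRIMARY KEY, unindexed_column INTEGER);
    CREATE TABLE table2 (indexed_column INTEGER);
    CREATE INDEX ix_table2_indexed_column ON table2 (indexed_column);
    INSERT INTO table1 (id, unindexed_column)
        VALUES (1, 10), (2, 20), (3, 30), (4, 20);
    INSERT INTO table2 (indexed_column)
        VALUES (20), (20), (30), (40);
""")

# The form from the answer: IN over a subquery on the indexed column.
in_form = conn.execute("""
    SELECT id
    FROM   table1
    WHERE  unindexed_column IN (SELECT indexed_column FROM table2)
    ORDER  BY id
""").fetchall()

# The equivalent EXISTS form: one probe of table2's index per table1 row.
exists_form = conn.execute("""
    SELECT id
    FROM   table1 AS t1
    WHERE  EXISTS (
               SELECT 1
               FROM   table2 AS t2
               WHERE  t2.indexed_column = t1.unindexed_column
           )
    ORDER  BY id
""").fetchall()
```

Note that duplicate values in table2 (here, 20 appears twice) do not duplicate rows in the result, which is the semi-join behavior both forms share.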

Unfortunately, MySQL is not capable of doing HASH SEMI JOIN or MERGE SEMI JOIN, which are even more efficient (especially if both columns are indexed).

Answer 2

Answer by Eric Hogue

Why do you extract the ids first? You should probably just join the tables. If you use the ids for something else, you can insert them into a temporary table first and then use that table for the join.
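The temporary-table variant might look roughly like the sketch below. As before, SQLite (via Python's sqlite3 module) is only a stand-in for MySQL so that the example runs as written, and the `data` layout, the `payload` column, the `wanted_ids` table name, and the three sample ids are hypothetical.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; all names and sample values
# below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (id INTEGER PRIMARY KEY, payload TEXT)")
conn.executemany(
    "INSERT INTO data (id, payload) VALUES (?, ?)",
    [(i, f"row-{i}") for i in range(1, 1001)],
)

# Pretend these ids came back from the first (complicated) query.
wanted = [3, 250, 999]

# Store the ids in a temporary table; the primary key gives the join an index.
conn.execute("CREATE TEMPORARY TABLE wanted_ids (id INTEGER PRIMARY KEY)")
conn.executemany(
    "INSERT INTO wanted_ids (id) VALUES (?)",
    [(i,) for i in wanted],
)

# Join against the temporary table instead of passing a long IN list.
joined = conn.execute("""
    SELECT d.id, d.payload
    FROM   data AS d
    JOIN   wanted_ids AS w ON w.id = d.id
    ORDER  BY d.id
""").fetchall()
```

If the ids come from a query on the same server, the temporary table could presumably be filled with a single INSERT ... SELECT rather than round-tripping the ids through the client, and it stays available for any other query that needs the same ids.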
