Mysql 在 600 万行表上的性能

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/1114619/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-08-31 13:39:33  来源:igfitidea点击:

Mysql performance on 6 million row table

mysqlperformanceindexing

提问by pedalpete

One day I suspect I'll have to learn hadoop and transfer all this data to a non-structured database, but I'm surprised to find the performance degrade so significantly in such a short period of time.

有一天我怀疑我必须学习 hadoop 并将所有这些数据传输到非结构化数据库,但我惊讶地发现性能在如此短的时间内显着下降。

I have a mysql table with just under 6 million rows. I am doing a very simple query on this table, and believe I have all the correct indexes in place.

我有一个不到 600 万行的 mysql 表。我正在对这个表做一个非常简单的查询,并且相信我有所有正确的索引。

the query is

查询是

SELECT date, time FROM events WHERE venid='47975' AND date>='2009-07-11' ORDER BY date

the explain returns

解释返回

id  select_type     table   type    possible_keys   key     key_len     ref     rows    Extra
1   SIMPLE  updateshows     range   date_idx    date_idx    7   NULL    648997  Using where

so i am using the correct index as far as I can tell, but this query is taking 11 seconds to run.

所以据我所知,我使用了正确的索引,但是这个查询需要 11 秒才能运行。

The database is MyISAM, and phpMyAdmin says the table is 1.0GiB.

数据库是 MyISAM,phpMyAdmin 说表是 1.0GiB。

Any ideas here?

这里有什么想法吗?

Edited: The date_idx is indexes both the date and venid columns. Should those be two seperate indexes?

编辑: date_idx 是 date 和 venid 列的索引。这些应该是两个单独的索引吗?

回答by PatrikAkerstrand

What you want to make sure is that the query will use ONLY the index, so make sure that the index covers all the fields you are selecting. Also, since it is a range query involved, You need to have the venid first in the index, since it is queried as a constant. I would therefore create and index like so:

您要确保查询将仅使用索引,因此请确保索引涵盖您选择的所有字段。此外,由于它涉及范围查询,因此您需要首先在索引中使用 venid,因为它是作为常量进行查询的。因此,我会像这样创建和索引:

ALTER TABLE events ADD INDEX indexNameHere (venid, date, time);

With this index, all the information that is needed to complete the query is in the index. This means that, hopefully, the storage engine is able to fetch the information without actually seeking inside the table itself. However, MyISAM might not be able to do this, since it doesn't store the data in the leaves of the indexes, so you might not get the speed increase you desire. If that's the case, try to create a copy of the table, and use the InnoDB engine on the copy. Repeat the same steps there and see if you get a significant speed increase. InnoDB doesstore the field values in the index leaves, and allow covering indexes.

有了这个索引,完成查询所需的所有信息都在索引中。这意味着,希望存储引擎能够在不实际查找表本身的情况下获取信息。但是,MyISAM 可能无法执行此操作,因为它不会将数据存储在索引的叶子中,因此您可能无法获得所需的速度提升。如果是这种情况,请尝试创建表的副本,并在副本上使用 InnoDB 引擎。在那里重复相同的步骤,看看速度是否有显着提高。InnoDB确实将字段值存储在索引叶中,并允许覆盖索引。

Now, hopefully you'll see the following when you explain the query:

现在,希望您在解释查询时会看到以下内容:

mysql> EXPLAIN SELECT date, time FROM events WHERE venid='47975' AND date>='2009-07-11' ORDER BY date;

id  select_type table  type  possible_keys        key       [..]  Extra
1   SIMPLE   events range date_idx, indexNameHere indexNameHere   Using index, Using where

回答by Greg

Try adding a key that spans venid and date (or the other way around, or both...)

尝试添加一个跨越 venid 和 date 的密钥(或相反,或两者兼而有之......)

回答by MarkR

I would imagine that a 6M row table should be able to be optimised with quite normal techniques.

我想应该能够使用非常正常的技术优化 6M 行表。

I assume that you have a dedicated database server, and it has a sensible amount of ram (say 8G minimum).

我假设你有一个专用的数据库服务器,它有一个合理的内存量(比如最小 8G)。

You will want to ensure you've tuned mysql to use your ram efficiently. If you're running a 32-bit OS, don't. If you are using MyISAM, tune your key buffer to use a signficiant proportion, but not too much, of your ram.

您需要确保您已经调整了 mysql 以有效地使用您的 ram。如果您运行的是 32 位操作系统,请不要这样做。如果您正在使用 MyISAM,请调整您的密钥缓冲区以使用您的 ram 的重要比例,但不要太多。

In any case you want to run repeated performance testing on production-grade hardware.

在任何情况下,您都希望在生产级硬件上运行重复的性能测试。

回答by Lucas Jones

Try putting an index on the venidcolumn.

尝试在venid列上放置索引。