在一个巨大的 MySQL 生产表上创建一个没有表锁定的索引
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/4244685/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
Create an index on a huge MySQL production table without table locking
提问by n0cturnal
I need to create an index on a ~5M rows MySQL table. It is a production table, and I fear a complete block of everything if I run a CREATE INDEX statement...
我需要在约 5M 行的 MySQL 表上创建一个索引。它是一个生产表,如果我运行 CREATE INDEX 语句,我担心所有内容都会被完全阻塞...
Is there a way to create that index without blocking inserts and selects?
有没有办法在不阻塞插入和选择的情况下创建该索引?
Just wondering I have not to stop, create index and restart my system!
只是想知道我不必停止,创建索引并重新启动我的系统!
回答by Dave Dopson
[2017] Update: MySQL 5.6 has support for online index updates
[2017] 更新:MySQL 5.6 支持在线索引更新
In MySQL 5.6 and higher, the table remains available for read and write operations while the index is being created or dropped. The CREATE INDEX or DROP INDEX statement only finishes after all transactions that are accessing the table are completed, so that the initial state of the index reflects the most recent contents of the table. Previously, modifying the table while an index is being created or dropped typically resulted in a deadlock that cancelled the INSERT, UPDATE, or DELETE statement on the table.
在 MySQL 5.6 及更高版本中,在创建或删除索引时,该表仍可用于读取和写入操作。CREATE INDEX 或 DROP INDEX 语句仅在访问表的所有事务完成后才完成,以便索引的初始状态反映表的最新内容。以前,在创建或删除索引时修改表通常会导致死锁,从而取消表上的 INSERT、UPDATE 或 DELETE 语句。
[2015] Updating table indicies blocks writes in MySQL 5.5
[2015] 在 MySQL 5.5 中更新表索引块写入
From the answer above:
从上面的答案:
"If your using a version greater than 5.1 indices are created while the database is online. So not to worry you won't interrupt production system use."
“如果您使用的版本大于 5.1 索引是在数据库在线时创建的。所以不用担心您不会中断生产系统的使用。”
This is ****FALSE****(at least for MyISAM / InnoDB tables, which is what 99.999% of people out there use. Clustered Edition is different.)
这是 ****FALSE****(至少对于 MyISAM/InnoDB 表,这是 99.999% 的人使用的。集群版是不同的。)
Doing UPDATE operations on a table will BLOCKwhile the index is being created. MySQL is really, really stupid about this (and a few other things).
在创建索引时对表执行 UPDATE 操作将阻塞。MySQL 在这方面(以及其他一些事情)真的非常非常愚蠢。
Test Script:
测试脚本:
(
for n in {1..50}; do
#(time mysql -uroot -e 'select * from website_development.users where id = 41225\G'>/dev/null) 2>&1 | grep real;
(time mysql -uroot -e 'update website_development.users set bio="" where id = 41225\G'>/dev/null) 2>&1 | grep real;
done
) | cat -n &
PID=$!
sleep 0.05
echo "Index Update - START"
mysql -uroot website_development -e 'alter table users add index ddopsonfu (last_name, email, first_name, confirmation_token, current_sign_in_ip);'
echo "Index Update - FINISH"
sleep 0.05
kill $PID
time mysql -uroot website_development -e 'drop index ddopsonfu on users;'
My Server (InnoDB):
我的服务器(InnoDB):
Server version: 5.5.25a Source distribution
Output (notice how the 6th operation blocks for the ~400ms it takes to finish the index update):
输出(注意第 6 个操作是如何在大约 400 毫秒内阻塞完成索引更新的):
1 real 0m0.009s
2 real 0m0.009s
3 real 0m0.009s
4 real 0m0.012s
5 real 0m0.009s
Index Update - START
Index Update - FINISH
6 real 0m0.388s
7 real 0m0.009s
8 real 0m0.009s
9 real 0m0.009s
10 real 0m0.009s
11 real 0m0.009s
Vs read operations which don't block (swap the line comment in the script):
与不阻塞的读取操作(交换脚本中的行注释):
1 real 0m0.010s
2 real 0m0.009s
3 real 0m0.009s
4 real 0m0.010s
5 real 0m0.009s
Index Update - START
6 real 0m0.010s
7 real 0m0.010s
8 real 0m0.011s
9 real 0m0.010s
...
41 real 0m0.009s
42 real 0m0.010s
43 real 0m0.009s
Index Update - FINISH
44 real 0m0.012s
45 real 0m0.009s
46 real 0m0.009s
47 real 0m0.010s
48 real 0m0.009s
Updating MySQL's Schema without downtime
无需停机即可更新 MySQL 的架构
Thusfar, there's only one method I know of to update a MySql schema and not suffer an availability outage. Circular masters:
到目前为止,我只知道一种方法来更新 MySql 架构并且不会遭受可用性中断。循环大师:
- Master A has your MySQL database running on it
- Bring Master B into service and have it replicate writes from Master A ( B is a slave of A)
- Perform the schema update on Master B. It will fall behind during the upgrade
- Let Master B catch up. Invariant: Your schema change MUST be capable of processing commands replicated from a downversion schema. Indexing changes qualify. Simple column additions usually qualify. Removing a column? probably not.
- ATOMICALLY swap all clients from Master A to Master B. If you want to be safe (trust me, you do), you should ensure that the last write to A is replicated to B BEFOREB takes its first write. If you allow concurrent writes to 2+ masters, ... you better understand MySQL replication at a DEEP level or you are headed for a world of pain. Extreme pain. Like, do you have a column that is AUTOINCREMENT??? you are screwed (unless you use even numbers on one master and odds on the other). Do NOT trust MySQL replication to "do the right thing". It is NOT smart and will not save you. It's just slightly less safe than copying binary transaction logs from the command-line and replaying them by hand. Still, disconnecting all clients from the old master and flipping them to the new master can be done in a matter of seconds, vastly faster than waiting for a multi-hour schema upgrade.
- Now Master B is your new master. You have the new schema. Life is good. Have a beer; the worst is over.
- Repeat the process with Master A, upgrading his schema so that he becomes your new secondary master, ready to take over in the event that your primary master (master B now) loses power or just up and dies on you.
- 主 A 上运行您的 MySQL 数据库
- 将 Master B 投入使用并让它从 Master A 复制写入(B 是 A 的从属)
- 在Master B上执行schema更新,升级过程中会落后
- 让B大师赶上。不变:您的架构更改必须能够处理从向下版本架构复制的命令。索引更改符合条件。简单的列添加通常符合条件。删除列?可能不是。
- 原子地将所有客户端从主 A 交换到主 B。如果你想安全(相信我,你这样做),你应该确保对 A 的最后一次写入被复制到 B BEFOREB 进行第一次写入。如果您允许并发写入 2 个以上的 master,...您最好在 DEEP 级别了解 MySQL 复制,否则您将进入一个痛苦的世界。极度痛苦。比如,你有一个 AUTOINCREMENT 列吗???你被搞砸了(除非你在一个主上使用偶数而在另一个上使用赔率)。不要相信 MySQL 复制会“做正确的事”。它不聪明,不会拯救你。与从命令行复制二进制事务日志并手动重放它们相比,它的安全性稍差。尽管如此,将所有客户端与旧主服务器断开连接并将它们翻转到新主服务器可以在几秒钟内完成,这比等待几个小时的架构升级要快得多。
- 现在Master B 是你的新主人。你有新的架构。生活很好。喝杯啤酒;最糟糕的已经过去了。
- 对主人 A 重复这个过程,升级他的模式,让他成为你新的辅助主人,准备好在你的主要主人(现在是主人 B)失去权力或刚刚死在你身上时接管。
An easy way to update schema this isn't. Workable in a serious production environment; yes, it is. Please, please, please, if there is an easier way to add an index to a MySQL table without blocking writes, let me know.
这不是更新架构的简单方法。可在严酷的生产环境中工作;是的。请,请,请,如果有更简单的方法可以在不阻止写入的情况下向 MySQL 表添加索引,请告诉我。
Googling lead me to this articlewhich describes a similar technique. Even better, they advise drinking at the same point in the proceedure (Note that I wrote my answer before reading the article)!
谷歌搜索让我找到了这篇文章,它描述了一种类似的技术。更好的是,他们建议在程序中的同一点喝酒(请注意,我在阅读文章之前写了我的答案)!
Percona's pt-online-schema-change
Percona 的 pt-online-schema-change
The articleI linked above talks about a tool, pt-online-schema-change, that works as follows:
我上面链接的文章讨论了一个工具pt-online-schema-change,它的工作原理如下:
- Create new table with same structure as original.
- Update schema on new table.
- Add a trigger on the original table so that changes are kept in-sync with the copy
- Copy rows in batches from original table.
- Move original table out of the way and replace with new table.
- Drop old table.
- 创建与原始结构相同的新表。
- 在新表上更新架构。
- 在原始表上添加触发器,以便更改与副本保持同步
- 从原始表中批量复制行。
- 把原来的桌子移开,换上新桌子。
- 放下旧桌子。
I've never tried the tool myself. YMMV
我自己从未尝试过该工具。青年会
RDS
RDS
I'm currently using MySQL through Amazon's RDS. It's a really nifty service that wraps up and manages MySQL, letting you add new read replicas with a single button and transparently upgrade the database across hardware SKU's. It's really convenient. You don't get SUPER access to the database, so you can't screw with replication directly (is this a blessing or curse?). However, you can use Read Replica Promotionto make your schema changes on a read-only slave, then promote that slave to become your new master. Exactly the same trick as I described above, just vastly easier to execute. They still don't do much to help you with the cut-over. You have to reconfigure and restart your app.
我目前正在通过Amazon 的 RDS使用 MySQL 。这是一个非常漂亮的服务,它包含和管理 MySQL,让您只需一个按钮即可添加新的只读副本,并跨硬件 SKU 透明地升级数据库。真的很方便。您无法获得对数据库的超级访问权限,因此您无法直接进行复制(这是福还是祸?)。但是,您可以使用只读副本提升在只读从站上更改架构,然后将该从站提升为您的新主站。与我上面描述的技巧完全相同,只是更容易执行。他们仍然没有做太多帮助您进行切换。您必须重新配置并重新启动您的应用程序。
回答by Drew
As this blog postoutlines, the InnoDB ALTER TABLE
mechanism has been completely redesigned for MySQL 5.6.
正如这篇博文所述,InnoDBALTER TABLE
机制已经针对 MySQL 5.6 进行了完全重新设计。
(For an exclusive overview of this topic, the MySQL documentationcan provide an afternoon's worth of reading.)
(有关此主题的独家概述,MySQL 文档可以提供一个下午的阅读时间。)
To add an index to a table without a lockresulting on UPDATE
/ INSERT
, the following statement format can be used:
要将索引添加到表而不导致UPDATE
/锁定INSERT
,可以使用以下语句格式:
ALTER TABLE my_table ADD INDEX my_table__idx (my_column), ALGORITHM=INPLACE, LOCK=NONE;
回答by Eric Saboia
MySQL 5.6 update (feb 2013): You can now perform read and write operations while an index is being created even with InnoDB tables - http://dev.mysql.com/doc/refman/5.6/en/innodb-create-index-overview.html
MySQL 5.6 更新(2013 年 2 月):即使使用 InnoDB 表,您现在也可以在创建索引时执行读写操作 - http://dev.mysql.com/doc/refman/5.6/en/innodb-create-index -overview.html
In MySQL 5.6 and higher, the table remains available for read and write operations while the index is being created or dropped. The CREATE INDEX or DROP INDEX statement only finishes after all transactions that are accessing the table are completed, so that the initial state of the index reflects the most recent contents of the table. Previously, modifying the table while an index is being created or dropped typically resulted in a deadlock that cancelled the INSERT, UPDATE, or DELETE statement on the table.
在 MySQL 5.6 及更高版本中,在创建或删除索引时,该表仍可用于读取和写入操作。CREATE INDEX 或 DROP INDEX 语句仅在访问表的所有事务完成后才完成,以便索引的初始状态反映表的最新内容。以前,在创建或删除索引时修改表通常会导致死锁,从而取消表上的 INSERT、UPDATE 或 DELETE 语句。
and:
和:
In MySQL 5.6, this feature becomes more general: you can read and write to tables while an index is being created, and many more kinds of ALTER TABLE operations can be performed without copying the table, without blocking DML operations, or both. Thus in MySQL 5.6 and higher, we typically refer to this set of features as online DDL rather than Fast Index Creation.
在 MySQL 5.6 中,此功能变得更加通用:您可以在创建索引的同时读取和写入表,并且可以在不复制表、不阻塞 DML 操作或两者的情况下执行更多种类的 ALTER TABLE 操作。因此,在 MySQL 5.6 及更高版本中,我们通常将这组功能称为在线 DDL,而不是快速索引创建。
from http://dev.mysql.com/doc/refman/5.6/en/glossary.html#glos_fast_index_creation
来自http://dev.mysql.com/doc/refman/5.6/en/glossary.html#glos_fast_index_creation
回答by Alex Le
pt-online-schema-change is the way to go if you really want to make sure that the migration will not bring down the site.
pt-online-schema-change 如果您真的想确保迁移不会导致站点瘫痪,那么您可以使用 pt-online-schema-change 。
As I wrote in the above comment, I have several experiences with pt-online-schema-change in production. We have our main table of 20M+ records and a master -> 2 read-only replication slaves. I've done at least a dozens of migrations with pt-online-schema-change from adding a new column, changing charset, to adding several indices. We serve tons of traffic during the migration time as well and we have not had any hiccup. Of course you'd have to test all the scripts very thoroughly before running on production.
正如我在上面的评论中所写的那样,我在生产中使用 pt-online-schema-change 有一些经验。我们有 2000 万条记录的主表和一个 master -> 2 个只读复制从属。我已经使用 pt-online-schema-change 进行了至少数十次迁移,从添加新列、更改字符集到添加多个索引。我们在迁移期间也提供大量流量,而且我们没有遇到任何问题。当然,在生产环境中运行之前,您必须非常彻底地测试所有脚本。
I tried to batch up the changes into 1 script so that pt-online-schema-change only have to copy the data once. And be very careful with changing column name since you will loose your data. However, adding an index should be fine.
我尝试将更改批量合并到 1 个脚本中,以便 pt-online-schema-change 只需复制数据一次。更改列名时要非常小心,因为您会丢失数据。但是,添加索引应该没问题。