MySQL:永久获得“等待表元数据锁定”

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/19801139/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-08-31 19:20:26  来源:igfitidea点击:

MySQL : Permanently getting " Waiting for table metadata lock"

mysqlprocessblocking

提问by toom

My MySQL database serves three webapps as the storage backend. However I recently encounter permanantly the error "Waiting for table metadata lock". It happen nearly all the time and I do not understand why.

我的 MySQL 数据库提供三个 webapp 作为存储后端。但是我最近经常遇到错误“等待表元数据锁定”。它几乎一直发生,我不明白为什么。

mysql> show processlist
    -> ;
+------+-----------+-----------------+------------+---------+------+---------------------------------+------------------------------------------------------------------------------------------------------+
| Id   | User      | Host            | db         | Command | Time | State                           | Info                                                                                                 |
+------+-----------+-----------------+------------+---------+------+---------------------------------+------------------------------------------------------------------------------------------------------+
|   36 | root      | localhost:33444 | bookmaker2 | Sleep   |  139 |                                 | NULL                                                                                                 |
|   37 | root      | localhost:33445 | bookmaker2 | Sleep   |  139 |                                 | NULL                                                                                                 |
|   38 | root      | localhost:33446 | bookmaker2 | Sleep   |  139 |                                 | NULL                                                                                                 |
|   39 | root      | localhost:33447 | bookmaker2 | Sleep   |   49 |                                 | NULL                                                                                                 |
|   40 | root      | localhost:33448 | bookmaker2 | Sleep   |  139 |                                 | NULL                                                                                                 |
| 1315 | bookmaker | localhost:34869 | bookmaker  | Sleep   |   58 |                                 | NULL                                                                                                 |
| 1316 | root      | localhost:34874 | bookmaker3 | Sleep   |   56 |                                 | NULL                                                                                                 |
| 1395 | bookmaker | localhost:34953 | bookmaker  | Sleep   |   58 |                                 | NULL                                                                                                 |
| 1396 | root      | localhost:34954 | bookmaker3 | Sleep   |   46 |                                 | NULL                                                                                                 |
| 1398 | root      | localhost:34956 | bookmaker3 | Query   |   28 | Waiting for table metadata lock | CREATE TABLE IF NOT EXISTS LogEntries  ( 
                    lid         INT NOT NULL AUTO_INCREMEN |
| 1399 | root      | localhost       | NULL       | Query   |    0 | NULL                            | show processlist                                                                                     |
+------+-----------+-----------------+------------+---------+------+---------------------------------+------------------------------------------------------------------------------------------------------+

Of course one can kill the corresponding process. However if I restart my program that tries to create the table structure of my database "bookmaker3" the newly created process ends up again in a metalock.

当然可以杀掉相应的进程。但是,如果我重新启动试图创建我的数据库“bookmaker3”的表结构的程序,新创建的进程将再次以 Metalock 结束。

I even cannot drop the database:

我什至无法删除数据库:

mysql> drop database bookmaker3;

This yields also a metalock.

这也产生了金属。

How can this be repaired?

如何修复?

回答by michalczukm

Kill the connection with lock

用锁杀死连接

Kill 1398

Then check if you have autocommit set to 0 by

然后检查您是否已通过以下方式将自动提交设置为 0

select @@autocommit;

If yes, you propably forgot to commit transaction. Then another connection want to do something with this table, which causes the lock.

如果是,您可能忘记提交事务。然后另一个连接想要对这个表做一些事情,从而导致锁定。

In your case: If you made some query to LogEntries(whitch exists) and did not commit it, then you try to execute CREATE TABLE IF NOT EXISTS from another connection - metadata lock happens.

在您的情况下:如果您对LogEntries进行了一些查询(其中存在)并且没有提交它,那么您尝试从另一个连接执行 CREATE TABLE IF NOT EXISTS - 元数据锁定发生。

editFor me the bug is somewhere at your application. Check there, or set autocommit to 1if your not using transactions in application.

编辑对我来说,错误在您的应用程序中。检查那里,或者如果您不在应用程序中使用事务,则将自动提交设置为 1

ps also check this posts:

ps还检查这个帖子:

回答by Alexander V. Ilyin

In case if you have HS plugin and trying to CREATEor ALTERtable which was already attempted to assess via HS you will face with similar problem and you have to restart HS plugin in this way to release table metadata lock:

如果您有 HS 插件并尝试CREATEALTER已尝试通过 HS 评估的表,您将面临类似的问题,您必须以这种方式重新启动 HS 插件以释放表元数据锁:

UNINSTALL PLUGIN HANDLERSOCKET;
INSTALL PLUGIN HANDLERSOCKET SONAME 'handlersocket.so';

回答by LSerni

The accepted solution is, unfortunately, wrong. It is right as far as it says,

不幸的是,已接受的解决方案是错误的。正如它所说的那样,它是正确的,

Kill the connection with lock

用锁杀死连接

This is indeed what to do. But then it suggests,

这确实是应该做的。但它表明,

Kill 1398
Kill 1398

...and 1398 is notthe connection with the lock. How could it be? It is the connection waiting for the lock. This means it does not yet havethe lock! Killing it avails nothing (it actually might solve part of anotherproblem, but only temporarily).

...和1398不是与锁的连接。怎么会这样?它是等待锁的连接。这意味着它还没有锁定!杀死它无济于事(它实际上可能会解决另一个问题的一部分,但只是暂时的)。

The real cause is that another process is holding the lock, and more importantly, SHOW FULL PROCESSLISTwill not tell you which it is.

真正的原因是另一个进程持有锁,更重要的是,SHOW FULL PROCESSLIST不会告诉你它是哪个。

It WILLtell you if the process is doingsomething, yes. Usually it works. Here, the process holding the lock is doing nothing, and hides among other threads also doing nothing.

WILL告诉你,如果过程是什么的,是的。通常它有效。在这里,持有锁的进程什么都不做,隐藏在其他线程中也什么都不做。

Quick and dirty solution, not really recommended

快速而肮脏的解决方案,不推荐

Kill allprocesses in "Sleep" state, on the same database, that are older than the oldestthread in "waiting for metadata lock" state. This is what Arnaud Amaurywould have done:

杀死同一数据库上处于“睡眠”状态的所有进程,这些进程早于处于“等待元数据锁定”状态的最旧线程。这就是Arnaud Amaury会做的:

  • for each database that has at least one thread in WFL:
    • the oldest connection in WFL on that DB turns out to be Z seconds old
    • ALL the "Sleep" threads on that DB and older than Z must go.
  • 对于 WFL 中至少有一个线程的每个数据库:
    • 该数据库上 WFL 中最旧的连接原来是 Z 秒
    • 该数据库上的所有“睡眠”线程和比 Z 更旧的线程都必须运行。

More focused fix

更集中的修复

Run SHOW ENGINE INNODB STATUSand look at the "TRANSACTION" section. You will find, among others, something like

运行SHOW ENGINE INNODB STATUS并查看“交易”部分。你会发现,除其他外,像

TRANSACTION 1701, ACTIVE 58 sec;2 lock struct(s), heap size 376, 1 row lock(s), undo log entries 1
MySQL thread id 1396, OS thread handle 0x7fd06d675700, query id 1138 hostname 1.2.3.4 whatever;

Now you check with SHOW FULL PROCESSLISTwhat is thread id 1396 doing. Chances are it is in "Sleep" status. So: an active transaction with an active lock, it has even done some changes as it has an undo log entry... but is currently idle. Thatand no other is the thread you need to kill.

现在您检查SHOW FULL PROCESSLIST线程 ID 1396 在做什么。它可能处于“睡眠”状态。所以:一个带有活动锁的活动事务,它甚至做了一些更改,因为它有一个撤消日志条目......但目前处于空闲状态。就是您需要杀死的线程。

Actually if you are on Linux and have root privileges, there's a way to find out which processhas the connection that requested the lock. This in turn allows to determine (from CPU usage or, at worst, strace -ff -p pid) whether that process is really doing something or not, to help decide if it's safe to kill.

实际上,如果您在 Linux 上并且拥有 root 权限,则有一种方法可以找出哪个进程具有请求锁定的连接。这反过来又允许(从 CPU 使用率,或者最坏的情况下strace -ff -p pid)确定该进程是否真的在做某事,以帮助确定杀死它是否安全。

I see this happening with webapps that use "persistent" or "pooled" MySQL connections, which nowadays usually save very little time: the webapp instance terminated, but the connection did not, so its lock is still alive and blocking everyone else.

我看到使用“持久”或“池化”MySQL 连接的 webapps 会发生这种情况,现在通常节省的时间很少:webapp 实例终止,但连接没有,所以它的锁仍然存在并阻止其他所有人。

Another way to get the culprit if you have a recent MySQL, but not too recentsince this is going to be deprecated, is (you need privileges again on the information schema)

如果您有最近的 MySQL,但不是太近,因为这将被弃用,另一种找到罪魁祸首的方法是(您再次需要对信息模式的特权)

SELECT * FROM INFORMATION_SCHEMA.INNODB_LOCKS 
     WHERE LOCK_TRX_ID IN 
        (SELECT BLOCKING_TRX_ID FROM INFORMATION_SCHEMA.INNODB_LOCK_WAITS);

Actual solution

实际解决方案

The problem is usually caused by this architecture:

问题通常是由这种架构引起的:

webapp (jar, php) --> container (mysqldb, php_module) --> web server --> MySQL

When the webapp dies, or the webapp lightweight thread instance dies, the container does not. And it is the container that keeps the connection open, so the connection does not close. Quite predictably, MySQL does not consider the operation complete.

当 webapp 挂掉,或者 webapp 轻量级线程实例挂掉时,容器不会。并且是容器使连接保持打开状态,因此连接不会关闭。可以预见的是,MySQL 并不认为操作完成。

If the webapp didn't clean after itself (no ROLLBACKor COMMITfor a transaction, no UNLOCK TABLES, etc.), then whatever that webapp started doing is still extant, and might still be blocking everyone else.

如果 webapp 没有自行清理(noROLLBACKCOMMITfor a transaction,noUNLOCK TABLES等),那么 webapp 开始做的任何事情仍然存在,并且可能仍然会阻止其他所有人。

There are then two solutions. The worse one is to lower the idle timeout. But guess what happens if you wait too long between two queries (exactly: "MySQL server has gone away". Butyou might check for thatand reopen the connection, so - for a small performance fee - it's doable).

那么有两种解决方案。更糟糕的是降低空闲超时。但是想如果你等待两个查询之间的时间过长会发生什么(确切地说:“MySQL服务器已经走了。”不过,你可能会检查和重新打开连接,所以-一个小的演出费-这是可行的)。

The better, smarter solution is less straightforward to implement. Endeavour to have the script clean after itself, ensuring to catch all exception and deal with them properly, or, if possible, skip persistent connections altogether. Let each instance create its own connection or use a smart pool driver(in PHP PDO, use PDO::ATTR_PERSISTENTexplicitly set to false). Alternatively (e.g. in PHP) you can have destruct and exception handlers force clean the connection by committing or rolling back transactions and issuing explicit table unlocks.

更好、更智能的解决方案不太容易实施。努力让脚本自己清理干净,确保捕获所有异常并正确处理它们,或者,如果可能,完全跳过持久连接。让每个实例创建自己的连接或使用智能池驱动程序(在 PHP PDO 中,使用PDO::ATTR_PERSISTENT显式设置为false)。或者(例如在 PHP 中),您可以通过提交或回滚事务并发出显式表解锁来让析构和异常处理程序强制清理连接。