MySQL 如何处理并发插入?

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/32087233/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-08-31 21:13:39  来源:igfitidea点击:

How does MySQL handle concurrent inserts?

mysql

提问by Kamrul Khan

I know there is one issue in MySQL with concurrent SELECT and INSERT. However, my question is if I open up two connections with MySQL and keep loading data using both of them, does MySQL takes data concurrently or waits for one to finish before loading another?

我知道 MySQL 中存在一个并发 SELECT 和 INSERT 的问题。但是,我的问题是,如果我打开与 MySQL 的两个连接并继续使用它们加载数据,MySQL 是并发获取数据还是等待一个完成后再加载另一个?

I'd like to know how MySQL behaves in both cases. Like when I am trying to load data in the same table or different tables concurrently when opening separate connections.

我想知道 MySQL 在这两种情况下的行为。就像我尝试在打开单独的连接时同时加载同一个表或不同表中的数据一样。

回答by displayName

If you will create a new connection to the database and perform inserts from both the links, then from the database's perspective, it will still be sequential.

如果您将创建一个到数据库的新连接并从两个链接执行插入,那么从数据库的角度来看,它仍然是顺序的

The documentation of Concurrent Insertson the MySQL's documentation page says:

的文档并发插入MySQL的文档页面上说:

If there are multiple INSERT statements, they are queued and performed in sequence, concurrently with the SELECT statements.

如果有多个 INSERT 语句,它们将与 SELECT 语句同时排队并按顺序执行。

Mind that there is no control over the order in which two concurrent insertswill take place. The order in this concurrency is at the mercy of a lot of different factors. To ensure order, by default you will have to sacrifice concurrency.

请注意,无法控制两个并发插入的顺序。这种并发中的顺序受许多不同因素的支配。为了确保顺序,默认情况下您将不得不牺牲并发性。

回答by seahawk

MySQL does support parallel data inserts into the same table.

MySQL 确实支持将数据并行插入到同一个表中。

But approaches for concurrent read/write depends upon storage engine you use.

但是并发读/写的方法取决于您使用的存储引擎。

InnoDB

数据库

MySQL uses row-level locking for InnoDB tables to support simultaneous write access by multiple sessions, making them suitable for multi-user, highly concurrent, and OLTP applications.

MySQL 对 InnoDB 表使用行级锁定以支持多个会话的同时写入访问,使其适用于多用户、高并发和 OLTP 应用程序。

MyISAM

我的ISAM

MySQL uses table-level locking for MyISAM, MEMORY, and MERGE tables, allowing only one session to update those tables at a time, making them more suitable for read-only, read-mostly, or single-user applications

MySQL 对 MyISAM、MEMORY 和 MERGE 表使用表级锁定,一次只允许一个会话更新这些表,使它们更适合只读、多读或单用户应用程序

But, the above mentioned behavior of MyISAM tables can be altered by concurrent_insertsystem variable in order to achieve concurrent write. Kindly refer to this linkfor details.

但是,MyISAM 表的上述行为可以通过concurrent_insert系统变量更改以实现并发写入。详情请参阅此链接

Hence, as a matter of fact, MySQL does support concurrent insert for InnoDB and MyISAM storage engine.

因此,事实上 MySQL 确实支持 InnoDB 和 MyISAM 存储引擎的并发插入。

回答by Anatoly

You ask about Deadlock detection, ACID and particulary MVCC, locking and transactions:

您询问死锁检测、ACID 和特别是 MVCC、锁定和事务:

Deadlock Detection and Rollback

死锁检测和回滚

InnoDB automatically detects transaction deadlocks and rolls back a transaction or transactions to break the deadlock. InnoDB tries to pick small transactions to roll back, where the size of a transaction is determined by the number of rows inserted, updated, or deleted. When InnoDB performs a complete rollback of a transaction, all locks set by the transaction are released. However, if just a single SQL statement is rolled back as a result of an error, some of the locks set by the statement may be preserved. This happens because InnoDB stores row locks in a format such that it cannot know afterward which lock was set by which statement.

InnoDB 自动检测事务死锁并回滚一个或多个事务以打破死锁。InnoDB 尝试选择小事务进行回滚,其中事务的大小由插入、更新或删除的行数决定。当 InnoDB 执行一个事务的完全回滚时,该事务设置的所有锁都会被释放。但是,如果由于错误而仅回滚单个 SQL 语句,则可能会保留该语句设置的某些锁。发生这种情况是因为 InnoDB 以一种格式存储行锁,以至于它之后无法知道哪个语句设置了哪个锁。

https://dev.mysql.com/doc/refman/5.6/en/innodb-deadlock-detection.html

https://dev.mysql.com/doc/refman/5.6/en/innodb-deadlock-detection.html

Locking

锁定

The system of protecting a transaction from seeing or changing data that is being queried or changed by other transactions. The locking strategy must balance reliability and consistency of database operations (the principles of the ACID philosophy) against the performance needed for good concurrency. Fine-tuning the locking strategy often involves choosing an isolation level and ensuring all your database operations are safe and reliable for that isolation level.

保护事务免于查看或更改其他事务正在查询或更改的数据的系统。锁定策略必须在数据库操作的可靠性和一致性(ACID 哲学的原则)与良好并发性所需的性能之间取得平衡。微调锁定策略通常涉及选择隔离级别并确保所有数据库操作对于该隔离级别都是安全可靠的。

http://dev.mysql.com/doc/refman/5.5/en/glossary.html#glos_locking

http://dev.mysql.com/doc/refman/5.5/en/glossary.html#glos_locking

ACID

An acronym standing for atomicity, consistency, isolation, and durability. These properties are all desirable in a database system, and are all closely tied to the notion of a transaction. The transactional features of InnoDB adhere to the ACID principles. Transactions are atomic units of work that can be committed or rolled back. When a transaction makes multiple changes to the database, either all the changes succeed when the transaction is committed, or all the changes are undone when the transaction is rolled back. The database remains in a consistent state at all times -- after each commit or rollback, and while transactions are in progress. If related data is being updated across multiple tables, queries see either all old values or all new values, not a mix of old and new values. Transactions are protected (isolated) from each other while they are in progress; they cannot interfere with each other or see each other's uncommitted data. This isolation is achieved through the locking mechanism. Experienced users can adjust the isolation level, trading off less protection in favor of increased performance and concurrency, when they can be sure that the transactions really do not interfere with each other.

代表原子性、一致性、隔离性和持久性的首字母缩写词。这些属性在数据库系统中都是可取的,并且都与事务的概念密切相关。InnoDB 的事务特性遵循 ACID 原则。事务是可以提交或回滚的原子工作单元。当一个事务对数据库进行多次更改时,要么在提交事务时所有更改都成功,要么在回滚事务时撤消所有更改。数据库始终保持一致的状态——在每次提交或回滚之后,以及在事务进行时。如果跨多个表更新相关数据,查询会看到所有旧值或所有新值,而不是旧值和新值的混合。交易在进行中时相互保护(隔离);他们不能相互干扰或看到彼此未提交的数据。这种隔离是通过锁定机制实现的。有经验的用户可以调整隔离级别,在他们可以确定事务确实不会相互干扰的情况下,用较少的保护来换取增加的性能和并发性。

http://dev.mysql.com/doc/refman/5.5/en/glossary.html#glos_acid

http://dev.mysql.com/doc/refman/5.5/en/glossary.html#glos_acid

MVCC

MVCC

InnoDB is a multiversion concurrency control (MVCC) storage engine which means many versions of the single row can exist at the same time. In fact there can be a huge amount of such row versions. Depending on the isolation mode you have chosen, InnoDB might have to keep all row versions going back to the earliest active read view, but at the very least it will have to keep all versions going back to the start of SELECT query which is currently running

InnoDB 是一个多版本并发控制 (MVCC) 存储引擎,这意味着单行的多个版本可以同时存在。事实上,可能有大量这样的行版本。根据您选择的隔离模式,InnoDB 可能必须让所有行版本回到最早的活动读取视图,但至少它必须让所有版本回到当前正在运行的 SELECT 查询的开头

https://www.percona.com/blog/2014/12/17/innodbs-multi-versioning-handling-can-be-achilles-heel/

https://www.percona.com/blog/2014/12/17/innodbs-multi-versioning-handling-can-be-achilles-heel/

回答by Rick James

It depends.

这取决于。

It depends on the client -- some clients allow concurrent access; some will serialize access, thereby losing the expected gain. You have not even specified PHP vs Java vs ... or Apache vs ... or Windows vs ... Many combinations simply do not provide any parallelism.

这取决于客户端——有些客户端允许并发访问;有些会序列化访问,从而失去预期的收益。您甚至没有指定 PHP vs Java vs ... 或 Apache vs ... 或 Windows vs ... 许多组合根本不提供任何并行性。

If different tables, there is only general contention for I/O, CPU, Mutexes on the buffer_pool, etc. A reasonable amount of parallelism is possible.

如果不同的表,则只有 I/O、CPU、buffer_pool 上的互斥锁等的一般争用。合理数量的并行性是可能的。

If same table, it depends on the indexes and access patterns. In some cases the threads will block each other. In some cases it will even "deadlock" and rollback one of the transactions. Deadlocks not only slow you down, but make you retry the inserts.

如果是同一张表,则取决于索引和访问模式。在某些情况下,线程会相互阻塞。在某些情况下,它甚至会“死锁”并回滚其中一项事务。死锁不仅会减慢你的速度,还会让你重试插入。

If you looking for high speed ingestion of a lot of rows, see my blog. It lays out techniques, and points out sever of the ramifications, such as replication, Engine choice, multi-threading.

如果您正在寻找大量行的高速摄取,请参阅我的博客。它列出了一些技术,并指出了一些后果,例如复制、引擎选择、多线程。

Multiple threads inserting into the same tables -- It depend a lot on the values you are providing for any PRIMARYor UNIQUEkeys. It depends on whether other actions are taken in the same transaction. It depends on how much I/O is involved. It depends on whether you are doing single-row inserts, or batching. It depends on ... (Sorry to be vague, but your question is not very specific.)

多个线程插入同一个表 - 这在很大程度上取决于您为 anyPRIMARYUNIQUE键提供的值。这取决于在同一事务中是否采取了其他操作。这取决于涉及多少 I/O。这取决于您是进行单行插入还是批处理。这取决于......(抱歉含糊不清,但您的问题不是很具体。)

If you would like to present specifics on two or three designs, we can discuss the specifics.

如果您想介绍两个或三个设计的细节,我们可以讨论这些细节。