Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use or share it, but you must do so under the same license and attribute it to the original authors (not me). Original question: http://stackoverflow.com/questions/14675147/

Date: 2020-08-25 07:46:38  Source: igfitidea

Why does TRANSACTION / COMMIT improve performance so much with PHP/MySQL (InnoDB)?

Tags: php, mysql, csv, transactions, innodb

Asked by jjwdesign

I've been working with importing large CSV files of data, usually less than 100,000 records. I'm working with PHP and MySQL (InnoDB tables). I needed to use PHP to transform some fields and do some text processing prior to the MySQL INSERTs (part of process_note_data() in the code below). MySQL's LOAD DATA was not feasible, so please do not suggest it.


I recently tried to improve the speed of this process by wrapping the inserts in a MySQL transaction using START TRANSACTION and COMMIT. The performance increase was surprising: processing time dropped by a factor of 20, so a 20-minute process took only about 1 minute.

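The single-transaction pattern described above can be sketched with PDO (a sketch only: the question's $db wrapper is not shown, so the PDO connection, the customer_notes table, and its columns here are assumptions, not the original schema):

```php
<?php
// Sketch: one transaction around the whole CSV import.
// Assumed: a reachable MySQL server and a hypothetical
// customer_notes(note_type, note_text) table.
$pdo = new PDO('mysql:host=localhost;dbname=crm', 'user', 'pass');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Prepare once; execute per row (avoids re-parsing the statement).
$stmt = $pdo->prepare(
    'INSERT INTO customer_notes (note_type, note_text) VALUES (?, ?)'
);

$pdo->beginTransaction();
try {
    if (($handle = fopen('customernote.csv', 'r')) !== false) {
        fgetcsv($handle); // skip the CSV header row
        while (($data = fgetcsv($handle, 4096)) !== false) {
            $stmt->execute([$data[0], $data[1]]);
        }
        fclose($handle);
    }
    $pdo->commit();   // one log flush for the whole batch
} catch (Exception $e) {
    $pdo->rollBack(); // undo every inserted row on failure
    throw $e;
}
```

Pairing beginTransaction()/commit() with a rollBack() in the catch block also means a half-imported file never leaves partial data behind.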

QUESTIONS.


1.) Does anyone understand why there was such a performance increase (20 minutes down to 1 minute)?


2.) Should I be concerned about how big the transaction may get with 100,000 records?


3.) Should I be concerned with a large number of inserts and/or updates in the transaction?


/*
 * Customer Notes Data:
 * Rows are either a meeting, call or note!
 */
$row = 1;
$data = array();
$fields = array();
$line = '';

$db->query('SET autocommit=0;');
$db->query('START TRANSACTION;');

if (($handle = fopen("modules/".$currentModule."/Data/customernote.csv", "r")) !== FALSE) {
  while (($data = fgetcsv($handle, 4096, ',', '"')) !== FALSE && $row < 999000) {
    //Row 1 - CSV header row with field names
    if ($row == 1) {
      $csv_fields = $data;
    } elseif ($row > 1) {
      $fields = $this->process_note_data($data, $csv_fields, $row);
    }
    $row++;
  } // end while
  fclose($handle);
}

$db->query('COMMIT;');
$db->query('SET autocommit=1;');

Note: The text/field processing is done in the call to $this->process_note_data(), which then calls another helper class that contains the INSERT statement code. I didn't have enough room to include all of the code. $db->query() is a typical database object method for MySQL queries.


Accepted answer by MiGro

  1. Please check this link:

    https://dev.mysql.com/doc/refman/5.5/en/optimizing-innodb-transaction-management.html

    InnoDB must flush the log to disk at each transaction commit if that transaction made modifications to the database. When each change is followed by a commit (as with the default autocommit setting), the I/O throughput of the storage device puts a cap on the number of potential operations per second.

  2. Big transactions may affect performance during commit (check above)

  3. Only in the case of a rollback; however, it can be optimized with some settings (see the link)

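The page linked in the answer also describes innodb_flush_log_at_trx_commit, which controls the per-commit log flush it blames for the slowdown. A hedged sketch of inspecting and relaxing it (connection details are assumptions; changing the global value needs a privileged account, and value 2 can lose up to about a second of committed transactions on an OS crash):

```php
<?php
// Sketch: inspect and optionally relax InnoDB's per-commit log
// flush. NOT a recommendation for ACID-critical data.
$pdo = new PDO('mysql:host=localhost;dbname=crm', 'user', 'pass');

$current = $pdo->query(
    'SELECT @@innodb_flush_log_at_trx_commit'
)->fetchColumn();

// 1 (default): flush the log to disk at every commit (ACID-safe)
// 2: write at commit, flush roughly once per second
// 0: write and flush roughly once per second (least safe)
$pdo->exec('SET GLOBAL innodb_flush_log_at_trx_commit = 2');
```

For a one-off import, batching into a single transaction (as the question did) usually gets most of the same benefit without weakening durability.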

Answered by MrCalvin

My own little test in .NET (4 fields per record):


INSERT 1 record, no transaction: 60 ms

INSERT 1 record, using a transaction: 158 ms

INSERT 200 records using transactions, commit after each record: 17778 ms

INSERT 200 records using no transactions: 4940 ms

INSERT 200 records using transactions, only commit after last record: 4552 ms

INSERT 1000 records using transactions, only commit after last record: 21795 ms


Client in Denmark, server in Belgium (Google cloud f1-micro).

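The gap between per-record commits and a single final commit in these figures is consistent with one client-server round-trip plus a log flush per COMMIT. A rough check from the numbers above (my own sketch, not part of the original answer):

```php
<?php
// Back-of-envelope check using the figures quoted above.
$perRecordCommits = 17778; // ms: 200 records, commit after each
$singleCommit     = 4552;  // ms: 200 records, one final commit
$records          = 200;

// Cost attributable to each of the 199 additional commits.
$extra = ($perRecordCommits - $singleCommit) / ($records - 1);
printf("~%.0f ms per extra commit\n", $extra); // ~66 ms
```

Roughly 66 ms per extra commit is plausible for a Denmark-to-Belgium round-trip plus a disk flush, which is why batching helps so dramatically on a remote server.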

I meant to put this in a comment, but the formatting is not good... so here is my apology in advance ;-)
