Notice: this page reproduces a popular StackOverflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license, cite the original URL, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/2463602/

Date: 2020-08-31 15:33:11  Source: igfitidea

MySQL load data infile - acceleration?

Tags: mysql, performance, indexing, load-data-infile

Asked by DBa

Sometimes I have to re-import data for a project, which means reading about 3.6 million rows into a MySQL table (currently InnoDB, though I am not tied to this engine). "LOAD DATA INFILE ..." has proved to be the fastest solution, but it comes with a tradeoff:

- When importing without keys, the import itself takes about 45 seconds, but the key creation takes ages (it has already been running for 20 minutes...).
- Importing with the keys already on the table makes the import much slower.

There are keys over 3 fields of the table, referencing numeric fields. Is there any way to accelerate this?

Another issue is: when I terminate the process which has started a slow query, it continues running on the database. Is there any way to terminate the query without restarting mysqld?

Thanks a lot DBa

Answered by Jon Black

If you're using InnoDB and bulk loading, here are a few tips:

Sort your CSV file into the primary key order of the target table. Remember that InnoDB uses clustered primary keys, so the file will load faster if it is sorted!
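
If the data happens to be exported from another MySQL table, one way to get a file in primary key order is to sort at export time. This is only a sketch under assumptions: `source_table`, its primary key column `id`, and the output path are hypothetical, and the server must be allowed to write to that location (see `secure_file_priv`).

```sql
-- Hypothetical names: source_table, id and the file path are placeholders.
-- Rows are written in primary key order, so a later LOAD DATA INFILE
-- appends to the clustered index instead of inserting at random positions.
SELECT *
FROM source_table
ORDER BY id
INTO OUTFILE '/tmp/source_table_sorted.csv'
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n';
```

For a file that comes from somewhere else, sorting it with an external tool before loading serves the same purpose.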

A typical LOAD DATA INFILE sequence I use:

truncate <table>;

set autocommit = 0;

load data infile <path> into table <table>...

commit;

Other optimisations you can use to reduce load times:

set unique_checks = 0;
set foreign_key_checks = 0;
set sql_log_bin=0;
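
Put together with the load sequence from earlier in this answer, the session could look roughly like the sketch below. The table name `my_table`, the file path, and the field and line terminators are assumptions for illustration, not part of the original answer. With the checks switched off, MySQL does not verify uniqueness or foreign keys during the load, so the file must already be clean.

```sql
-- Placeholders: my_table and the file path are hypothetical.
SET unique_checks = 0;        -- skip uniqueness checks on secondary indexes
SET foreign_key_checks = 0;   -- skip foreign key validation
SET sql_log_bin = 0;          -- no binary logging for this session (needs a privileged account)
SET autocommit = 0;           -- load inside one explicit transaction

LOAD DATA INFILE '/tmp/my_table.csv'
INTO TABLE my_table
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n';

COMMIT;

-- Restore the session defaults once the load has been committed.
SET autocommit = 1;
SET unique_checks = 1;
SET foreign_key_checks = 1;
SET sql_log_bin = 1;
```

All four settings are session-scoped here, so other connections are not affected.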

Split the CSV file into smaller chunks.
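
As a rough sketch of what loading in chunks might look like, each chunk can be committed on its own so that no single transaction has to cover the whole file. The chunk file names and `my_table` are hypothetical, and the files are assumed to have been produced beforehand by an external splitting tool.

```sql
-- Hypothetical chunk files and table name; one transaction per chunk.
SET autocommit = 0;

LOAD DATA INFILE '/tmp/chunk_01.csv'
INTO TABLE my_table
FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n';
COMMIT;

LOAD DATA INFILE '/tmp/chunk_02.csv'
INTO TABLE my_table
FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n';
COMMIT;

SET autocommit = 1;
```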

Typical import stats I have observed during bulk loads:

3.5 - 6.5 million rows imported per min
210 - 400 million rows per hour

Answered by Ike Walker

This blog post is almost 3 years old, but it's still relevant and has some good suggestions for optimizing the performance of "LOAD DATA INFILE":

http://www.mysqlperformanceblog.com/2007/05/24/predicting-how-long-data-load-would-take/

Answered by user2163461

InnoDB is a pretty good engine. However, it relies heavily on being tuned. For one thing, if your inserts are not in order of increasing primary key, InnoDB can take a bit longer than MyISAM. This can easily be overcome by setting a higher innodb_buffer_pool_size. My suggestion is to set it to 60-70% of your total RAM on a dedicated MySQL machine.
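
As an illustration of the sizing suggestion, the sketch below assumes a dedicated server with 16 GiB of RAM, which is not stated in the answer. 10 GiB is then about 62% of memory, inside the suggested range.

```sql
-- Check the current buffer pool size (in bytes).
SELECT @@innodb_buffer_pool_size;

-- Assumption: dedicated MySQL server with 16 GiB RAM.
-- 10 GiB = 10 * 1024 * 1024 * 1024 = 10737418240 bytes.
-- Resizing at runtime works in MySQL 5.7.5 and later; on older versions the
-- value has to be set in the server configuration file, followed by a restart.
SET GLOBAL innodb_buffer_pool_size = 10737418240;
```

A value set this way is lost on restart unless it is also written to the configuration file.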
