How can MySQL insert millions of records faster?
Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. If you use it, you must likewise follow the CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverFlow
Original URL: http://stackoverflow.com/questions/19682414/
How can mysql insert millions records faster?
Asked by Cricket
I wanted to insert millions of records into my database, but it went very slowly, at a speed of about 40,000 records/hour. I don't think my hardware is too slow, because I saw the disk I/O was under 2 MiB/s. I have many tables separated into different .sql files. A single record is also very simple: one record has fewer than 15 columns, and one column has fewer than 30 characters. I did this job under Arch Linux with MySQL 5.3. Do you guys have any ideas? Or is this speed not slow?
Answered by vallentin
It's most likely because you're inserting records like this:
INSERT INTO `table1` (`field1`, `field2`) VALUES ("data1", "data2");
INSERT INTO `table1` (`field1`, `field2`) VALUES ("data1", "data2");
INSERT INTO `table1` (`field1`, `field2`) VALUES ("data1", "data2");
INSERT INTO `table1` (`field1`, `field2`) VALUES ("data1", "data2");
INSERT INTO `table1` (`field1`, `field2`) VALUES ("data1", "data2");
Sending a new query each time you need to INSERT something is bad for performance. Instead, combine those queries into a single query, like this:
INSERT INTO `table1` (`field1`, `field2`) VALUES ("data1", "data2"),
("data1", "data2"),
("data1", "data2"),
("data1", "data2"),
("data1", "data2");
You can also read more about insert speed in the MySQL Docs, which clearly describe the following:
To optimize insert speed, combine many small operations into a single large operation. Ideally, you make a single connection, send the data for many new rows at once, and delay all index updates and consistency checking until the very end.
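For InnoDB tables, one way to act on that advice is to turn off autocommit and the per-row checks around the bulk of the INSERTs and commit only once at the end. Below is a minimal sketch reusing the table1/field names from the example above; it assumes your data is already consistent, since uniqueness and foreign-key checks are skipped during the load.

-- Postpone commits and consistency checks for the duration of the load.
SET autocommit = 0;
SET unique_checks = 0;
SET foreign_key_checks = 0;

INSERT INTO `table1` (`field1`, `field2`) VALUES ("data1", "data2"),
                                                 ("data1", "data2"),
                                                 ("data1", "data2");
-- ... the rest of your multi-row INSERTs ...

-- Flush everything in one commit and restore the defaults.
COMMIT;
SET foreign_key_checks = 1;
SET unique_checks = 1;
SET autocommit = 1;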
Of course, don't combine ALL of them if the amount is HUGE. Say you have 1,000 rows to insert: don't insert them one at a time, but equally, you probably shouldn't try to put all 1,000 rows into a single query. Instead, break them up into smaller batches, as sketched below.
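As a rough sketch of that batching in plain SQL, you could wrap a handful of moderately sized multi-row INSERTs in one transaction; the batch size of a few hundred rows is only an assumption you would tune for your own data, not a fixed rule.

START TRANSACTION;

-- Batch 1: roughly the first few hundred rows in one statement.
INSERT INTO `table1` (`field1`, `field2`) VALUES
    ("data1", "data2"),
    ("data1", "data2");

-- Batch 2: the next few hundred rows, and so on.
INSERT INTO `table1` (`field1`, `field2`) VALUES
    ("data1", "data2"),
    ("data1", "data2");

COMMIT;

Keeping each statement to a moderate size also keeps it well under the server's max_allowed_packet limit, which a single query containing every row might otherwise hit.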
If it's still really slow, then it might just be because your server is slow.
Note that you of course don't need all that whitespace in the combined query; it is there simply to make the answer easier to read.