How to deal with enormous line lengths created by mysqldump

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. If you use or share it, you must do so under the same license and attribute it to the original authors (not me): StackOverflow.
Original question: http://stackoverflow.com/questions/1293529/

Asked by pavium
I'm using mysqldump in a cron job to back up a database with over 2 million rows.
It creates a text file which can be used to restore the database from the command line.
I thought it would be useful to edit the dump before a restore as a quick way of changing values and table or column names - at least until I learn more and become confident about doing it with ALTER and UPDATE.
Editing large text files does not bother me, but I was surprised to find that in a 250 megabyte dump of my database, there were only about 300 lines. Each line was something like 800k characters long.
Is there another way of generating dumps with more control over line length?
Or should I post-process the dump with tools like sed or Perl?
Answered by VoteyDisciple
By default, mysqldump generates only one INSERT command per table, resulting in one (very long) line of inserted data for each table that got dumped. This is essentially because the "batch" inserts are much faster than if it generated a separate INSERT query for every record in every table.
So, it's not that mysqldump has created arbitrarily long lines and you can just impose some other cutoff length. The lines are long for a reason.
If it's really important to get the INSERTs broken down onto multiple lines, you can indicate that with:
mysqldump --extended-insert=FALSE --complete-insert=TRUE ...
Note, however, that restoring tables will take longer in this format.
Answered by superjer
I was browsing the MySQL source code looking for a solution to this problem today. The maximum line length is enforced by the variable opt_net_buffer_length which is supposed to match the MySQL server's buffer size. It is comically large.
But anyway, it's an option, so just do this:
mysqldump --net_buffer_length=5000 ...
The minimum value is 4096.
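If the nightly dump runs from a script rather than a bare crontab line, the flag slots in like any other mysqldump option. Here is a minimal Python sketch; the database name and output path are placeholder assumptions, not anything from the answers above:

import subprocess

# Minimal sketch: 'mydb' and '/backups/mydb.sql' are placeholders.
# --net_buffer_length caps how many bytes mysqldump packs into each
# extended INSERT, so a smaller value means shorter (but more) lines.
with open('/backups/mydb.sql', 'wb') as out:
    subprocess.run(
        ['mysqldump', '--net_buffer_length=16384', 'mydb'],
        stdout=out,
        check=True,
    )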
Answered by StampyCode
I came across an answer on the MySQL forums, which conclusively shows adding '\n' after each INSERT group is not possible using mysqldump alone, without modifying the source:
The extended format cannot be 100% properly parsed based on the comma or parenthesis alone; you would have to count fields. The best solution is to fix mysqldump to line-break on output.
Very minor change: On line 3506, you can see where the row-ending comma is output:
fputc(',',md_result_file); /* Always row break */
Simply insert this line immediately after line 3506:
fputc('\n',md_result_file); /* Lon Binder says wrap that line! */
re-compile and done.
@see http://forums.mysql.com/read.php?28,420002,426110#msg-426110
Thanks Lon B!
(I've included the content from the MySQL forum just in case the forum disappears.)
Answered by Nick Tsai
This flag also works:
mysqldump --skip-extended-insert
It is just the same as --extended-insert=FALSE.
Answered by sebastien
Using a regex to split lines is not enough, you need a parser that will properly understand quotes and escaping characters.
I just wrote a parser since I couldn't find one: http://blog.lavoie.sl/2014/06/split-mysqldump-extended-inserts.html
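To give a sense of what such a parser has to track, here is a minimal Python sketch (my own illustration, not the linked script): it follows quote state, backslash escapes and parenthesis depth so that a "),(" sequence inside a string value is not mistaken for a row boundary. It deliberately ignores edge cases a real dump can contain.

def split_extended_insert(line):
    """Split one extended INSERT statement into single-row INSERTs.

    Minimal sketch: tracks single-quote state, backslash escapes and
    parenthesis depth; real dumps need more care (binary blobs, etc.).
    """
    prefix_end = line.index('VALUES') + len('VALUES')
    prefix = line[:prefix_end]            # e.g. "INSERT INTO `t` VALUES"
    rows, current = [], []
    depth, in_string, escaped = 0, False, False
    for ch in line[prefix_end:]:
        if escaped:                       # character after a backslash is literal
            escaped = False
        elif in_string:
            if ch == '\\':
                escaped = True
            elif ch == "'":
                in_string = False
        elif ch == "'":
            in_string = True
        elif ch == '(':
            depth += 1
        elif ch == ')':
            depth -= 1
            if depth == 0:                # a complete (...) row just closed
                current.append(ch)
                rows.append(''.join(current).strip().lstrip(',').strip())
                current = []
                continue
        current.append(ch)
    return ['%s %s;' % (prefix, row) for row in rows]

Feeding it a line such as INSERT INTO `t` VALUES (1,'a'),(2,'b'); yields two single-row INSERT statements.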
Answered by S.Lott
Post-process the dump file with Python. You may be happier with it than with perl or sed.
If you're running on Linux, you already have it installed. If you're running on Windows, the installer is painless.
Before that, however, learn to use SQL UPDATE and SQL ALTER. You'll be happiest doing things the right way.
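As a concrete, deliberately naive example of that kind of post-processing, the sketch below renames a table throughout a dump before a restore. The file names and table names are placeholders, and a plain byte-level substitution like this assumes the old name never occurs inside your actual row data.

# Minimal sketch: the file names and table names are placeholders.
# Working in binary mode sidesteps character-set questions in the dump.
old_table, new_table = b'customers_old', b'customers'
with open('backup.sql', 'rb') as src, open('backup_renamed.sql', 'wb') as dst:
    for line in src:
        # Rewrites CREATE TABLE / INSERT INTO lines; a naive replace will
        # also hit the name if it ever appears inside quoted row values.
        dst.write(line.replace(b'`' + old_table + b'`', b'`' + new_table + b'`'))

The restore itself is then the usual mysql dbname < backup_renamed.sql.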