Using mysqldump to format one insert per line?

Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. If you use or share it, you must do so under the same license and attribute it to the original authors (not me). Original question: http://stackoverflow.com/questions/15750535/

Date: 2020-08-31 17:08:38  Source: igfitidea

Tags: mysql, mysqldump, mysql-workbench, mysql-administrator

Asked by Kendall Bennett

This has been asked a few times, but I cannot find a resolution to my problem. Basically, when I dump a database with extended inserts using mysqldump (the tool built into MySQL Workbench's administration features), I get massively long lines of data. I understand why it does this, as it speeds up inserts by inserting the data as one command (especially on InnoDB), but the formatting makes it REALLY difficult to actually look at the data in a dump file, or to compare two files with a diff tool if you are storing them in version control, etc. In my case I am storing them in version control, as we use the dump files to keep track of our integration test database.

Now I know I can turn off extended inserts, so I will get one insert per line, which works, but any time you do a restore with the dump file it will be slower.

My core problem is that in the OLD tool we used to use (MySQL Administrator), when I dump a file, it does basically the same thing, but it FORMATS the INSERT statement to put one row per line, while still doing bulk inserts. So instead of this:

INSERT INTO `coupon_gv_customer` (`customer_id`,`amount`) VALUES (887,'0.0000'),(191607,'1.0300');

you get this:

INSERT INTO `coupon_gv_customer` (`customer_id`,`amount`) VALUES 
 (887,'0.0000'),
 (191607,'1.0300');

No matter what options I try, there does not seem to be any way of getting a dump like this, which is really the best of both worlds. Yes, it takes a little more space, but in situations where you need a human to read the files, it makes them MUCH more useful.

Am I missing something and there is a way to do this with MySQLDump, or have we all gone backwards and this feature in the old (now deprecated) MySQL Administrator tool is no longer available?

Accepted answer by Todd Blumer

With the default mysqldump format, each record dumped will generate an individual INSERT command in the dump file (i.e., the sql file), each on its own line. This is perfect for source control (e.g., svn, git, etc.) as it makes the diff and delta resolution much finer, and ultimately results in a more efficient source control process. However, for significantly sized tables, executing all those INSERT queries can potentially make restoration from the sql file prohibitively slow.

Using the --extended-insert option fixes the multiple INSERT problem by wrapping all the records into a single INSERT command on a single line in the dumped sql file. However, the source control process becomes very inefficient. The entire table contents is represented on a single line in the sql file, and if a single character changes anywhere in that table, source control will flag the entire line (i.e., the entire table) as the delta between versions. And, for large tables, this negates many of the benefits of using a formal source control system.

So ideally, for efficient database restoration, in the sql file, we want each table to be represented by a single INSERT. For an efficient source control process, in the sql file, we want each record in that INSERT command to reside on its own line.

My solution to this is the following back-up script:

#!/bin/bash

cd my_git_directory/

ARGS="--host=myhostname --user=myusername --password=mypassword --opt --skip-dump-date"
/usr/bin/mysqldump $ARGS --databases mydatabase | sed 's$VALUES ($VALUES\n($g' | sed 's$),($),\n($g' > mydatabase.sql

git fetch origin master
git merge origin/master
git add mydatabase.sql
git commit -m "Daily backup."
git push origin master

The result is a sql file INSERT command format that looks like:

INSERT INTO `mytable` VALUES
(r1c1value, r1c2value, r1c3value),
(r2c1value, r2c2value, r2c3value),
(r3c1value, r3c2value, r3c3value);
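As a quick check, the two sed substitutions can be tried on a single sample line (the table and values below are made up; GNU sed is assumed, since the \n in the replacement is not portable to BSD/macOS sed):

```shell
# Demonstrate the two sed substitutions from the backup script on one
# hypothetical extended-insert line (GNU sed assumed).
line="INSERT INTO \`coupon_gv_customer\` (\`customer_id\`,\`amount\`) VALUES (887,'0.0000'),(191607,'1.0300');"
printf '%s\n' "$line" | sed 's$VALUES ($VALUES\n($g' | sed 's$),($),\n($g' > /tmp/formatted.sql
cat /tmp/formatted.sql
```

The first substitution breaks the line after `VALUES`, the second splits each `),(` record boundary onto its own line, yielding the three-line format shown above.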

Some notes:

  • password on the command line ... I know, not secure, different discussion.
  • --opt: Among other things, turns on the --extended-insert option (i.e., one INSERT per table).
  • --skip-dump-date: mysqldump normally puts a date/time stamp in the sql file when created. This can become annoying in source control when the only delta between versions is that date/time stamp. The OS and source control system will date/time stamp the file and version anyway, so it's not really needed in the sql file.
  • The git commands are not central to the fundamental question (formatting the sql file), but they show how I get my sql file back into source control; something similar can be done with svn. When combining this sql file format with your source control of choice, you will find that when your users update their working copies, they only need to move the deltas (i.e., changed records) across the internet, and they can take advantage of diff utilities to easily see what records in the database have changed.
  • If you're dumping a database that resides on a remote server, if possible, run this script on that server to avoid pushing the entire contents of the database across the network with each dump.
  • If possible, establish a working source control repository for your sql files on the same server you are running this script from; check them into the repository from there. This will also help prevent having to push the entire database across the network with every dump.

Answered by Eric Tan

Try using the following option: --skip-extended-insert

It worked for me.

Answered by Ace.Di

As others have said, using sed to replace "),(" is not safe, as this string can appear as content in the database. There is, however, a way to do this: if your database name is my_database, then run the following:

$ mysqldump -u my_db_user -p -h 127.0.0.1 --skip-extended-insert my_database > my_database.sql
$ sed ':a;N;$!ba;s/)\;\nINSERT INTO `[A-Za-z0-9$_]*` VALUES /),\n/g' my_database.sql > my_database2.sql

You can also use "sed -i" to edit the file in place.

Here is what this code is doing:

  1. --skip-extended-insert will create one INSERT INTO for every row you have.
  2. Now we use sed to clean up the data. Note that a regular search/replace with sed applies to a single line, so we cannot match the "\n" character directly, as sed works one line at a time. That is why we put ":a;N;$!ba;", which basically tells sed to buffer the next line and search across multiple lines.
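As a sketch, here is the same multi-line sed applied to a hypothetical two-row dump (the table name `t` and the values are invented; GNU sed is assumed):

```shell
# Build a tiny stand-in for a --skip-extended-insert dump, then merge the
# per-row INSERTs back into one multi-line INSERT with the multi-line sed.
cat > /tmp/demo_dump.sql <<'EOF'
INSERT INTO `t` VALUES (1,'a');
INSERT INTO `t` VALUES (2,'b');
EOF
sed ':a;N;$!ba;s/)\;\nINSERT INTO `[A-Za-z0-9$_]*` VALUES /),\n/g' /tmp/demo_dump.sql > /tmp/demo_merged.sql
cat /tmp/demo_merged.sql
```

The second and later INSERT headers are folded away, leaving a single INSERT with one row per line.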

Hope this helps

Answered by Cristian Porta

What about storing the dump into a CSV file with mysqldump, using the --tab option like this?

mysqldump --tab=/path/to/serverlocaldir --single-transaction <database> table_a

This produces two files:

  • table_a.sql, which contains only the table create statement; and
  • table_a.txt, which contains the tab-separated data.

RESTORING

You can restore your table via LOAD DATA:

LOAD DATA INFILE '/path/to/serverlocaldir/table_a.txt' 
  INTO TABLE table_a FIELDS TERMINATED BY '\t' ...

LOAD DATA is usually 20 times faster than using INSERT statements.

If you have to restore your data into another table (e.g. for review or testing purposes) you can create a "mirror" table:

CREATE TABLE table_for_test LIKE table_a;

Then load the CSV into the new table:

LOAD DATA INFILE '/path/to/serverlocaldir/table_a.txt' 
  INTO TABLE table_for_test FIELDS TERMINATED BY '\t' ...

COMPARE

A CSV file is simplest for diffing or for looking inside, and also works for non-SQL technical users, who can use common tools like Excel, Access, or the command line (diff, comm, etc.).
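For instance, a one-row change in a tab-separated dump surfaces as a minimal diff (the file names and data here are invented):

```shell
# Two versions of a tab-separated table dump; diff flags only the changed row.
printf '1\tAlice\n2\tBob\n'   > /tmp/table_a_old.txt
printf '1\tAlice\n2\tBobby\n' > /tmp/table_a_new.txt
diff /tmp/table_a_old.txt /tmp/table_a_new.txt || true
```

Only the modified record appears in the output, which is exactly the fine-grained delta the question is after.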

Answered by Mike Lischke

I'm afraid this won't be possible. In the old MySQL Administrator I wrote the code for dumping db objects, which was completely independent of the mysqldump tool and hence offered a number of additional options (like this formatting, or progress feedback). In MySQL Workbench it was decided to use the mysqldump tool instead, which, besides being a step backwards in some regards and producing version problems, has the advantage of always staying up to date with the server.

So the short answer is: formatting is currently not possible with mysqldump.

Answered by Humberto Roldán

Try this:

mysqldump -c -t --add-drop-table=FALSE --skip-extended-insert -uroot -p<Password> databaseName tableName >c:\path\nameDumpFile.sql

Answered by seanf

I found this tool very helpful for dealing with extended inserts: http://blog.lavoie.sl/2014/06/split-mysqldump-extended-inserts.html

It parses the mysqldump output and inserts linebreaks after each record, but still using the faster extended inserts. Unlike a sed script, there shouldn't be any risk of breaking lines in the wrong place if the regex happens to match inside a string.

Answered by Kjeld Flarup

I liked Ace.Di's solution with sed, until I got this error: sed: Couldn't re-allocate memory

Thus I had to write a small PHP script

mysqldump -u my_db_user -p -h 127.0.0.1 --skip-extended-insert my_database | php mysqlconcatinserts.php > db.sql

The PHP script also starts a new INSERT every 10,000 rows, again to avoid memory problems.

mysqlconcatinserts.php:

#!/usr/bin/php
<?php
/* Assumes a mysqldump made with --skip-extended-insert. */
$last = '';
$count = 0;
$maxinserts = 10000;
while ($l = fgets(STDIN)) {
  if (preg_match('/^(INSERT INTO .* VALUES) (.*);/', $l, $s)) {
    if ($last != $s[1] || $count > $maxinserts) {
      if ($last != '') // close the previous INSERT group
        echo ";\n";
      echo "$s[1] ";
      $comma = '';
      $last = $s[1];
      $count = 0;
    }
    echo "$comma$s[2]";
    $comma = ",\n";
    $count++;
  } elseif ($last != '') {
    $last = '';
    echo ";\n";
    echo $l; // pass non-INSERT lines (DDL, comments) through unchanged
  } else {
    echo $l;
  }
}
if ($last != '') // close the final group if the dump ends with an INSERT
  echo ";\n";

Answered by jinzuchi

add

set autocommit=0;

to the first line of your sql script file (and a matching COMMIT; at the end, since autocommit is now off), then import with:

mysql -u<user> -p<password> --default-character-set=utf8 db_name < <path>\xxx.sql

It will be about 10x faster.
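As a sketch, the wrapping could be scripted like this (the file names are stand-ins); note that with autocommit disabled, a final COMMIT; is required or the imported rows are rolled back when the session ends:

```shell
# Prepend SET autocommit=0; and append COMMIT; to a stand-in dump file.
printf '%s\n' "INSERT INTO \`t\` VALUES (1);" > /tmp/demo.sql
{ echo 'SET autocommit=0;'; cat /tmp/demo.sql; echo 'COMMIT;'; } > /tmp/demo_tx.sql
cat /tmp/demo_tx.sql
```

The wrapped file can then be fed to the mysql client exactly as shown above.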
