Note: this page is a translated copy of a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/5518349/

Time: 2020-09-19 05:16:14  Source: igfitidea

Using Git to track mysql schema - some questions

Tags: mysql, git

Asked by MEM

Is this recommended?

Can I ask for some git command examples showing how to track versions of a mysql schema?

Should we use a repository other than the one we normally use in our application root?

Should I use something called a hook?

Update:

1) We navigate to our project root, where the .git database resides.

2) We create a subfolder called hooks.

3) We put something like this inside a file called db-commit:

   #!/bin/sh
   mysqldump -u DBUSER -pDBPASSWORD  DATABASE --no-data=true> SQLVersionControl/vc.sql
   git add SQLVersionControl/vc.sql
   exit 0

Now we can:

4) git commit -m

This commit will include a mysql schema dump that has been run just before the commit.

The source of the above is here: http://edmondscommerce.github.io/git/using-git-to-track-db-schema-changes-with-git-hook.html

If this is an acceptable way of doing it, could someone with patience please comment line by line, in as much detail as possible, on what is happening here:

#!/bin/sh
mysqldump -u DBUSER -pDBPASSWORD  DATABASE --no-data=true> SQLVersionControl/vc.sql
git add SQLVersionControl/vc.sql
exit 0

Thanks a lot.

Accepted answer by mkb

Assuming you already have a git repo, do the following in a shell script or similar:

#!/bin/bash -e
# -e means exit if any command fails
DBHOST=dbhost.yourdomain.com
DBUSER=dbuser
DBPASS=dbpass # do this in a more secure fashion
DBNAME=dbname
GITREPO=/path/to/git/repo
cd "$GITREPO"
mysqldump -h "$DBHOST" -u "$DBUSER" -p"$DBPASS" -d "$DBNAME" > "$GITREPO/schema.sql" # the -d flag means "no data"
git add schema.sql
git commit -m "$DBNAME schema version $(date)" # $(date) expands to the current timestamp
git push # assuming you have a remote to push to

Then run this script daily from a cron job or a similar scheduler.
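The DBPASS comment in the script asks for a more secure approach. One common option, sketched here under assumptions, is to keep the credentials in a private option file and point mysqldump at it with --defaults-extra-file. The file name schema-dump.cnf, the temporary directory, and the dbhost/dbuser/dbpass values are placeholders, not part of the original answer.

```shell
#!/bin/sh
# Sketch: keep MySQL credentials in a private option file instead of the script.
# The file location and the host/user/password values are placeholders.
CNF_DIR=$(mktemp -d)            # in practice, a directory only you can read
CNF_FILE="$CNF_DIR/schema-dump.cnf"

umask 077                       # new files are readable by the owner only
cat > "$CNF_FILE" <<'EOF'
[client]
host=dbhost.yourdomain.com
user=dbuser
password=dbpass
EOF
chmod 600 "$CNF_FILE"

# The dump would then read the credentials from the file;
# --defaults-extra-file has to be the first option on the command line:
#   mysqldump --defaults-extra-file="$CNF_FILE" -d dbname > schema.sql
echo "wrote $CNF_FILE"
```

This keeps the password out of the script and out of the process list, at the cost of one more file to protect.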

EDIT: By placing a script in $gitdir/hooks/pre-commit (the name is important), the script will be executed before every commit. This way the state of the DB schema is captured for each commit, which makes sense. If you instead automatically ran the resulting sql script every time you commit, you would blow away your database, which does not make sense.

#!/bin/sh

This line specifies that it's a shell script.

mysqldump -u DBUSER -pDBPASSWORD  DATABASE --no-data=true> SQLVersionControl/vc.sql

This is the same as in my answer above: it takes only the DDL from the database and stores it in a file.

git add SQLVersionControl/vc.sql

This adds the SQL file to every commit made to your repository.

exit 0

This exits the script with success. This is possibly dangerous: the hook always reports success, so the commit goes ahead even when an earlier command failed. If mysqldump or git add fails, you may blow away something you wanted to keep.
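As a rough sketch of the hook placement described in the EDIT, the following installs the question's script as .git/hooks/pre-commit in a scratch repository, with set -e in place of the final exit 0 so that a failed dump aborts the commit. The scratch repository and the mkdir line are assumptions for illustration; DBUSER, DBPASSWORD and DATABASE are the placeholders from the question.

```shell
#!/bin/sh
# Sketch: install a schema-dumping pre-commit hook into a scratch repository.
REPO=$(mktemp -d)
git init -q "$REPO"

HOOK="$REPO/.git/hooks/pre-commit"   # Git only runs the hook under this exact name
mkdir -p "$REPO/.git/hooks"
cat > "$HOOK" <<'EOF'
#!/bin/sh
# Abort the commit if the dump or the add fails.
set -e
mkdir -p SQLVersionControl
mysqldump -u DBUSER -pDBPASSWORD DATABASE --no-data=true > SQLVersionControl/vc.sql
git add SQLVersionControl/vc.sql
EOF
chmod +x "$HOOK"                     # hooks that are not executable are ignored
echo "installed $HOOK"
```

Note that the hooks directory lives inside .git, so the hook itself is not versioned or shared with other clones.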

Answer by Chris Eberle

If you're just tracking the schema, put all of the CREATE statements into one .sql file, and add the file to git.

$> mkdir myschema && cd myschema
$> git init
$> echo "CREATE TABLE ..." > schema.sql
$> git add schema.sql
$> git commit -m "Initial import"

Answer by vanto

IMO the best approach is described here: http://viget.com/extend/backup-your-database-in-git. For your convenience I repeat the most important pieces here.

The trick is to use mysqldump --skip-extended-insert, which creates dumps that git can track and diff more easily.
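To illustrate why one INSERT per row diffs better, here is a small sketch that compares the two layouts using plain diff. The table t and its rows are hypothetical stand-ins for two consecutive dumps, not real mysqldump output.

```shell
#!/bin/sh
# Sketch with hypothetical rows standing in for two consecutive dumps.
DIFF_DIR=$(mktemp -d)

# Default (extended) inserts: every row of the table sits on one line.
printf '%s\n' "INSERT INTO t VALUES (1,'a'),(2,'b'),(3,'c');" > "$DIFF_DIR/ext_old.sql"
printf '%s\n' "INSERT INTO t VALUES (1,'a'),(2,'B'),(3,'c');" > "$DIFF_DIR/ext_new.sql"

# --skip-extended-insert: one INSERT statement per row.
printf '%s\n' "INSERT INTO t VALUES (1,'a');" \
              "INSERT INTO t VALUES (2,'b');" \
              "INSERT INTO t VALUES (3,'c');" > "$DIFF_DIR/row_old.sql"
printf '%s\n' "INSERT INTO t VALUES (1,'a');" \
              "INSERT INTO t VALUES (2,'B');" \
              "INSERT INTO t VALUES (3,'c');" > "$DIFF_DIR/row_new.sql"

# diff exits with status 1 when the files differ, which is expected here.
ext_diff=$(diff "$DIFF_DIR/ext_old.sql" "$DIFF_DIR/ext_new.sql" || true)
row_diff=$(diff "$DIFF_DIR/row_old.sql" "$DIFF_DIR/row_new.sql" || true)

printf 'extended insert diff:\n%s\n\n' "$ext_diff"
printf 'one row per line diff:\n%s\n' "$row_diff"
```

With extended inserts the whole table line shows up as changed, while the per-row layout isolates the single row that differs.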

There are also some hints regarding the best repository configuration in order to reduce disk size. Copied from here:

  • core.compression = 9: Flag for gzip to specify the compression level for blobs and packs. Level 1 is fast with larger file sizes, level 9 takes more time but results in better compression.
  • repack.usedeltabaseoffset = true: Defaults to false for compatibility reasons, but is supported with Git >=1.4.4.
  • pack.windowMemory = 100m: (Re)packing objects may consume lots of memory. To prevent all your resources from going down the drain, it's useful to put some limits on that. There is also pack.deltaCacheSize.
  • pack.window = 15: Defaults to 10. With a higher value, Git tries harder to find similar blobs.
  • gc.auto = 1000: Defaults to 6700. As indicated in the article, it is recommended to run git gc every once in a while. Personally I run git gc --auto every day, so it only packs things when there's enough garbage. git gc --auto normally only triggers the packing mechanism when there are 6700 loose objects around. This flag lowers this amount.
  • gc.autopacklimit = 10: Defaults to 50. Every time you run git gc, a new pack is generated from the loose objects. Over time you get too many packs, which wastes space. It is a good idea to combine all packs into a single pack once in a while, so all objects can be combined and deltified. By default git gc does this when there are 50 packs around. But for this situation a lower number may be better.
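The settings listed above can be applied with git config. The following is a minimal sketch against a scratch repository; the temporary directory and the cfg helper are assumptions for illustration, and the values are simply the ones quoted in the list.

```shell
#!/bin/sh
# Sketch: apply the quoted settings to a scratch repository.
BACKUP_REPO=$(mktemp -d)
git init -q "$BACKUP_REPO"

# Hypothetical helper: run "git config" inside the scratch repository.
cfg() { git -C "$BACKUP_REPO" config "$@"; }

cfg core.compression 9
cfg repack.usedeltabaseoffset true
cfg pack.windowMemory 100m
cfg pack.window 15
cfg gc.auto 1000
cfg gc.autopacklimit 10

cfg --list --local    # shows the repository-level settings just written
```

Writing the values at repository level keeps them from affecting your other repositories.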

Old versions can be pruned via:

git rebase --onto master~8 master~7

(copied from here)

Answer by Tuncay Göncüoğlu

As brilliant as it sounds (the idea did occur to me as well), when I tried to implement it, I hit a wall. In theory, by using the --skip-extended-insert flag, although the initial dump would be big, the diffs between daily dumps should be minimal, hence the growth of the repository over time could be assumed to be minimal as well, right? Wrong!

Git stores snapshots, not diffs, which means that on each commit it will take the entire dump file, not just the diff. Moreover, since a dump with --skip-extended-insert repeats the full INSERT statement on every single row, it will be huge compared to a dump done without --skip-extended-insert. This results in an explosion in size, the exact opposite of what one would expect.

In my case, with a ~300MB sql dump, the repository grew to gigabytes within days. So, what did I do? I first tried the same thing, only without --skip-extended-insert, so that the dumps would be smaller and the snapshots proportionally smaller as well. This approach held for a while, but in time it became unusable as well.
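One way to check how a repository of dumps actually grows is to measure it before and after packing. The sketch below uses a scratch repository and generated stand-in rows; the table t, the row contents, and the helper g are all hypothetical. Real dumps may behave quite differently, so treat the numbers as something to measure rather than assume.

```shell
#!/bin/sh
# Sketch: commit three stand-in "daily dumps" and inspect object storage.
GROWTH_REPO=$(mktemp -d)
git init -q "$GROWTH_REPO"

# Hypothetical helper: run git in the scratch repo with a throwaway identity.
g() { git -C "$GROWTH_REPO" -c user.name=demo -c user.email=demo@example.com "$@"; }

for day in 1 2 3; do
    {
        # 2000 unchanged rows plus one row that differs each day.
        seq 1 2000 | sed "s/.*/INSERT INTO t VALUES (&,'row');/"
        echo "INSERT INTO t VALUES (0,'day $day');"
    } > "$GROWTH_REPO/dump.sql"
    g add dump.sql
    g commit -q -m "dump day $day"
done

g count-objects -v    # loose objects: each commit stored a full compressed copy
g gc -q               # repacking gives Git the chance to delta-compress similar blobs
g count-objects -v    # compare size and size-pack with the figures above
```

Comparing the size and size-pack fields across the two reports shows how much of the growth survives a git gc for your own data.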

Still, using diffs with --skip-extended-insert seemed like a good idea, only now I try to use subversion instead of git. I know that, compared to git, svn is ancient history, yet it seems to work better here, since it actually does use diffs instead of snapshots.

So in short, I believe the best solution is doing the above, but with subversion instead of git.

Answer by Johnny Utahh

The following includes a git pre-commit hook to capture the mysql database/schema, given user='myuser', password='mypassword', database_name='dbase1'. It properly bubbles errors up to git (the exit 0's in other answers could be dangerous and may not handle error scenarios properly). Optionally, you can add a database import to a post-checkout hook (when capturing all the data, not just the schema), but take care given your database size. Details are in the bash-script comments below.

pre-commit hook:

#!/bin/bash

# exit upon error
set -e
# another way to set "exit upon error", for readability
set -o errexit

mysqldump -umyuser -pmypassword dbase1 --no-data=true > dbase1.sql

# Uncomment following line to dump all data with schema,
# useful when used in tandem for the post-checkout hook below.
# WARNING: can greatly expand your git repo when employing for
# large databases, so carefully evaluate before employing this method.
# mysqldump -umyuser -pmypassword dbase1 > dbase1.sql

git add dbase1.sql

(optional) post-checkout hook:

#!/bin/bash
# mysqldump (above) is presumably run without '--no-data=true' parameter.
set -e
mysql -umyuser -pmypassword dbase1 < dbase1.sql

Versions of apps, OS I'm running:

root@node1 Dec 12 22:35:14 /var/www# mysql --version
mysql  Ver 14.14 Distrib 5.1.54, for debian-linux-gnu (x86_64) using readline 6.2
root@node1 Dec 12 22:35:19 /var/www# git --version
git version 1.7.4.1
root@node1 Dec 12 22:35:22 /var/www# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 11.04
Release:        11.04
Codename:       natty
root@node1 Dec 12 22:35:28 /var/www#

Answer by apesa

While I am not using Git, I have used source control for over 15 years. A best practice to adhere to when deciding where and how to store your source and accompanying resources in source control: if the DB schema is used within the project, then you should be versioning the schema and all other project resources in "that" project. If you develop a set of schemas or programming resources that you reuse in other projects, then you should have a separate repository for those reusable resources. That separate reusable-resources project will be versioned on its own and will track the versions of the actual reusable resources in that repository.

If you use a versioned resource out of the reusable repository in a different project, then you have the following scenario (just an example): Project XYZ version 1.0 is now using DBSchema_ABC version 4.0. In this case you will understand that you have used a specific version of a reusable resource, and since it is versioned you will be able to track its use throughout your project. If you get a bug report on DBSchema_ABC, you will be able to fix the schema and re-version it, as well as understand where else DBSchema_ABC is used and where you may have to make some changes. From there you will also understand which projects contain which versions of which reusable resources... You just have to understand how to track your resources.

Adopting this type of development environment and resource management strategy is key to releasing usable software and managing a break/fix enhancement environment. Even if you're developing for your own edification on your own time, you should be using source control, as you are.

As for Git, I would look for a GUI front end or a dev environment integration if I could. Git is pretty big, so I expect it has plenty of front end support.

Answer by Arnold Daniels

(shameless plug)

The dbvc command line tool allows you to manage your database schema updates in your repository.

It creates and uses a table _dbvc in the database which holds a list of the updates that have been run. You can easily run the updates that haven't been applied to your database schema yet.

The tool uses git to determine the correct order of executing the updates.

DBVC usage

Show a list of commands

dbvc help

Show help on a specific command

dbvc help init

Initialise DBVC for an existing database.

dbvc init

Create a database dump. This is used to create the DB on a new environment.

mysqldump foobar > dev/schema.php

Create the DB using the schema.

dbvc create

Add an update file. These are used to update the DB on other environments.

echo 'ALTER TABLE `foo` ADD COLUMN `status` BOOL DEFAULT 1;' > dev/updates/add-status-to-foo.sql

Mark an update as already run.

dbvc mark add-status-to-foo

Show a list of updates that need to be run.

dbvc status

Show all updates with their status.

dbvc status --all

Update the database.

dbvc update

Answer by phil294

I have found the following options to be mandatory for a version control / git-compatible mysqldump.

mysqldump --skip-opt --skip-comments |sed -e 's/DEFINER[ ]*=[ ]*[^*]*\*/\*/'

(and maybe --no-data)

--skip-opt is very useful: it takes away all of --add-drop-table --add-locks --create-options --disable-keys --extended-insert --lock-tables --quick --set-charset. The DEFINER sed filter is necessary when the database contains triggers.
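To see what the sed filter does, here is a small sketch that runs it over a single line of trigger DDL. The sample line (trigger t_bi on table t, definer root@localhost) and the function name strip_definer are made up for illustration; real mysqldump output may be laid out differently.

```shell
#!/bin/sh
# Sketch: the DEFINER filter applied to one hypothetical line of trigger DDL.
strip_definer() {
    sed -e 's/DEFINER[ ]*=[ ]*[^*]*\*/\*/'
}

sample='/*!50003 CREATE*/ /*!50017 DEFINER=`root`@`localhost`*/ /*!50003 TRIGGER t_bi BEFORE INSERT ON t FOR EACH ROW SET NEW.x = 1 */'
cleaned=$(printf '%s\n' "$sample" | strip_definer)
printf '%s\n' "$cleaned"
```

The filter removes everything from DEFINER up to the next asterisk, so dumps taken under different accounts no longer differ on that clause.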