How can I put a database under git (version control)?
Notice: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use/share it, but you must follow the same CC BY-SA license and attribute it to the original authors (not me): Stack Overflow
Original source: http://stackoverflow.com/questions/846659/
Asked by hasen
I'm building a web app, and I need to make a branch for some major changes. The thing is, these changes require changes to the database schema, so I'd like to put the entire database under git as well.
How do I do that? Is there a specific folder that I can keep under a git repository? How do I know which one? How can I be sure that I'm adding the right folder?
I need to be sure, because these changes are not backward compatible; I can't afford to screw up.
The database in my case is PostgreSQL.
Edit:
Someone suggested taking backups and putting the backup file under version control instead of the database. To be honest, I find that really hard to swallow.
There has to be a better way.
Update:
OK, so there's no better way, but I'm still not quite convinced, so I will change the question a bit:
I'd like to put the entire database under version control. What database engine can I use so that I can put the actual database under version control instead of its dump?
Would sqlite be git-friendly?
Since this is only the development environment, I can choose whatever database I want.
Edit2:
What I really want is not to track my development history, but to be able to switch from my "new radical changes" branch to the "current stable" branch and, for instance, fix some bugs or issues there, such that when I switch branches the database auto-magically becomes compatible with the branch I'm currently on. I don't really care much about the actual data.
Accepted answer by X-Istence
Take a database dump, and version control that instead. This way it is a flat text file.
Personally I suggest that you keep both a data dump, and a schema dump. This way using diff it becomes fairly easy to see what changed in the schema from revision to revision.
If you are making big changes, you should have a secondary database that you apply the new schema changes to, and leave the old one untouched, since, as you said, you are making a branch.
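As a rough illustration of keeping a schema dump and a data dump side by side, the sketch below only builds the two `pg_dump` command lines. The database name `my_db`, the `db/` output folder and the helper name `dump_commands` are made up for the example; the flags are standard `pg_dump` options.

```python
import shlex
from typing import List


def dump_commands(dbname: str, out_dir: str = "db") -> List[List[str]]:
    """Build pg_dump invocations for a schema-only and a data-only dump.

    The file layout (db/schema.sql, db/data.sql) is an assumption for this sketch.
    """
    schema_cmd = [
        "pg_dump",
        "--schema-only",                  # DDL only: tables, indexes, constraints
        "--no-owner",                     # leave out ownership commands
        f"--file={out_dir}/schema.sql",
        dbname,
    ]
    data_cmd = [
        "pg_dump",
        "--data-only",                    # rows only, no DDL
        "--inserts",                      # INSERT statements instead of COPY blocks
        f"--file={out_dir}/data.sql",
        dbname,
    ]
    return [schema_cmd, data_cmd]


if __name__ == "__main__":
    # Print the commands; run them with subprocess.run(cmd, check=True) if desired.
    for cmd in dump_commands("my_db"):
        print(shlex.join(cmd))
```

Committing the two resulting files after each schema change would let `git diff` show what changed in `schema.sql` from revision to revision.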
Answer by Paul Lindner
Check out Refactoring Databases (http://databaserefactoring.com/) for a bunch of good techniques for maintaining your database in tandem with code changes.
Suffice it to say that you're asking the wrong questions. Instead of putting your database into git, you should be decomposing your changes into small verifiable steps so that you can migrate/rollback schema changes with ease.
If you want full recoverability, you should consider archiving your PostgreSQL WAL logs and using PITR (point-in-time recovery) to play transactions back or forward to specific known good states.
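To make "small verifiable steps" a bit more concrete, here is a minimal sketch of reversible migrations. It uses Python's built-in `sqlite3` so it runs anywhere; the table names, the migration names and the `applied_migrations` bookkeeping table are all invented for the example, and a real project would more likely use one of the migration tools mentioned in this thread.

```python
import sqlite3

# Each step is a (name, up SQL, down SQL) triple. All names here are hypothetical.
MIGRATIONS = [
    ("001_create_users",
     "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)",
     "DROP TABLE users"),
    ("002_index_users_name",
     "CREATE INDEX idx_users_name ON users (name)",
     "DROP INDEX idx_users_name"),
]


def migrate(conn, target):
    """Move the schema forward or backward until `target` steps are applied."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS applied_migrations (name TEXT PRIMARY KEY)"
    )
    applied = conn.execute("SELECT COUNT(*) FROM applied_migrations").fetchone()[0]
    while applied < target:            # roll forward one step at a time
        name, up, _down = MIGRATIONS[applied]
        conn.execute(up)
        conn.execute("INSERT INTO applied_migrations (name) VALUES (?)", (name,))
        applied += 1
    while applied > target:            # roll back in reverse order
        name, _up, down = MIGRATIONS[applied - 1]
        conn.execute(down)
        conn.execute("DELETE FROM applied_migrations WHERE name = ?", (name,))
        applied -= 1
    conn.commit()


def table_names(conn):
    """List the tables currently present, sorted by name."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [row[0] for row in rows]
```

Calling `migrate(conn, 2)` applies both steps and `migrate(conn, 0)` undoes them in reverse order, so switching between a stable and an experimental schema becomes a matter of picking a target step.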
Answer by hasen
I'm starting to think of a really simple solution, don't know why I didn't think of it before!!
- Duplicate the database (both the schema and the data).
- In the branch for the new-major-changes, simply change the project configuration to use the new duplicate database.
This way I can switch branches without worrying about database schema changes.
EDIT:
By duplicate, I mean create another database with a different name (like my_db_2); not doing a dump or anything like that.
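One way this could look in practice is sketched below: the project configuration picks the database name from the checked-out branch. The branch name, the mapping and the helper names are assumptions for the example. `CREATE DATABASE ... TEMPLATE ...` is one PostgreSQL way to copy a database without a dump, though it requires that no other sessions are connected to the source database.

```python
import subprocess

# Hypothetical mapping from branch name to database name.
BRANCH_DATABASES = {
    "new-major-changes": "my_db_2",
}
DEFAULT_DATABASE = "my_db"


def current_branch() -> str:
    """Ask git for the name of the checked-out branch."""
    result = subprocess.run(
        ["git", "rev-parse", "--abbrev-ref", "HEAD"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def database_for(branch: str) -> str:
    """Pick the database for a branch, falling back to the stable one."""
    return BRANCH_DATABASES.get(branch, DEFAULT_DATABASE)


def clone_statement(source: str, copy: str) -> str:
    """SQL that copies schema and data of `source` into a new database `copy`."""
    return f'CREATE DATABASE "{copy}" TEMPLATE "{source}"'
```

The application would then connect to `database_for(current_branch())`, so checking out the stable branch points it back at `my_db` with no further action.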
Answer by zie
Use something like Liquibase. This lets you keep your Liquibase files under revision control. You can tag changes for production only, and have Liquibase keep your DB up to date for either production or development (or whatever scheme you want).
Answer by Dharmendar Kumar 'DK'
I faced a similar need, and here is what my research on database version control systems turned up:
- Sqitch - Perl-based, open source; available for all major databases, including PostgreSQL. https://github.com/sqitchers/sqitch
- Mahout - PostgreSQL only; open source database schema version control. https://github.com/cbbrowne/mahout
- Liquibase - another open source DB version control software; free version of Datical. http://www.liquibase.org/index.html
- Datical - commercial version of Liquibase. https://www.datical.com/
- Flyway by BoxFuse - commercial software. https://flywaydb.org/
- Versioning - another open source project. https://gitlab.com/depesz/Versioning The author provides a guide here: https://www.depesz.com/2010/08/22/versioning/
- Red Gate Change Automation - SQL Server only. https://www.red-gate.com/products/sql-development/sql-change-automation/
Answer by Hakan Deryal
There is a great project called Migrations under Doctrine that was built just for this purpose.
It's still in an alpha state and built for PHP.
http://docs.doctrine-project.org/projects/doctrine-migrations/en/latest/index.html
Answer by sibaz
I came across this question because I have a similar problem: something approximating a DB-based directory structure stores 'files', and I need git to manage it. It's distributed across a cloud using replication, hence its access point will be via MySQL.
The above answers all seem to suggest an alternative solution to the problem asked, which rather misses the point of using Git to manage something in a database, so I'll attempt to answer that question.
Git is a system that in essence stores a database of deltas (differences), which can be reassembled, in order, to reproduce a context. The normal usage of git assumes that the context is a filesystem and that those deltas are diffs in that filesystem, but really all git is, is a hierarchical database of deltas (hierarchical because in most cases each delta is a commit with at least one parent, arranged in a tree).
As long as you can generate a delta, in theory, git can store it. The problem is that git normally expects the context on which it's generating deltas to be a filesystem, and similarly, when you check out a point in the git hierarchy, it expects to generate a filesystem.
If you want to manage change in a database, you have two discrete problems, and I would address them separately (if I were you). The first is schema; the second is data (although in your question you state that data isn't something you're concerned about). A problem I had in the past was a Dev and Prod database, where Dev could take incremental changes to the schema, and those changes had to be documented in CVS and propagated to live, along with additions to one of several 'static' tables. We did that by having a third database, called Cruise, which contained only the static data. At any point the schemas from Dev and Cruise could be compared, and we had a script to take the diff of those two files and produce an SQL file containing ALTER statements to apply it. Similarly, any new data could be distilled to an SQL file containing INSERT commands. As long as fields and tables are only added, and never deleted, the process could automate generating the SQL statements to apply the delta.
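The additive-only script described above might be sketched roughly as follows. This is not the original script: the schema representation (a dict of table name to column definitions), the function name and the example tables are all assumptions made for illustration.

```python
from typing import Dict, List

# table name -> {column name: SQL type}; a deliberately simplified schema model.
Schema = Dict[str, Dict[str, str]]


def additive_diff(old: Schema, new: Schema) -> List[str]:
    """Return SQL statements that turn `old` into `new`, additions only."""
    # Refuse to continue if anything was removed, since that cannot be automated safely.
    for table, columns in old.items():
        if table not in new:
            raise ValueError(f"table {table} was removed")
        for name in columns:
            if name not in new[table]:
                raise ValueError(f"column {table}.{name} was removed")

    statements = []
    for table in sorted(new):
        columns = new[table]
        if table not in old:
            # A brand-new table becomes a single CREATE TABLE statement.
            cols = ", ".join(f"{name} {sql_type}" for name, sql_type in columns.items())
            statements.append(f"CREATE TABLE {table} ({cols});")
            continue
        for name, sql_type in columns.items():
            if name not in old[table]:
                statements.append(f"ALTER TABLE {table} ADD COLUMN {name} {sql_type};")
    return statements
```

The resulting statements could be written to an SQL file and committed, which gives git a plain-text delta to store even though the database itself is not a filesystem.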
The mechanism by which git generates deltas is diff, and the mechanism by which it combines one or more deltas with a file is called merge. If you can come up with a method for diffing and merging from a different context, git should work, but as has been discussed you may prefer a tool that does that for you. My first thought towards solving that is https://git-scm.com/book/en/v2/Customizing-Git-Git-Configuration#External-Merge-and-Diff-Tools, which details how to replace git's internal diff and merge tool. I'll update this answer as I come up with a better solution to the problem, but in my case I expect to only have to manage data changes, insofar as a DB-based filestore may change, so my solution may not be exactly what you need.
Answer by Tails
- Irmin
- Flur.ee
- Crux DB
I have been looking for the same feature for Postgres (or SQL databases in general) for a while, but I found no tools to be suitable (simple and intuitive) enough. This is probably due to the binary nature of how data is stored. Klonio sounds ideal but looks dead. Noms DB looks interesting (and alive). Also take a look at Irmin (OCaml-based with Git-properties).
Though this doesn't answer the question in the sense of working with Postgres, check out the Flur.ee database. It has a "time-travel" feature that allows you to query the data from an arbitrary point in time. I'm guessing it should be able to work with a "branching" model.
This database was recently developed for blockchain purposes. Due to the nature of blockchains, the data needs to be recorded in increments, which is exactly how git works. They are targeting an open-source release in Q2 2019.
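Because every Fluree database is a blockchain, it stores the complete history of every transaction. This is part of how a blockchain ensures that information is immutable and secure.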
Update: Also check out the Crux database, which can query across the time dimension of inserts, which you could see as 'versions'. Crux seems to be an open-source implementation of the highly regarded Datomic.
Crux is a bitemporal database that stores transaction time and valid time histories. While a [uni]temporal database enables "time travel" querying through the transactional sequence of database states from the moment of database creation to its current state, Crux also provides "time travel" querying for a discrete valid time axis without unnecessary design complexity or performance impact. This means a Crux user can populate the database with past and future information regardless of the order in which the information arrives, and make corrections to past recordings to build an ever-improving temporal model of a given domain.
Crux 是一个双时态数据库,用于存储交易时间和有效时间历史。虽然 [uni]temporal 数据库支持“时间旅行”查询从数据库创建到当前状态的数据库状态的事务序列,但 Crux 还提供对离散有效时间轴的“时间旅行”查询,而无需不必要的设计复杂性或性能影响。这意味着 Crux 用户可以用过去和未来的信息填充数据库,而不管信息到达的顺序,并对过去的记录进行更正以构建给定域的不断改进的时间模型。
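To give a feel for what "bitemporal" means, here is a toy in-memory sketch. It is not Crux's API or implementation; the class, the integer timestamps and the lookup rule (latest valid time, then latest transaction time) are simplifying assumptions for illustration only.

```python
from typing import Any, Dict, List, NamedTuple, Optional


class Version(NamedTuple):
    valid_time: int   # when the fact holds in the modelled world
    tx_time: int      # when the store recorded it
    value: Any


class BitemporalStore:
    """Toy key-value store that keeps every version along two time axes."""

    def __init__(self) -> None:
        self._versions: Dict[str, List[Version]] = {}
        self._clock = 0  # logical transaction counter

    def put(self, key: str, value: Any, valid_time: int) -> int:
        """Record a value for `key` as of `valid_time`; returns the transaction time."""
        self._clock += 1
        self._versions.setdefault(key, []).append(
            Version(valid_time, self._clock, value)
        )
        return self._clock

    def get(self, key: str, valid_time: int, tx_time: Optional[int] = None) -> Any:
        """Value of `key` at `valid_time`, as known at `tx_time` (default: now)."""
        if tx_time is None:
            tx_time = self._clock
        visible = [
            v for v in self._versions.get(key, [])
            if v.valid_time <= valid_time and v.tx_time <= tx_time
        ]
        if not visible:
            return None
        # Latest valid time wins; among equals, the most recent write wins.
        return max(visible, key=lambda v: (v.valid_time, v.tx_time)).value
```

Writing a correction for a past `valid_time` changes what `get` returns today, while passing an older `tx_time` still shows what the store believed before the correction.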
Answer by CShark
Take a look at RedGate SQL Source Control.
http://www.red-gate.com/products/sql-development/sql-source-control/
This tool is a SQL Server Management Studio snap-in which will allow you to place your database under Source Control with Git.
It's a bit pricey at $495 per user, but there is a 28-day free trial available.
NOTE: I am not affiliated with RedGate in any way whatsoever.
Answer by Ciges
I want to do something similar: add my database changes to my version control system.
I am going to follow the ideas in this post from Vladimir Khorikov, "Database versioning best practices". In summary, I will:
- store both the database schema and the reference data in a source control system.
- create a separate SQL script with the changes for every modification.
In case it helps!
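A rough sketch of the "one SQL script per modification" idea: numbered scripts live in a folder under version control, and a small runner applies the ones a given database has not seen yet. It uses `sqlite3` so it runs without a server; the `schema_history` table, the file names and the function name are assumptions for the example, not something taken from the post mentioned above.

```python
import sqlite3
import tempfile
from pathlib import Path


def apply_pending(conn, scripts_dir: Path):
    """Run every *.sql file not yet recorded in schema_history, in name order."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_history (script TEXT PRIMARY KEY)"
    )
    done = {row[0] for row in conn.execute("SELECT script FROM schema_history")}
    applied = []
    for path in sorted(scripts_dir.glob("*.sql")):
        if path.name in done:
            continue                              # already applied to this database
        conn.executescript(path.read_text())      # run the whole script
        conn.execute("INSERT INTO schema_history (script) VALUES (?)", (path.name,))
        conn.commit()
        applied.append(path.name)
    return applied


if __name__ == "__main__":
    # Demo with throwaway scripts; in a real project the folder lives in the repository.
    with tempfile.TemporaryDirectory() as tmp:
        folder = Path(tmp)
        (folder / "001_create_users.sql").write_text(
            "CREATE TABLE users (id INTEGER PRIMARY KEY);"
        )
        demo = sqlite3.connect(":memory:")
        print(apply_pending(demo, folder))
        demo.close()
```

Because the history table lives inside each database, running the same command after switching branches applies only the scripts that branch adds.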