How do you manage databases in development, test, and production? (MySQL)
Note: this page is a translated copy of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, link to the original, and attribute it to the original authors (not me): Stack Overflow
Original source: http://stackoverflow.com/questions/6371/
Asked by Matt Miller
I've had a hard time trying to find good examples of how to manage database schemas and data between development, test, and production servers.
Here's our setup. Each developer has a virtual machine running our app and the MySQL database. It is their personal sandbox to do whatever they want. Currently, developers will make a change to the SQL schema and do a dump of the database to a text file that they commit into SVN.
We want to deploy a continuous integration development server that always runs the latest committed code. If we do that now, it will reload the database from SVN for each build.
We have a test (virtual) server that runs "release candidates." Deploying to the test server is currently a very manual process, and usually involves me loading the latest SQL from SVN and tweaking it. Also, the data on the test server is inconsistent. You end up with whatever test data the last developer to commit had on his sandbox server.
Where everything breaks down is the deployment to production. Since we can't overwrite the live data with test data, this involves manually re-creating all the schema changes. If there are a large number of schema changes or conversion scripts to manipulate the data, this can get really hairy.
If the problem were just the schema, it would be an easier problem, but there is "base" data in the database that is updated during development as well, such as metadata in security and permissions tables.
This is the biggest barrier I see in moving toward continuous integration and one-step builds. How do you solve it?
A follow-up question: how do you track database versions so you know which scripts to run to upgrade a given database instance? Is a version table like Lance mentions below the standard procedure?
Thanks for the reference to Tarantino. I'm not in a .NET environment, but I found their DataBaseChangeMangement wiki page to be very helpful, especially this PowerPoint presentation (.ppt).
I'm going to write a Python script that checks the names of *.sql scripts in a given directory against a table in the database and runs the ones that aren't there, in order, based on an integer that forms the first part of the filename. If it is a pretty simple solution, as I suspect it will be, then I'll post it here.
I've got a working script for this. It handles initializing the DB if it doesn't exist and running upgrade scripts as necessary. There are also switches for wiping an existing database and importing test data from a file. It's about 200 lines, so I won't post it (though I might put it on pastebin if there's interest).
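The script itself was never posted, so what follows is only a minimal sketch of the approach described above, not the asker's code. It assumes SQLite (Python's built-in sqlite3 module) in place of MySQL so it runs without a server, a hypothetical bookkeeping table called applied_scripts, and script names that start with an integer, such as 1_create_users.sql.

```python
import re
import sqlite3
from pathlib import Path

# Hypothetical bookkeeping table; the original post does not name it.
TRACKING_DDL = "CREATE TABLE IF NOT EXISTS applied_scripts (name TEXT PRIMARY KEY)"


def pending_scripts(conn, script_dir):
    """Return *.sql files not yet recorded in applied_scripts, ordered by integer prefix."""
    conn.execute(TRACKING_DDL)
    applied = {row[0] for row in conn.execute("SELECT name FROM applied_scripts")}
    numbered = []
    for path in Path(script_dir).glob("*.sql"):
        match = re.match(r"(\d+)", path.name)
        if match and path.name not in applied:
            numbered.append((int(match.group(1)), path))
    # Sort numerically so that 10_*.sql runs after 2_*.sql.
    return [path for _, path in sorted(numbered)]


def upgrade(conn, script_dir):
    """Run every pending script in order and record each one after it succeeds."""
    ran = []
    for path in pending_scripts(conn, script_dir):
        conn.executescript(path.read_text())
        conn.execute("INSERT INTO applied_scripts (name) VALUES (?)", (path.name,))
        conn.commit()
        ran.append(path.name)
    return ran
```

Calling upgrade(conn, "sql/") a second time should return an empty list, because every script name is already in the tracking table. Sorting on the parsed integer rather than the raw filename is what keeps 10 after 2.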
Accepted answer by Lance Fisher
There are a couple of good options. I wouldn't use the "restore a backup" strategy.
Script all your schema changes, and have your CI server run those scripts on the database. Have a version table to keep track of the current database version, and only execute the scripts if they are for a newer version.
Use a migration solution. These solutions vary by language, but for .NET I use Migrator.NET. This allows you to version your database and move up and down between versions. Your schema is specified in C# code.
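To make the version-table idea concrete, here is a rough sketch that steps a database up or down one version at a time. It is not Migrator.NET's API. The schema_version table, the MIGRATIONS dictionary, and its SQL are all invented for illustration, and SQLite stands in for MySQL.

```python
import sqlite3

# Hypothetical migrations: version number -> (upgrade SQL, downgrade SQL).
MIGRATIONS = {
    1: ("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)", "DROP TABLE users"),
    2: ("CREATE TABLE roles (id INTEGER PRIMARY KEY, label TEXT)", "DROP TABLE roles"),
}


def current_version(conn):
    """Read the single-row version table, creating it at version 0 if missing."""
    conn.execute("CREATE TABLE IF NOT EXISTS schema_version (version INTEGER NOT NULL)")
    row = conn.execute("SELECT version FROM schema_version").fetchone()
    if row is None:
        conn.execute("INSERT INTO schema_version (version) VALUES (0)")
        conn.commit()
        return 0
    return row[0]


def migrate_to(conn, target):
    """Step one version at a time, up or down, until the database is at target."""
    version = current_version(conn)
    while version != target:
        if version < target:
            version += 1
            conn.execute(MIGRATIONS[version][0])  # apply the upgrade for the new version
        else:
            conn.execute(MIGRATIONS[version][1])  # undo the current version
            version -= 1
        conn.execute("UPDATE schema_version SET version = ?", (version,))
        conn.commit()
    return version
```

A CI job would call migrate_to(conn, max(MIGRATIONS)) on every build; a database that is already current falls straight through the loop.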
Answer by tbreffni
Your developers need to write change scripts (schema and data changes) for each bug/feature they work on, not simply dump the entire database into source control. These scripts will upgrade the current production database to the new version in development.
Your build process can restore a copy of the production database into an appropriate environment and run all the scripts from source control on it, which will update the database to the current version. We do this on a daily basis to make sure all the scripts run correctly.
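The daily rehearsal described above might look something like this sketch. It assumes SQLite rather than MySQL, and the function name rehearse_upgrade is made up; with MySQL the copy step would be a dump and restore instead of Connection.backup.

```python
import sqlite3


def rehearse_upgrade(production, change_scripts):
    """Copy production into a scratch database and run the change scripts on the copy."""
    scratch = sqlite3.connect(":memory:")
    production.backup(scratch)  # the production connection is only read, never modified
    for script in change_scripts:
        # Any SQL error raises here, which is what should fail the build.
        scratch.executescript(script)
    return scratch
```

If any script raises, the build fails before the change gets anywhere near the real production database.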
Answer by Juha Syrjälä
Have a look at how Ruby on Rails does this.
First, there are so-called migration files, which transform the database schema and data from version N to version N+1 (or, when downgrading, from version N+1 to N). The database has a table that records the current version.
Test databases are always wiped clean before unit-tests and populated with fixed data from files.
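This is not Rails code, but the same wipe-and-reload pattern can be sketched with Python's unittest. The schema and the USER_FIXTURES rows are invented for the example, and a fresh in-memory SQLite database stands in for the wiped test database.

```python
import sqlite3
import unittest

SCHEMA = "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)"
# Hypothetical fixture rows; Rails would read these from fixture files.
USER_FIXTURES = [(1, "alice"), (2, "bob")]


class UserTableTest(unittest.TestCase):
    def setUp(self):
        # A brand-new in-memory database per test is the "wiped clean" step.
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute(SCHEMA)
        self.conn.executemany("INSERT INTO users VALUES (?, ?)", USER_FIXTURES)
        self.conn.commit()

    def tearDown(self):
        self.conn.close()

    def test_delete_removes_one_row(self):
        self.conn.execute("DELETE FROM users WHERE id = 1")
        count = self.conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
        self.assertEqual(count, 1)

    def test_starts_from_fixture_data(self):
        # Passes regardless of test order because setUp reloads the fixtures.
        count = self.conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
        self.assertEqual(count, len(USER_FIXTURES))
```

Because each test gets its own database, the delete in one test cannot leak into the other.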
Answer by Esko Luontola
The book Refactoring Databases: Evolutionary Database Design might give you some ideas on how to manage the database. A short version is also available at http://martinfowler.com/articles/evodb.html
In one PHP+MySQL project I've had the database revision number stored in the database, and when the program connects to the database, it will first check the revision. If the program requires a different revision, it will open a page for upgrading the database. Each upgrade is specified in PHP code, which will change the database schema and migrate all existing data.
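The original project was PHP, so this is only a Python sketch of the check-on-connect idea. The db_revision table, REQUIRED_REVISION, and the RevisionMismatch exception are all hypothetical names, and SQLite stands in for MySQL. Where the PHP program opened an upgrade page, this sketch simply raises.

```python
import sqlite3

REQUIRED_REVISION = 3  # hypothetical revision this build of the program expects


class RevisionMismatch(Exception):
    """Raised when the stored revision differs from the one the program needs."""


def stored_revision(conn):
    """Return the revision recorded in the db_revision table (0 if none recorded)."""
    conn.execute("CREATE TABLE IF NOT EXISTS db_revision (revision INTEGER NOT NULL)")
    row = conn.execute("SELECT MAX(revision) FROM db_revision").fetchone()
    return row[0] or 0


def open_checked(path, required=REQUIRED_REVISION):
    """Connect, then refuse to continue unless the revision matches."""
    conn = sqlite3.connect(path)
    found = stored_revision(conn)
    if found != required:
        conn.close()
        raise RevisionMismatch(f"database is at revision {found}, program needs {required}")
    return conn
```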
Answer by Yordan Georgiev
- Name your databases as follows:
dev_<<db>> , tst_<<db>> , stg_<<db>> , prd_<<db>>
(Obviously you should never hardcode db names.) That way you can deploy even the different types of db on the same physical server. (I do not recommend that, but you may have to if resources are tight.)
- Ensure you are able to move data between those automatically.
- Separate the db creation scripts from the population scripts: it should always be possible to recreate the db from scratch and populate it (from the old db version or an external data source).
- Do not hardcode connection strings in the code (not even in the config files). Put connection string templates in the config files and populate them dynamically; any reconfiguration of the application layer that requires a recompile is BAD.
- Do use database versioning and db object versioning. If you can afford it, use ready-made products; if not, develop something on your own.
- Track each DDL change and save it into some history table (example here).
- DAILY backups! Test how fast you would be able to restore something lost from a backup (use automatic restore scripts).
- Even if your DEV database and PROD have exactly the same creation script, you will have problems with the data, so allow developers to create an exact copy of prod and play with it. (I know I will receive minuses for this one, but a change in the mindset and the business process will cost you much less when shit hits the fan, so have the coders sign whatever legal agreement it takes, but ensure this one.)
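One possible reading of the naming and connection-string points above, sketched in Python. The URL shape, the template text, and the function name are assumptions for illustration; only the dev/tst/stg/prd prefixes come from the answer.

```python
from string import Template

# Prefixes taken from the naming scheme above.
ENV_PREFIXES = {"dev", "tst", "stg", "prd"}

# Hypothetical template; in practice it would live in a config file.
CONNECTION_TEMPLATE = Template("mysql://$user@$host/${env}_$db")


def connection_string(env, db, user, host):
    """Fill the template at run time so no environment is hardcoded."""
    if env not in ENV_PREFIXES:
        raise ValueError(f"unknown environment: {env!r}")
    return CONNECTION_TEMPLATE.substitute(env=env, db=db, user=user, host=host)
```

For example, connection_string("tst", "shop", "app", "db.internal") gives mysql://app@db.internal/tst_shop, and switching environments changes one argument rather than the code.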
Answer by Rad
You could also look at using a tool like SQL Compare to script the difference between various versions of a database, allowing you to quickly migrate between versions.
Answer by Matt Stine
This is something I'm constantly unsatisfied with: our solution to this problem, that is. For several years we maintained a separate change script for each release. This script would contain the deltas from the last production release. With each release of the application, the version number would increment, giving something like the following:
- dbChanges_1.sql
- dbChanges_2.sql
- ...
- dbChanges_n.sql
This worked well enough until we started maintaining two lines of development: Trunk/Mainline for new development, and a maintenance branch for bug fixes, short term enhancements, etc. Inevitably, the need arose to make changes to the schema in the branch. At this point, we already had dbChanges_n+1.sql in the Trunk, so we ended up going with a scheme like the following:
- dbChanges_n.1.sql
- dbChanges_n.2.sql
- ...
- dbChanges_n.3.sql
Again, this worked well enough, until one day we looked up and saw 42 delta scripts in the mainline and 10 in the branch. ARGH!
These days we simply maintain one delta script and let SVN version it - i.e. we overwrite the script with each release. And we shy away from making schema changes in branches.
So, I'm not satisfied with this either. I really like the concept of migrations from Rails. I've become quite fascinated with LiquiBase. It supports the concept of incremental database refactorings. It's worth a look and I'll be looking at it in detail soon. Anybody have experience with it? I'd be very curious to hear about your results.
Answer by Tim Williscroft
We have a very similar setup to the OP.
Developers develop in VMs with private DBs.
[Developers will soon be committing into private branches]
Testing is run on different machines (actually in VMs hosted on a server). [Will soon be run by Hudson CI server]
Test by loading the reference dump into the db. Apply the developers' schema patches, then apply the developers' data patches.
Then run unit and system tests.
Production is deployed to customers as installers.
What we do:
We take a schema dump of our sandbox DB, then a SQL data dump. We diff those against the previous baseline. That pair of deltas upgrades n-1 to n.
We configure the dumps and deltas.
So to install version N CLEAN we run the dump into an empty db. To patch, apply the intervening patches.
(Juha mentioned Rails's idea of having a table that records the current DB version. It is a good one and should make installing updates less fraught.)
Deltas and dumps have to be reviewed before beta test. I can't see any way around this as I've seen developers insert test accounts into the DB for themselves.
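As a rough illustration of the diff-against-baseline step, the sketch below compares the schema of a sandbox database with a baseline and returns a unified diff for a reviewer to read. It assumes SQLite (reading sqlite_master) in place of a MySQL dump, and the function names are invented. The diff is a review aid, not an executable upgrade script.

```python
import difflib
import sqlite3


def schema_dump(conn):
    """Return the CREATE statements of a database as a list of lines."""
    rows = conn.execute(
        "SELECT sql FROM sqlite_master WHERE sql IS NOT NULL ORDER BY name"
    )
    return [line for (sql,) in rows for line in sql.splitlines()]


def schema_delta(baseline, sandbox):
    """Unified diff between the baseline schema and the sandbox schema, for review."""
    return list(
        difflib.unified_diff(
            schema_dump(baseline), schema_dump(sandbox),
            fromfile="baseline", tofile="sandbox", lineterm="",
        )
    )
```

A stray table that a developer created for private test accounts shows up as added lines in the delta, which is the kind of thing the review is meant to catch.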
Answer by MarkR
I'm afraid I'm in agreement with other posters. Developers need to script their changes.
In many cases a simple ALTER TABLE won't work, because you need to modify existing data too. Developers need to think about what migrations are required and make sure they're scripted correctly (of course you need to test this carefully at some point in the release cycle).
Moreover, if you have any sense, you'll get your developers to script rollbacks for their changes as well so they can be reverted if need be. This should be tested as well, to ensure that their rollback not only executes without error, but leaves the DB in the same state as it was in previously (this is not always possible or desirable, but is a good rule most of the time).
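One way to test that a rollback leaves the database as it was is to snapshot the schema, apply the change, roll it back, and compare. This sketch assumes SQLite and invented function names, and it compares schema only; checking that the data is restored too would need a similar snapshot of table contents.

```python
import sqlite3


def schema_snapshot(conn):
    """Sorted list of (name, sql) pairs describing every schema object."""
    return sorted(
        conn.execute("SELECT name, sql FROM sqlite_master WHERE sql IS NOT NULL")
    )


def rollback_restores_schema(conn, upgrade_sql, rollback_sql):
    """Apply a change, roll it back, and report whether the schema is unchanged."""
    before = schema_snapshot(conn)
    conn.executescript(upgrade_sql)
    conn.executescript(rollback_sql)
    return schema_snapshot(conn) == before
```

A rollback script that runs without error but forgets to drop a table makes this return False, which is exactly the case that executing the script alone would miss.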
How you hook that into a CI server, I don't know. Perhaps your CI server needs to have a known build snapshot on, which it reverts to each night and then applies all the changes since then. That's probably best, otherwise a broken migration script will break not just that night's build, but all subsequent ones.