Ruby on Rails: How to work with Git branches and Rails migrations

Notice: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and authors, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/4735058/

Time: 2020-09-03 00:08:44  Source: igfitidea

How to work with Git branches and Rails migrations

Tags: ruby-on-rails, database, git

Asked by Kostas

I am working on a rails app with quite a few git branches and many of them include db migrations. We try to be careful but occasionally some piece of code in master asks for a column that got removed/renamed in another branch.

  1. What would be a nice solution to "couple" git branches with DB states?

  2. What would these "states" actually be?

    We can't just duplicate a database if it's a few GBs in size.

  3. And what should happen with merges?

  4. Would the solution translate to noSQL databases as well?

    We currently use MySQL, mongodb and redis

EDIT: Looks like I forgot to mention a very important point: I am only interested in the development environment, but with large databases (a few GBs in size).

Answered by Andy Lindeman

When you add a new migration in any branch, run rake db:migrate and commit both the migration and db/schema.rb.

If you do this, in development, you'll be able to switch to another branch that has a different set of migrations and simply run rake db:schema:load.

Note that this will recreate the entire database, and existing data will be lost.
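
A minimal sketch of how that switch could be scripted, assuming you only want to reload the schema when the two branches actually differ in db/schema.rb or db/migrate/. The helper names and the changed_files input are hypothetical, not part of Rails or git. The functions only build a command list, so nothing is run or dropped until you execute the commands yourself.

```ruby
# Paths whose changes suggest the two branches disagree about the schema.
SCHEMA_PATHS = ["db/schema.rb", "db/migrate/"].freeze

# changed_files: paths that differ between the old and the new branch
# (hypothetical input; the caller decides how to collect it).
def schema_reload_needed?(changed_files)
  changed_files.any? do |path|
    SCHEMA_PATHS.any? { |prefix| path.start_with?(prefix) }
  end
end

# Build the commands for a branch switch. Note that db:schema:load
# recreates the database, so existing development data is lost.
def switch_commands(branch, changed_files)
  commands = ["git checkout #{branch}"]
  commands << "bundle exec rake db:schema:load" if schema_reload_needed?(changed_files)
  commands
end
```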

You'll probably only want to run production off of one branch which you're very careful with, so these steps don't apply there (just run rake db:migrate as usual there). But in development, it should be no big deal to recreate the database from the schema, which is what rake db:schema:load will do.

Answered by ndp

If you have a large database that you can't readily reproduce, then I'd recommend using the normal migration tools. If you want a simple process, this is what I'd recommend:

  • Before switching branches, roll back (rake db:rollback) to the state before the branch point. Then, after switching branches, run db:migrate. This is mathematically correct, and as long as your migrations define down scripts, it will work.
  • If you forget to do this before switching branches, in general you can safely switch back, roll back, and switch again, so I think as a workflow, it's feasible.
  • If you have dependencies between migrations in different branches... well, you'll have to think hard.
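
The roll back, switch, migrate sequence can be sketched as a small planner. This is only an illustration under stated assumptions: applied_versions and base_versions are hypothetical inputs (the versions applied to your development database and the versions that already existed at the branch point), and the helper names are made up. It relies on rake db:rollback STEP=n, which undoes the n most recently applied migrations.

```ruby
# applied_versions: migration versions currently applied to the dev database.
# base_versions:    versions that already existed at the branch point.
# Returns nil when there is nothing to roll back.
def rollback_command(applied_versions, base_versions)
  branch_only = applied_versions - base_versions
  return nil if branch_only.empty?

  # db:rollback undoes the most recently applied migrations, so this
  # assumes the branch's own migrations are the newest ones applied.
  "bundle exec rake db:rollback STEP=#{branch_only.size}"
end

# Full plan: roll back first, then switch, then migrate on the new branch.
def switch_plan(target_branch, applied_versions, base_versions)
  [
    rollback_command(applied_versions, base_versions),
    "git checkout #{target_branch}",
    "bundle exec rake db:migrate"
  ].compact
end
```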

Answered by Jon Lemmon

Here's a script I wrote for switching between branches that contain different migrations:

https://gist.github.com/4076864

It won't solve all the problems you mentioned, but given a branch name it will:

  1. Roll back any migrations on your current branch which do not exist on the given branch
  2. Discard any changes to the db/schema.rb file
  3. Check out the given branch
  4. Run any new migrations existing in the given branch
  5. Update your test database

I find myself manually doing this all the time on our project, so I thought it'd be nice to automate the process.
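
For readers who want to see the shape of those five steps without opening the gist, here is a rough sketch. It is not the gist's code: the function names are hypothetical, the two lists of migration file paths are assumed to be supplied by the caller, and the function only returns the commands instead of running them.

```ruby
# Extract the numeric version from a migration file name such as
# "db/migrate/20121115000000_add_foo.rb".
def migration_version(path)
  File.basename(path)[/\A\d+/]
end

# current_migrations / target_migrations: migration file paths on the
# current branch and on the branch being switched to (hypothetical inputs).
def branch_switch_commands(target_branch, current_migrations, target_migrations)
  current = current_migrations.map { |path| migration_version(path) }
  target  = target_migrations.map { |path| migration_version(path) }

  # Step 1: undo migrations that exist only on the current branch, newest first.
  downs = (current - target).sort.reverse.map do |version|
    "bundle exec rake db:migrate:down VERSION=#{version}"
  end

  downs + [
    "git checkout -- db/schema.rb",      # Step 2: discard schema.rb changes
    "git checkout #{target_branch}",     # Step 3: check out the given branch
    "bundle exec rake db:migrate",       # Step 4: run the branch's new migrations
    "bundle exec rake db:test:prepare"   # Step 5: update the test database
  ]
end
```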

Answered by Joshua Pinter

Separate Database for each Branch

It's the only way to fly.

Update October 16th, 2017

I returned to this after quite some time and made some improvements:

  • I've added another namespaced rake task to create a branch and clone the database in one fell swoop, with bundle exec rake git:branch.
  • I realize now that cloning from master is not always what you want to do, so I made it more explicit that the db:clone_from_branch task takes a SOURCE_BRANCH and a TARGET_BRANCH environment variable. When using git:branch, it will automatically use the current branch as the SOURCE_BRANCH.
  • Refactoring and simplification.

config/database.yml

And to make it easier on you, here's how you update your database.yml file to dynamically determine the database name based on the current branch.

<%
database_prefix = 'your_app_name'
environments    = %w( development test )
# Ask git for the branch name directly instead of parsing `git status` output.
current_branch  = `git rev-parse --abbrev-ref HEAD`.chomp
%>

defaults: &defaults
  pool: 5
  adapter: mysql2
  encoding: utf8
  reconnect: false
  username: root
  password:
  host: localhost

<% environments.each do |environment| %>  

<%= environment %>:
  <<: *defaults
  database: <%= [ database_prefix, current_branch, environment ].join('_') %>
<% end %>
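
One thing the template above does not handle is branch names such as feature/new-login, whose slashes and hyphens may not be valid or convenient in a database name. A minimal sketch of a sanitizing helper follows. The function name is hypothetical and the replacement rule (lowercase, runs of non-alphanumerics become a single underscore) is just one possible convention.

```ruby
# Build a database name from prefix, branch and environment, replacing
# characters that are awkward in database names with underscores.
def branch_database_name(prefix, branch, environment)
  safe_branch = branch.downcase
                      .gsub(/[^a-z0-9]+/, "_") # collapse runs of other characters
                      .gsub(/\A_+|_+\z/, "")   # trim leading/trailing underscores
  [prefix, safe_branch, environment].join("_")
end
```

In database.yml this could stand in for the plain join, provided the helper is defined somewhere that is loaded before the file is evaluated.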

lib/tasks/db.rake

Here's a Rake task to easily clone your database from one branch to another. It takes SOURCE_BRANCH and TARGET_BRANCH environment variables. Based off of @spalladino's task.

namespace :db do

  desc "Clones database from another branch as specified by `SOURCE_BRANCH` and `TARGET_BRANCH` env params."
  task :clone_from_branch do

    abort "You need to provide a SOURCE_BRANCH to clone from as an environment variable." if ENV['SOURCE_BRANCH'].blank?
    abort "You need to provide a TARGET_BRANCH to clone to as an environment variable."   if ENV['TARGET_BRANCH'].blank?

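    database_configuration = Rails.configuration.database_configuration[Rails.env]
    current_database_name = database_configuration["database"]

    # Use CURRENT_BRANCH when the git:branch task has set it; otherwise ask git,
    # so this task no longer raises NameError when it is run on its own.
    current_branch = defined?(CURRENT_BRANCH) ? CURRENT_BRANCH : `git rev-parse --abbrev-ref HEAD`.chomp

    source_db = current_database_name.sub(current_branch, ENV['SOURCE_BRANCH'])
    target_db = current_database_name.sub(current_branch, ENV['TARGET_BRANCH'])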

    mysql_opts =  "-u #{database_configuration['username']} "
    mysql_opts << "--password=\"#{database_configuration['password']}\" " if database_configuration['password'].presence

    `mysqlshow #{mysql_opts} | grep "#{source_db}"`
    raise "Source database #{source_db} not found" if $?.to_i != 0

    `mysqlshow #{mysql_opts} | grep "#{target_db}"`
    raise "Target database #{target_db} already exists" if $?.to_i == 0

    puts "Creating empty database #{target_db}"
    `mysql #{mysql_opts} -e "CREATE DATABASE #{target_db}"`

    puts "Copying #{source_db} into #{target_db}"
    `mysqldump #{mysql_opts} #{source_db} | mysql #{mysql_opts} #{target_db}`

  end

end
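
Because the task interpolates database names straight into shell commands, it may be worth validating them first. The sketch below is an assumption-laden illustration, not part of the task above: clone_commands and SAFE_DB_NAME are hypothetical names, and the function only builds the same two mysql/mysqldump command strings so they can be inspected before anything is run.

```ruby
# Only letters, digits and underscores are accepted (a deliberately strict,
# hypothetical rule chosen for this sketch).
SAFE_DB_NAME = /\A[A-Za-z0-9_]+\z/

# mysql_opts is passed through as-is, e.g. "-u root".
def clone_commands(source_db, target_db, mysql_opts)
  [source_db, target_db].each do |name|
    raise ArgumentError, "unsafe database name: #{name.inspect}" unless name.match?(SAFE_DB_NAME)
  end

  [
    %(mysql #{mysql_opts} -e "CREATE DATABASE #{target_db}"),
    "mysqldump #{mysql_opts} #{source_db} | mysql #{mysql_opts} #{target_db}"
  ]
end
```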

lib/tasks/git.rake

This task will create a git branch off of the current branch (master, or otherwise), check it out and clone the current branch's database into the new branch's database. It's slick AF.

namespace :git do

  desc "Create a branch off the current branch and clone the current branch's database."
  task :branch do 
    print 'New Branch Name: '
    new_branch_name = STDIN.gets.strip 

    # Ask git for the branch name directly instead of parsing `git status` output.
    CURRENT_BRANCH = `git rev-parse --abbrev-ref HEAD`.chomp

    puts "Creating new branch and checking it out..."
    sh "git checkout -b #{new_branch_name}"

    puts "Cloning database from #{CURRENT_BRANCH}..."

    ENV['SOURCE_BRANCH'] = CURRENT_BRANCH # Set source to be the current branch for clone_from_branch task.
    ENV['TARGET_BRANCH'] = new_branch_name
    Rake::Task['db:clone_from_branch'].invoke

    puts "All done!"
  end

end

Now, all you need to do is run bundle exec rake git:branch, enter the new branch name and start killing zombies.

Answered by noodl

Perhaps you should take this as a hint that your development database is too big? If you can use db/seeds.rb and a smaller data set for development then your issue can be easily solved by using schema.rb and seeds.rb from the current branch.

That assumes that your question relates to development; I can't imagine why you'd need to regularly switch branches in production.

Answered by Tabrez

I was struggling with the same issue. Here is my solution:

  1. Make sure that both schema.rb and all migrations are checked in by all developers.

  2. There should be one person/machine for deployments to production. Let's call this machine the merge machine. When the changes are pulled to the merge machine, the auto-merge for schema.rb will fail. No problem: just replace the content with whatever the previous contents of schema.rb were (you can put a copy aside or get it from GitHub if you use it ...).

  3. Here is the important step. The migrations from all developers will now be available in the db/migrate folder. Go ahead and run bundle exec rake db:migrate. It will bring the database on the merge machine up to date with all changes. It will also regenerate schema.rb.

  4. Commit and push the changes out to all repositories (remotes and individuals, which are remotes too). You should be done!

Answered by Paul Carmody

This is what I have done and I'm not quite sure that I have covered all the bases:

In development (using postgresql):

  • pg_dump db_name > tmp/branch1.sql
  • git checkout branch2
  • dropdb db_name
  • createdb db_name
  • psql db_name < tmp/branch2.sql # (from previous branch switch)

This is a lot faster than the rake utilities on a database with about 50K records.
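
If you want to script that dump-and-restore cycle, a sketch along these lines might do. It assumes pg_dump is the dump tool and that one dump file per branch is kept under tmp/; the function names and the restore flag are hypothetical. The commands are only built, not executed, so you can review them first.

```ruby
# Dump file for a branch; characters outside a conservative set are replaced
# so that names like "feature/x" still map to a single file under tmp/.
def dump_path(branch)
  "tmp/#{branch.gsub(/[^A-Za-z0-9_.-]/, '_')}.sql"
end

# restore: pass false the first time you visit a branch, when no dump
# from a previous switch exists yet.
def postgres_switch_commands(db_name, from_branch, to_branch, restore: true)
  commands = [
    "pg_dump #{db_name} > #{dump_path(from_branch)}", # save the current branch's data
    "git checkout #{to_branch}",
    "dropdb #{db_name}",
    "createdb #{db_name}"
  ]
  commands << "psql #{db_name} < #{dump_path(to_branch)}" if restore
  commands
end
```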

For production, keep the master branch sacrosanct: all migrations are checked in and schema.rb is properly merged. Then go through your standard upgrade procedure.

Answered by JohnO

I totally feel the PITA you are having here. As I think about it, the real issue is that not every branch has the code needed to roll back the migrations of other branches. I'm in the django world, so I don't know rake that well. I'm toying with the idea that the migrations live in their own repo that doesn't get branched (git-submodule, which I recently learned about). That way all the branches have all the migrations. The sticky part is making sure each branch is restricted to only the migrations it cares about. Doing/keeping track of that manually would be a PITA and prone to error. But none of the migration tools are built for this. That is the point at which I am without a way forward.

Answered by Adam Dymitruk

You want to preserve a "db environment" per branch. Look at smudge/clean script to point to different instances. If you run out of db instances, have the script spin off a temp instance so when you switch to a new branch, it's already there and just needs to be renamed by the script. DB updates should run just before you execute your tests.

Hope this helps.

Answered by Alexander

I would suggest one of two options:

Option 1

  1. Put your data in seeds.rb. A nice option is to create your seed data via the FactoryGirl/Fabrication gem. This way you can guarantee that the data is in sync with the code, assuming that the factories are updated whenever columns are added or removed.
  2. After switching from one branch to another, run rake db:reset, which effectively drops/creates/seeds the database.

Option 2

Manually maintain the state of the database by always running rake db:rollback before and rake db:migrate after a branch checkout. The caveat is that all your migrations need to be reversible, otherwise this won't work.