Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/920165/

Date: 2020-09-10 06:31:01  Source: igfitidea

How to extract a git subdirectory and make a submodule out of it?

Tags: git, git-submodules

Asked by apenwarr

I started a project some months ago and stored everything within a main directory. In my main directory "Project" there are several subdirectories containing different things: Project/paper contains a document written in LaTeX, and Project/sourcecode/RailsApp contains my Rails app.

"Project" is GITified and there have been a lot of commits in both the "paper" and "RailsApp" directories. Now, as I'd like to use cruisecontrol.rb for my "RailsApp", I wonder if there is a way to make a submodule out of "RailsApp" without losing the history.

Answered by apenwarr

Nowadays there's a much easier way to do it than manually using git filter-branch: git subtree

Installation

NOTE: git-subtree is now part of git (if you install contrib) as of 1.7.11, so you might already have it installed. You can check by running git subtree.
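
One way to script that check is sketched below. It assumes git-subtree lives either in the directory reported by git --exec-path or somewhere on PATH (the manual install in this answer copies it to /usr/local/bin); the variable name subtree_status is only for illustration.

```shell
# Look for git-subtree next to git's own commands, or anywhere on PATH.
# If git itself is missing, the first test simply fails and we fall through.
if [ -x "$(git --exec-path 2>/dev/null)/git-subtree" ] \
    || command -v git-subtree >/dev/null 2>&1; then
    subtree_status="available"
else
    subtree_status="missing"
fi
echo "git subtree: $subtree_status"
```

If this reports "missing", the install steps that follow apply.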

To install git-subtree from source (for older versions of git):

git clone https://github.com/apenwarr/git-subtree.git

cd git-subtree
sudo rsync -a ./git-subtree.sh /usr/local/bin/git-subtree

Or, if you want the man pages and all:

make doc
make install

Usage

Split a larger repository into smaller chunks:

# Go into the project root
cd ~/my-project

# Create a branch which only contains commits for the children of 'foo'
git subtree split --prefix=foo --branch=foo-only

# Remove 'foo' from the project
git rm -rf ./foo

# Create a git repo for 'foo' (assuming we already created it on github)
mkdir foo
pushd foo
git init
git remote add origin git@github.com:my-user/new-project.git
git pull ../ foo-only
git push origin -u master
popd

# Add 'foo' as a git submodule to `my-project`
git submodule add git@github.com:my-user/new-project.git foo

For detailed documentation (man page), please read git-subtree.txt.
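
To see what the split step does before touching a real project, it can be tried in a throwaway repository. The sketch below is only an illustration: the directory names, file names, and commit messages are made up, and the split is skipped if git subtree is not installed.

```shell
set -eu
# Identity for the demo commits (illustrative values).
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

# Build a small repository with two directories and three commits.
work=$(mktemp -d)
git init -q "$work/my-project"
cd "$work/my-project"
mkdir foo bar
echo one > foo/a.txt
echo x > bar/b.txt
git add .
git commit -q -m "add foo and bar"
echo two >> foo/a.txt
git commit -q -am "change foo"
echo y >> bar/b.txt
git commit -q -am "change bar"

# Split out the history of 'foo' into its own branch, if subtree is available.
if git subtree split --prefix=foo --branch=foo-only >/dev/null 2>&1; then
    git log --format=%s foo-only          # commits that touched foo
    git ls-tree --name-only foo-only      # foo's contents, now at the root
else
    echo "git subtree is not available here; skipping the split"
fi
```

The foo-only branch should contain only the commits that touched foo, with foo's files at the top level, which is what later gets pulled into the new repository.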

Answered by Pat Notz

Check out git filter-branch.

The Examples section of the man page shows how to extract a subdirectory into its own project while keeping all of its history and discarding the history of other files/directories (just what you're looking for).

To rewrite the repository to look as if foodir/ had been its project root, and discard all other history:

   git filter-branch --subdirectory-filter foodir -- --all

Thus you can, e.g., turn a library subdirectory into a repository of its own.
Note the -- that separates filter-branch options from revision options, and the --all to rewrite all branches and tags.
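
A sandboxed run of that command might look like the sketch below. The repository layout, file names, and commit messages are invented for the demo, and FILTER_BRANCH_SQUELCH_WARNING is set only to silence the notice that newer git versions print before filter-branch starts.

```shell
set -eu
export FILTER_BRANCH_SQUELCH_WARNING=1
# Identity for the demo commits (illustrative values).
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

# Throwaway repository: two directories, three commits.
work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"
mkdir foodir other
echo one > foodir/a.txt
echo x > other/b.txt
git add .
git commit -q -m "add foodir and other"
echo two >> foodir/a.txt
git commit -q -am "change foodir"
echo y >> other/b.txt
git commit -q -am "change other"

# Rewrite history so that foodir/ becomes the project root.
git filter-branch --subdirectory-filter foodir -- --all >/dev/null 2>&1

git log --format=%s              # only commits that touched foodir remain
git ls-tree --name-only HEAD     # foodir's files now sit at the top level
```

Because the rewrite is destructive, trying it on a scratch repository like this first is a cheap way to confirm the result is what you expect.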

Answered by dbr

One way of doing this is the inverse - remove everything but the file you want to keep.

Basically, make a copy of the repository, then use git filter-branch to remove everything but the files/folders you want to keep.

For example, I have a project from which I wish to extract the file tvnamer.py to a new repository:

git filter-branch --tree-filter 'for f in *; do if [ "$f" != "tvnamer.py" ]; then rm -rf "$f"; fi; done' HEAD

That uses git filter-branch --tree-filter to go through each commit, run the command, and recommit the resulting directory contents. This is extremely destructive (so you should only do this on a copy of your repository!) and can take a while (about 1 minute on a repository with 300 commits and about 20 files).

The above command just runs the following shell-script on each revision, which you'd have to modify of course (to make it exclude your sub-directory instead of tvnamer.py):

for f in *; do
    if [ "$f" != "tvnamer.py" ]; then
        rm -rf "$f"
    fi
done
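
As a rough sketch of that modification, the loop below keeps a top-level directory instead of a single file. It runs on a scratch directory rather than inside filter-branch, and the names keepdir, "other dir", and README are placeholders, not anything from the original project.

```shell
set -eu
# Scratch tree standing in for one checked-out revision.
tree=$(mktemp -d)
mkdir "$tree/keepdir" "$tree/other dir"
touch "$tree/keepdir/app.rb" "$tree/other dir/notes.txt" "$tree/README"

cd "$tree"
# Delete every top-level entry except the directory we want to keep.
# Quoting "$f" keeps names with spaces intact; note that * skips dotfiles.
for f in *; do
    if [ "$f" != "keepdir" ]; then
        rm -rf "$f"
    fi
done
ls
```

Since the loop only looks at top-level entries, a nested directory such as sourcecode/RailsApp would need a different test than a plain name comparison.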

The biggest obvious problem is that it leaves all commit messages, even if they are unrelated to the remaining file. The script git-remove-empty-commits fixes this:

git filter-branch --commit-filter 'if [ z$1 = z`git rev-parse $3^{tree}` ]; then skip_commit "$@"; else git commit-tree "$@"; fi'

You need to use the -f (force) argument to run filter-branch again while anything remains in refs/original/ (which is basically a backup).

Of course this will never be perfect, for example if your commit messages mention other files, but it's about as close as git currently allows (as far as I'm aware, anyway).

Again, only ever run this on a copy of your repository! In summary, to remove all files but "thisismyfilename.txt":

git filter-branch --tree-filter 'for f in *; do if [ "$f" != "thisismyfilename.txt" ]; then rm -rf "$f"; fi; done' HEAD
git filter-branch -f --commit-filter 'if [ z$1 = z`git rev-parse $3^{tree}` ]; then skip_commit "$@"; else git commit-tree "$@"; fi'
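
The two passes can be rehearsed on a scratch repository, as sketched here. The file names and commit messages are made up. The commit filter in this sketch compares each commit's tree ($1) with the tree of its first parent ($3) and skips the commit when they match, which is my reading of what the empty-commit script is meant to do; treat it as an assumption rather than the script's exact text.

```shell
set -eu
export FILTER_BRANCH_SQUELCH_WARNING=1
# Identity for the demo commits (illustrative values).
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

# Scratch repository: one file to keep, one to drop, three commits.
work=$(mktemp -d)
git init -q "$work/copy"
cd "$work/copy"
echo a > keep.txt
echo b > drop.txt
git add .
git commit -q -m "add both files"
echo c >> drop.txt
git commit -q -am "change drop.txt only"
echo d >> keep.txt
git commit -q -am "change keep.txt"

# Pass 1: remove everything except keep.txt from every commit.
git filter-branch --tree-filter \
    'for f in *; do if [ "$f" != "keep.txt" ]; then rm -rf "$f"; fi; done' \
    HEAD >/dev/null 2>&1

# Pass 2: drop commits whose tree equals their parent's tree.
# -f is needed because pass 1 left a backup under refs/original/.
git filter-branch -f --commit-filter \
    'if [ "z$1" = "z$(git rev-parse "$3^{tree}")" ]; then skip_commit "$@"; else git commit-tree "$@"; fi' \
    HEAD >/dev/null 2>&1

git log --format=%s              # the drop.txt-only commit should be gone
git ls-tree --name-only HEAD
```

The root commit has no parent, so the comparison fails for it and it is always kept.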

Answered by ShawnFeatherly

CoolAJ86's and apenwarr's answers are very similar. I went back and forth between the two, trying to understand the bits that were missing from each. Below is a combination of them.

First, navigate Git Bash to the root of the git repo to be split. In my example here, that is ~/Documents/OriginalRepo (master).

# move the folder at prefix to a new branch
git subtree split --prefix=SubFolderName/FolderToBeNewRepo --branch=to-be-new-repo

# create a new repository out of the newly made branch
mkdir ~/Documents/NewRepo
pushd ~/Documents/NewRepo
git init
git pull ~/Documents/OriginalRepo to-be-new-repo

# upload the new repository to a place that should be referenced for submodules
git remote add origin git@github.com:myUsername/newRepo.git
git push -u origin master
popd

# replace the folder with a submodule
git rm -rf ./SubFolderName/FolderToBeNewRepo
git submodule add git@github.com:myUsername/newRepo.git SubFolderName/FolderToBeNewRepo
git branch --delete --force to-be-new-repo

Below is a copy of the above with the customizable names replaced, using https instead. The root folder is now ~/Documents/_Shawn/UnityProjects/SoProject (master).

# move the folder at prefix to a new branch
git subtree split --prefix=Assets/SoArchitecture --branch=so-package

# create a new repository out of the newly made branch
mkdir ~/Documents/_Shawn/UnityProjects/SoArchitecture
pushd ~/Documents/_Shawn/UnityProjects/SoArchitecture
git init
git pull ~/Documents/_Shawn/UnityProjects/SoProject so-package

# upload the new repository to a place that should be referenced for submodules
git remote add origin https://github.com/Feddas/SoArchitecture.git
git push -u origin master
popd

# replace the folder with a submodule
git rm -rf ./Assets/SoArchitecture
git submodule add https://github.com/Feddas/SoArchitecture.git Assets/SoArchitecture
git branch --delete --force so-package

Answered by Dietrich Epp

If you want to transfer some subset of files to a new repository but keep the history, you're basically going to end up with a completely new history. The way this would work is basically as follows:

  1. Create new repository.
  2. For each revision of your old repository, merge the changes to your module into the new repository. This will create a "copy" of your existing project history.

It should be somewhat straightforward to automate this if you don't mind writing a small but hairy script. Straightforward, yes, but also painful. People have rewritten history in Git before; you can search for that.

Alternatively: clone the repository, delete the paper in the clone, and delete the app in the original. This would take one minute, it's guaranteed to work, and you can get back to more important things than trying to purify your git history. And don't worry about the hard drive space taken up by redundant copies of history.
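
The clone-and-delete approach could be sketched as follows, using the directory names from the question. The scratch paths, file names, the clone's name RailsAppRepo, and the commit messages are all made up for the demo.

```shell
set -eu
# Identity for the demo commits (illustrative values).
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

# Scratch stand-in for the original "Project" repository.
work=$(mktemp -d)
git init -q "$work/Project"
cd "$work/Project"
mkdir -p paper sourcecode/RailsApp
echo draft > paper/paper.tex
echo app > sourcecode/RailsApp/app.rb
git add .
git commit -q -m "initial import"
echo more >> sourcecode/RailsApp/app.rb
git commit -q -am "work on the app"

# Clone it, then delete the paper in the clone...
git clone -q "$work/Project" "$work/RailsAppRepo"
cd "$work/RailsAppRepo"
git rm -r -q paper
git commit -q -m "remove paper from the app repository"

# ...and delete the app in the original.
cd "$work/Project"
git rm -r -q sourcecode/RailsApp
git commit -q -m "remove the app from the paper repository"

# Both repositories keep the full shared history plus one removal commit.
git -C "$work/RailsAppRepo" log --format=%s
git -C "$work/Project" log --format=%s
```

Each repository still carries the other half's old commits, which is the redundancy the answer says not to worry about.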
