Detach (move) subdirectory into separate Git repository
Note: this page reproduces a popular Stack Overflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same license and attribute it to the original authors (not me): Stack Overflow
Original source: http://stackoverflow.com/questions/359424/
Asked by matli
I have a Git repository which contains a number of subdirectories. Now I have found that one of the subdirectories is unrelated to the others and should be detached to a separate repository.
How can I do this while keeping the history of the files within the subdirectory?
I guess I could make a clone and remove the unwanted parts of each clone, but I suppose this would give me the complete tree when checking out an older revision etc. This might be acceptable, but I would prefer to be able to pretend that the two repositories don't have a shared history.
Just to make it clear, I have the following structure:
XYZ/
    .git/
    XY1/
    ABC/
    XY2/
But I would like this instead:
XYZ/
    .git/
    XY1/
    XY2/
ABC/
    .git/
    ABC/
Accepted answer by Paul
Update: This process is so common that the Git team made it much simpler with a new tool, git subtree. See here: Detach (move) subdirectory into separate Git repository
You want to clone your repository and then use git filter-branch to mark everything but the subdirectory you want in your new repo to be garbage-collected.
To clone your local repository:
git clone /XYZ /ABC
(Note: the repository will be cloned using hard-links, but that is not a problem since the hard-linked files will not be modified in themselves - new ones will be created.)
Now, let us preserve the interesting branches which we want to rewrite as well, and then remove the origin to avoid pushing there and to make sure that old commits will not be referenced by the origin:
cd /ABC
for i in branch1 br2 br3; do git branch -t $i origin/$i; done
git remote rm origin
or for all remote branches:
cd /ABC
for i in $(git branch -r | sed "s/.*origin\///"); do git branch -t $i origin/$i; done
git remote rm origin
Now you might want to also remove tags which have no relation with the subproject; you can also do that later, but you might need to prune your repo again. I did not do so and got a WARNING: Ref 'refs/tags/v0.1' is unchanged for all tags (since they were all unrelated to the subproject); additionally, after removing such tags more space will be reclaimed. Apparently git filter-branch should be able to rewrite other tags, but I could not verify this. If you want to remove all tags, use git tag -l | xargs git tag -d.
Then use filter-branch and reset to exclude the other files, so they can be pruned. Let's also add --tag-name-filter cat --prune-empty to remove empty commits and to rewrite tags (note that this will have to strip their signature):
git filter-branch --tag-name-filter cat --prune-empty --subdirectory-filter ABC -- --all
or alternatively, to only rewrite the HEAD branch and ignore tags and other branches:
git filter-branch --tag-name-filter cat --prune-empty --subdirectory-filter ABC HEAD
Then delete the backup reflogs so the space can be truly reclaimed (although now the operation is destructive):
git reset --hard
git for-each-ref --format="%(refname)" refs/original/ | xargs -n 1 git update-ref -d
git reflog expire --expire=now --all
git gc --aggressive --prune=now
and now you have a local git repository of the ABC sub-directory with all its history preserved.
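The procedure above can be rehearsed safely before touching a real repository. The following is only a minimal sketch: it builds a throwaway XYZ repository in a temporary directory (the file names, commit messages and demo identity are made up for illustration), then runs the clone and filter-branch steps and reports what survived.

```shell
#!/bin/sh
# Sandbox rehearsal of the clone + filter-branch steps; only a temp dir is touched.
set -eu
# Hypothetical identity so commits work without any global git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
export FILTER_BRANCH_SQUELCH_WARNING=1   # skip filter-branch's startup warning on newer Git

work=$(mktemp -d)
git init -q "$work/XYZ"
cd "$work/XYZ"
mkdir XY1 ABC XY2

# Four commits, two of which touch ABC.
echo one > XY1/a.txt;   git add .; git commit -q -m "add XY1"
echo two > ABC/b.txt;   git add .; git commit -q -m "add ABC"
echo three > XY2/c.txt; git add .; git commit -q -m "add XY2"
echo four >> ABC/b.txt; git add .; git commit -q -m "update ABC"

# Clone, detach from the origin, and keep only the history of ABC.
git clone -q --no-hardlinks "$work/XYZ" "$work/ABC"
cd "$work/ABC"
git remote rm origin
git filter-branch --prune-empty --subdirectory-filter ABC HEAD >/dev/null
git reset -q --hard

commits=$(git rev-list --count HEAD)
files=$(git ls-files)
echo "commits kept: $commits"
echo "files at repository root: $files"
```

In this sandbox only the two commits that touched ABC should remain, and b.txt ends up at the root of the new repository rather than under ABC/.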
Note: For most uses, git filter-branch should indeed have the added parameter -- --all. Yes, that's really -- (space) --all. This needs to be the last parameter for the command. As Matli discovered, this keeps the project branches and tags included in the new repo.
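To see what -- --all changes in practice, here is a small sandbox sketch. The branch name feature, the tag v0.1 and the file names are assumptions made up for the demonstration; the point is only that every branch and tag gets rewritten, not just the checked-out one.

```shell
#!/bin/sh
# Sandbox sketch: "-- --all" rewrites every branch and tag, not only HEAD.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
export FILTER_BRANCH_SQUELCH_WARNING=1

work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"
mkdir ABC other
echo a > ABC/a.txt
echo o > other/o.txt
git add .; git commit -q -m "initial"
base=$(git symbolic-ref --short HEAD)   # master or main, depending on configuration
git tag v0.1

# A second branch that also has work inside ABC.
git checkout -q -b feature
echo f > ABC/f.txt
git add .; git commit -q -m "feature work in ABC"
git checkout -q "$base"

git filter-branch --tag-name-filter cat --prune-empty \
    --subdirectory-filter ABC -- --all >/dev/null

# Both refs now have ABC's contents as their root tree.
feature_tree=$(git ls-tree -r --name-only feature | paste -sd' ' -)
tag_tree=$(git ls-tree -r --name-only v0.1 | paste -sd' ' -)
echo "feature: $feature_tree"
echo "v0.1:    $tag_tree"
```

With HEAD instead of -- --all, only the checked-out branch would be rewritten and the feature branch would still carry the full original tree.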
Edit: various suggestions from comments below were incorporated to make sure, for instance, that the repository is actually shrunk (which was not always the case before).
Answer by CoolAJ86
The Easy Way™
It turns out that this is such a common and useful practice that the overlords of Git made it really easy, but you have to have a newer version of Git (>= 1.7.11 May 2012). See the appendix for how to install the latest Git. Also, there's a real-world example in the walkthrough below.
Prepare the old repo
cd <big-repo>
git subtree split -P <name-of-folder> -b <name-of-new-branch>
Note: <name-of-folder> must NOT contain leading or trailing characters. For instance, the folder named subproject MUST be passed as subproject, NOT ./subproject/
Note for Windows users: When your folder depth is > 1, <name-of-folder> must have *nix style folder separator (/). For instance, the folder named path1\path2\subproject MUST be passed as path1/path2/subproject
Create the new repo
mkdir ~/<new-repo> && cd ~/<new-repo>
git init
git pull </path/to/big-repo> <name-of-new-branch>
Link the new repo to GitHub or wherever
git remote add origin <git@github.com:user/new-repo.git>
git push -u origin master
Cleanup inside <big-repo>, if desired
git rm -rf <name-of-folder>
Note: This leaves all the historical references in the repository. See the Appendix below if you're actually concerned about having committed a password or you need to decrease the file size of your .git folder.
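The split and pull steps can be tried end to end in a scratch directory first. This is a sketch under stated assumptions: the repository layout, file names and the branch name btoa-only are invented for the demo, and the script checks whether git-subtree is installed (it is a separate package on some systems) instead of assuming it.

```shell
#!/bin/sh
# Sandbox sketch of "git subtree split" followed by a pull into a fresh repo.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
git init -q "$work/big-repo"
cd "$work/big-repo"
mkdir atob btoa
echo 1 > atob/index.js;  git add .; git commit -q -m "add atob"
echo 1 > btoa/index.js;  git add .; git commit -q -m "add btoa"
echo 2 >> btoa/index.js; git add .; git commit -q -m "improve btoa"

if [ -x "$(git --exec-path)/git-subtree" ]; then
    # Create a branch holding only the history of btoa/ (no ./ or trailing slash).
    git subtree split -P btoa -b btoa-only >/dev/null 2>&1

    # Pull that branch into a brand-new repository.
    git init -q "$work/btoa"
    cd "$work/btoa"
    git pull -q "$work/big-repo" btoa-only

    split_commits=$(git rev-list --count HEAD)
    split_files=$(git ls-files)
    echo "commits in new repo: $split_commits"
    echo "files in new repo: $split_files"
else
    echo "git subtree is not installed here; see the Appendix"
    split_commits=skipped
fi
```

The big repository is left untouched by the split; removing the folder from it remains a separate, optional step.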
...
...
Walkthrough
These are the same steps as above, but following my exact steps for my repository instead of using <meta-named-things>.
Here's a project I have for implementing JavaScript browser modules in node:
tree ~/node-browser-compat
node-browser-compat
├── ArrayBuffer
├── Audio
├── Blob
├── FormData
├── atob
├── btoa
├── location
└── navigator
I want to split out a single folder, btoa, into a separate Git repository
cd ~/node-browser-compat/
git subtree split -P btoa -b btoa-only
I now have a new branch, btoa-only, that only has commits for btoa and I want to create a new repository.
mkdir ~/btoa/ && cd ~/btoa/
git init
git pull ~/node-browser-compat btoa-only
Next I create a new repo on GitHub, Bitbucket, or wherever, and add it as the origin
git remote add origin git@github.com:node-browser-compat/btoa.git
git push -u origin master
Happy day!
Note: If you created a repo with a README.md, .gitignore and LICENSE, you will need to pull first:
git pull origin master
git push origin master
Lastly, I'll want to remove the folder from the bigger repo
git rm -rf btoa
...
...
Appendix
Latest Git on macOS
To get the latest version of Git using Homebrew:
brew install git
Latest Git on Ubuntu
sudo apt-get update
sudo apt-get install git
git --version
If that doesn't work (you have a very old version of Ubuntu), try
sudo add-apt-repository ppa:git-core/ppa
sudo apt-get update
sudo apt-get install git
If that still doesn't work, try
sudo chmod +x /usr/share/doc/git/contrib/subtree/git-subtree.sh
sudo ln -s \
/usr/share/doc/git/contrib/subtree/git-subtree.sh \
/usr/lib/git-core/git-subtree
Thanks to rui.araujo from the comments.
Clearing your history
By default, removing files from Git doesn't actually remove them; it just commits that they aren't there anymore. If you want to actually remove the historical references (i.e. you have committed a password), you need to do this:
git filter-branch --prune-empty --tree-filter 'rm -rf <name-of-folder>' HEAD
After that you can check that your file or folder no longer shows up in the Git history at all
git log -- <name-of-folder> # should show nothing
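As a rough illustration of the two commands above, this sandbox sketch commits a folder containing a fake secret, rewrites history with the same --tree-filter invocation, and then checks that git log no longer reports anything for that folder. The folder name secrets and the file contents are hypothetical.

```shell
#!/bin/sh
# Sandbox sketch: remove a folder from every commit, then verify it is gone from the log.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
export FILTER_BRANCH_SQUELCH_WARNING=1

work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"
mkdir secrets src
echo hunter2 > secrets/password.txt
echo code > src/main.txt
git add .; git commit -q -m "initial, with a secret"
echo more >> src/main.txt
git add .; git commit -q -m "more code"

# Rewrite every commit reachable from HEAD without the secrets folder.
git filter-branch --prune-empty --tree-filter 'rm -rf secrets' HEAD >/dev/null

remaining=$(git log --oneline -- secrets | wc -l | tr -d ' ')
total=$(git rev-list --count HEAD)
echo "commits still touching secrets/: $remaining"
echo "commits on the rewritten branch: $total"
```

Note that the old commits are still stored locally under refs/original/ and in the reflog at this point, which is why the extra pruning steps described later exist.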
However, you can't "push" deletes to GitHub and the like. If you try, you'll get an error and you'll have to git pull before you can git push - and then you're back to having everything in your history.
So if you want to delete history from the "origin" - meaning to delete it from GitHub, Bitbucket, etc - you'll need to delete the repo and re-push a pruned copy of the repo. But wait - there's more! - If you're really concerned about getting rid of a password or something like that you'll need to prune the backup (see below).
Making .git smaller
The aforementioned delete history command still leaves behind a bunch of backup files - because Git is all too kind in helping you to not ruin your repo by accident. It will eventually delete orphaned files over the days and months, but it leaves them there for a while in case you realize that you accidentally deleted something you didn't want to.
So if you really want to empty the trash to reduce the clone size of a repo immediately, you have to do all of this really weird stuff:
rm -rf .git/refs/original/ && \
git reflog expire --all && \
git gc --aggressive --prune=now
git reflog expire --all --expire-unreachable=0
git repack -A -d
git prune
That said, I'd recommend not performing these steps unless you know that you need to - just in case you did prune the wrong subdirectory, y'know? The backup files shouldn't get cloned when you push the repo, they'll just be in your local copy.
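To judge whether the cleanup is worth it, the object store can be measured before and after. The sketch below is only an illustration in a scratch repository: the helper repo_kib is a made-up name, the 1 MiB random file stands in for "something big that was removed", and the backup refs are deleted with the for-each-ref and update-ref pipeline from Paul's answer rather than by removing files under .git directly.

```shell
#!/bin/sh
# Sandbox sketch: measure the object store before and after pruning the backups.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
export FILTER_BRANCH_SQUELCH_WARNING=1

work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"
mkdir big keep
head -c 1048576 /dev/urandom > big/blob.bin   # 1 MiB of incompressible data
echo hello > keep/readme.txt
git add .; git commit -q -m "initial"

git filter-branch --prune-empty --tree-filter 'rm -rf big' HEAD >/dev/null

# Hypothetical helper: loose + packed object size in KiB, from git count-objects.
repo_kib() {
    git count-objects -v | awk '/^size(-pack)?:/ { s += $2 } END { print s }'
}

before=$(repo_kib)

# Drop the backup refs and reflog entries, then prune unreachable objects.
git for-each-ref --format="%(refname)" refs/original/ | xargs -n 1 git update-ref -d
git reflog expire --expire=now --all
git gc -q --aggressive --prune=now

after=$(repo_kib)
echo "object store before cleanup: ${before} KiB"
echo "object store after cleanup:  ${after} KiB"
```

Running git count-objects -vH by hand gives the same figures in human-readable units.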
Credit
Answer by pgs
Paul's answer creates a new repository containing /ABC, but does not remove /ABC from within /XYZ. The following command will remove /ABC from within /XYZ:
git filter-branch --tree-filter "rm -rf ABC" --prune-empty HEAD
Of course, test it in a 'clone --no-hardlinks' repository first, and follow it with the reset, gc and prune commands Paul lists.
Answer by Josh Lee
I've found that in order to properly delete the old history from the new repository, you have to do a little more work after the filter-branch step.
Do the clone and the filter:
git clone --no-hardlinks foo bar; cd bar
git filter-branch --subdirectory-filter subdir/you/want
Remove every reference to the old history. “origin” was keeping track of your clone, and “original” is where filter-branch saves the old stuff:
git remote rm origin
git update-ref -d refs/original/refs/heads/master
git reflog expire --expire=now --all
Even now, your history might be stuck in a packfile that fsck won't touch. Tear it to shreds, creating a new packfile and deleting the unused objects:
git repack -ad
There is an explanation of this in the manual for filter-branch.
Answer by Simon A. Eugster
Edit: Bash script added.
The answers given here worked only partially for me; lots of big files remained in the cache. What finally worked (after hours in #git on freenode):
git clone --no-hardlinks file:///SOURCE /tmp/blubb
cd /tmp/blubb
git filter-branch --subdirectory-filter ./PATH_TO_EXTRACT --prune-empty --tag-name-filter cat -- --all
git clone file:///tmp/blubb/ /tmp/blooh
cd /tmp/blooh
git reflog expire --expire=now --all
git repack -ad
git gc --prune=now
With the previous solutions, the repository size was around 100 MB. This one brought it down to 1.7 MB. Maybe it helps somebody :)
The following bash script automates the task:
#!/bin/bash
# Extract one subdirectory of a repository into a new, compact repository under /tmp.
if (( $# < 3 ))
then
    echo "Usage: $0 </path/to/repo/> <directory/to/extract/> <newName>"
    echo
    echo "Example: $0 /Projects/42.git first/answer/ firstAnswer"
    exit 1
fi

clone=/tmp/${3}Clone
newN=/tmp/${3}

# Work on a full copy so the source repository is never modified.
git clone --no-hardlinks "file://$1" "${clone}"
cd "${clone}" || exit 1

# Keep only the history of the requested subdirectory, on all branches and tags.
git filter-branch --subdirectory-filter "$2" --prune-empty --tag-name-filter cat -- --all

# A second clone leaves the old objects and backup refs behind.
git clone "file://${clone}" "${newN}"
cd "${newN}" || exit 1

git reflog expire --expire=now --all
git repack -ad
git gc --prune=now
Answer by jeremyjjbrown
This is no longer so complex: you can just use the git filter-branch command on a clone of your repo to cull the subdirectories you don't want and then push to the new remote.
git filter-branch --prune-empty --subdirectory-filter <YOUR_SUBDIR_TO_KEEP> master
git push <MY_NEW_REMOTE_URL> -f .
Answer by D W
Update: The git-subtree module was so useful that the Git team pulled it into core and made it git subtree. See here: Detach (move) subdirectory into separate Git repository
git-subtree may be useful for this
http://github.com/apenwarr/git-subtree/blob/master/git-subtree.txt (deprecated)
http://psionides.jogger.pl/2010/02/04/sharing-code-between-projects-with-git-subtree/
Answer by Anthony O.
Here is a small modification to CoolAJ86's "The Easy Way™" answer in order to split multiple sub folders (let's say sub1 and sub2) into a new git repository.
The Easy Way™ (multiple sub folders)
Prepare the old repo
pushd <big-repo>
git filter-branch --tree-filter "mkdir <name-of-folder>; mv <sub1> <sub2> <name-of-folder>/" HEAD
git subtree split -P <name-of-folder> -b <name-of-new-branch>
popd
Note: <name-of-folder> must NOT contain leading or trailing characters. For instance, the folder named subproject MUST be passed as subproject, NOT ./subproject/
Note for Windows users: when your folder depth is > 1, <name-of-folder> must have *nix style folder separator (/). For instance, the folder named path1\path2\subproject MUST be passed as path1/path2/subproject. Moreover, don't use the mv command but move.
Final note: the one big difference from the base answer is the second line of the script, "git filter-branch..."
Create the new repo
mkdir <new-repo>
pushd <new-repo>
git init
git pull </path/to/big-repo> <name-of-new-branch>
Link the new repo to GitHub or wherever
git remote add origin <git@github.com:my-user/new-repo.git>
git push origin -u master
Cleanup, if desired
popd # get out of <new-repo>
pushd <big-repo>
git rm -rf <name-of-folder>
Note: This leaves all the historical references in the repository. See the Appendix in the original answer if you're actually concerned about having committed a password or you need to decrease the file size of your .git folder.
Answer by MM.
The original question wants XYZ/ABC/(*files) to become ABC/ABC/(*files). After implementing the accepted answer for my own code, I noticed that it actually changes XYZ/ABC/(*files) into ABC/(*files). The filter-branch man page even says,
"The result will contain that directory (and only that) as its project root."
In other words, it promotes the top-level folder "up" one level. That's an important distinction because, for example, in my history I had renamed a top-level folder. By promoting folders "up" one level, git loses continuity at the commit where I did the rename.
My answer to the question then is to make 2 copies of the repository and, in each copy, manually delete the folder(s) you don't want to keep. The man page backs me up with this:
[...] avoid using [this command] if a simple single commit would suffice to fix your problem
Answer by Case Larsen
To add to Paul's answer, I found that to ultimately recover space, I have to push HEAD to a clean repository and that trims down the size of the .git/objects/pack directory.
i.e.
$ mkdir ...ABC.git
$ cd ...ABC.git
$ git init --bare
After the gc prune, also do:
$ git push ...ABC.git HEAD
Then you can do
$ git clone ...ABC.git
and the size of ABC/.git is reduced
Actually, some of the time-consuming steps (e.g. git gc) aren't needed with the push to a clean repository, i.e.:
$ git clone --no-hardlinks /XYZ /ABC
$ git filter-branch --subdirectory-filter ABC HEAD
$ git reset --hard
$ git push ...ABC.git HEAD
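The effect of pushing to a clean bare repository can be checked in a scratch directory. This is a sketch with made-up paths under a temporary directory and a 1 MiB random file standing in for the unrelated history; it compares the on-disk size of the filtered clone's .git directory with that of the bare repository that received only HEAD.

```shell
#!/bin/sh
# Sandbox sketch: a push to a fresh bare repo transfers only the rewritten history.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
export FILTER_BRANCH_SQUELCH_WARNING=1

work=$(mktemp -d)
git init -q "$work/XYZ"
cd "$work/XYZ"
mkdir ABC XY1
head -c 1048576 /dev/urandom > XY1/large.bin   # unrelated 1 MiB payload
echo hi > ABC/file.txt
git add .; git commit -q -m "initial"

# Filter a full copy down to ABC, without running gc afterwards.
git clone -q --no-hardlinks "$work/XYZ" "$work/ABC"
cd "$work/ABC"
git filter-branch --subdirectory-filter ABC HEAD >/dev/null
git reset -q --hard

# Push only HEAD to a clean bare repository.
git init -q --bare "$work/ABC.git"
git push -q "$work/ABC.git" HEAD

dirty_kib=$(du -sk "$work/ABC/.git" | cut -f1)
clean_kib=$(du -sk "$work/ABC.git" | cut -f1)
echo "filtered clone .git: ${dirty_kib} KiB"
echo "clean bare repo:     ${clean_kib} KiB"
```

The filtered clone still holds the old objects because nothing was pruned there, while the bare repository only ever received what HEAD needs.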