Original question: http://stackoverflow.com/questions/12865332/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on Stack Overflow.
Remove old commit information from a git repository to save space
Asked by greggles
I have a repository for storing some large binary files (tifs, jpgs, pdfs) that is growing pretty large. A fair number of files are also created, removed, and renamed, and I don't care about the individual commit history. This question is somewhat simplified because I'm dealing with a repository that has no branches and no tags.
I'm curious if there's an easy way to remove some of the history from the system to save space.
I found an old thread on the git mailing list, but it doesn't really specify how to use this (i.e. what $drop is):
git filter-branch --parent-filter "sed -e 's/-p $drop//'" \
--tag-name-filter cat -- \
--all ^$drop
Accepted answer by Tilman Vogel
I think you can shrink your history by following this answer:
How to delete a specific revision of a github gist?
Decide which points in history you want to keep.
pick <hash1> <commit message>
pick <hash2> <commit message>
pick <hash3> <commit message> <- keep
pick <hash4> <commit message>
pick <hash5> <commit message>
pick <hash6> <commit message> <- keep
pick <hash7> <commit message>
pick <hash8> <commit message>
pick <hash9> <commit message>
pick <hash10> <commit message> <- keep
Then, leave the first commit of each group (the very first line, and the first line after each "keep") as "pick" and mark the others as "squash".
pick <hash1> <commit message>
squash <hash2> <commit message>
squash <hash3> <commit message> <- keep
pick <hash4> <commit message>
squash <hash5> <commit message>
squash <hash6> <commit message> <- keep
pick <hash7> <commit message>
squash <hash8> <commit message>
squash <hash9> <commit message>
squash <hash10> <commit message> <- keep
Then, run the rebase by saving and quitting the editor. At each "keep" point, the message editor will pop up for a combined commit message ranging from the previous "pick" up to the "keep" commit. You can then either just keep the last message or in fact combine those to document the original history without keeping all intermediate states.
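The squash procedure above can be tried on a throwaway repository. This is only a sketch under stated assumptions: the repository, the file data.txt, and the commit messages are made up, and a small awk script stands in for editing the todo list by hand (lines 1 and 4 stay "pick", the rest become "squash"). GIT_SEQUENCE_EDITOR and GIT_EDITOR are used so that nothing opens interactively; in real use you would edit the list and the combined messages yourself.

```shell
# Throwaway repository with six commits (all names here are hypothetical).
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.name "Example"
git config user.email "example@example.com"

for i in 1 2 3 4 5 6; do
  echo "version $i" > data.txt
  git add data.txt
  git commit -q -m "commit $i"
done

# Stand-in for hand-editing the todo list: every third "pick" line
# (1 and 4) stays "pick", the other "pick" lines become "squash".
editor=$(mktemp)
cat > "$editor" <<'EOF'
awk '/^pick / && NR % 3 != 1 { sub(/^pick/, "squash") } { print }' "$1" > "$1.tmp" &&
  mv "$1.tmp" "$1"
EOF

# GIT_EDITOR=true accepts each combined commit message unchanged.
GIT_SEQUENCE_EDITOR="sh $editor" GIT_EDITOR=true git rebase -q -i --root

commits=$(git rev-list --count HEAD)
echo "commits after rebase: $commits"
```

The working tree ends up identical to the last original commit; only the intermediate states are gone from the branch.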
After that rebase, the intermediate file data will still be in the repository but is now unreferenced. git gc will then get rid of that data.
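One caveat worth checking on your own setup: by default the reflog still points at the old commits, and git gc keeps recently unreferenced objects for a grace period (two weeks by default), so the space may not be freed right away. The sketch below, on a made-up throwaway repository, expires the reflog and prunes immediately. The file names and the scratch branch are assumptions for the demo, not part of the original answer.

```shell
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.name "Example"
git config user.email "example@example.com"

echo "keep" > keep.txt
git add keep.txt
git commit -q -m "base"
main=$(git symbolic-ref --short HEAD)

# Commit a large file on a scratch branch, then delete the branch so the
# commit is no longer reachable from any branch.
git checkout -q -b scratch
head -c 100000 /dev/urandom > big.bin
git add big.bin
git commit -q -m "add big.bin"
blob=$(git rev-parse HEAD:big.bin)
git checkout -q "$main"
git branch -q -D scratch

# A plain gc keeps the blob: the HEAD reflog still references the commit.
git gc -q
if git cat-file -e "$blob" 2>/dev/null; then before=present; else before=removed; fi

# Expire the reflog, then prune without the grace period.
git reflog expire --expire=now --all
git gc -q --prune=now
if git cat-file -e "$blob" 2>/dev/null; then after=present; else after=removed; fi

echo "before: $before, after: $after"
```

Expiring the reflog throws away the safety net for recovering old commits, so it is worth doing only once you are sure the rewritten history is what you want.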
Answer by ezod
You could always just delete .git and do a fresh git init with one initial commit. This will, of course, remove all commit history.
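As a rough sketch of that approach, run on a throwaway repository (the file name and commit messages are made up for the demo):

```shell
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.name "Example"
git config user.email "example@example.com"

for i in 1 2 3; do
  echo "version $i" > data.txt
  git add data.txt
  git commit -q -m "commit $i"
done

# Discard all history but keep the working tree as it is now.
rm -rf .git
git init -q
git config user.name "Example"              # local config lived in .git, so set it again
git config user.email "example@example.com"
git add -A
git commit -q -m "Initial commit"

commits=$(git rev-list --count HEAD)
echo "commits after re-init: $commits"
```

Anything stored only inside .git (local config, hooks, remotes, stashes) is lost along with the history, so copy what you still need before deleting the directory.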
Answer by Iver
$drop is a variable (the one you are looking for).
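To make this concrete, here is a hedged sketch of how the command from the question might be run. It assumes $drop holds the full hash of the newest commit you want to discard, so that its children become the new starting points of the history. The throwaway repository and the choice of HEAD~2 are made up for the demo. Because the filter is in double quotes, the shell substitutes $drop before sed ever runs.

```shell
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.name "Example"
git config user.email "example@example.com"

for i in 1 2 3 4 5; do
  echo "version $i" > data.txt
  git add data.txt
  git commit -q -m "commit $i"
done

# Assumption: drop "commit 3" and everything before it. The parent filter
# is given full hashes, so use the full hash here, not an abbreviation.
drop=$(git rev-parse HEAD~2)

# FILTER_BRANCH_SQUELCH_WARNING skips the warning pause in newer git versions.
FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch \
  --parent-filter "sed -e 's/-p $drop//'" \
  --tag-name-filter cat -- \
  --all ^$drop > /dev/null

commits=$(git rev-list --count HEAD)
echo "commits left: $commits"
```

The files themselves are untouched; only the commits at and before $drop disappear from the branch. filter-branch keeps backups under refs/original/, so no space is freed until those refs and the reflog entries are removed and git gc has run.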
If you want to clean up unnecessary files and optimize the local repository, take a look at the git gc command.
And git prune is another option because it removes objects that are no longer pointed to by any object in any reachable branch.
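To see whether a cleanup actually changes anything, git count-objects -v reports the number of loose objects and the size of the packs. A small sketch on a made-up repository (file names are assumptions for the demo):

```shell
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.name "Example"
git config user.email "example@example.com"

for i in 1 2 3; do
  head -c 20000 /dev/urandom > "file$i.bin"
  git add "file$i.bin"
  git commit -q -m "add file$i.bin"
done

echo "before gc:"
git count-objects -v

git gc -q

echo "after gc:"
git count-objects -v

# Pull two fields out of the report for a quick check.
loose=$(git count-objects -v | awk '$1 == "count:" { print $2 }')
packs=$(git count-objects -v | awk '$1 == "packs:" { print $2 }')
echo "loose objects: $loose, packs: $packs"
```

Here git gc only repacks objects that are still referenced; it frees real space only once the unwanted objects are no longer reachable from any ref or reflog entry.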
I hope this could help you.
Answer by kaezarrex
If you want to find and remove large files from your Git history, Pro Git has a section called Removing Objects, which guides you through this process. It's a bit complicated, but it would allow you to remove files from your history that you have deleted anyway, while keeping the rest of your history intact.
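For the "find" half, one possible approach is to list every blob in the history with its size. Note that this is not the method from the Pro Git section, which inspects the pack index. It is an alternative sketch using git rev-list and git cat-file, and the repository and file names are made up for the demo.

```shell
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.name "Example"
git config user.email "example@example.com"

echo "small" > small.txt
head -c 50000 /dev/urandom > big.bin
git add small.txt big.bin
git commit -q -m "add files"
git rm -q big.bin
git commit -q -m "remove big.bin"

# Every blob in all history, largest first: type, hash, size in bytes, path.
list_blobs() {
  git rev-list --objects --all |
    git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)' |
    awk '$1 == "blob"' |
    sort -k3,3nr
}

list_blobs | head -n 5
largest=$(list_blobs | head -n 1 | awk '{ print $4 }')
echo "largest blob in history: $largest"
```

big.bin still shows up even though it was deleted in the latest commit, which is exactly the kind of file the Removing Objects procedure targets.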
Answer by nachoparker
It is a bit complicated to have git forget about a file.
git rm will only remove the file on this branch from now on, but it remains in history and git will remember it.
The right way to do it is with git filter-branch, as others have mentioned here. It will rewrite every commit in the history of the branch to delete that file.
But, even after doing that, git can remember it because there can be references to it in reflog, remotes, tags and such.
I wrote a little utility called git forget-blob.
It is easy, just do git forget-blob file1.txt.
This will remove every reference, do git filter-branch, and finally run the git garbage collector git gc to completely get rid of this file in your repo.
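For readers who want to see roughly what those steps look like by hand, here is a sketch using plain git commands. It is not the source of git forget-blob and may differ from what the utility actually does. It only mirrors the three steps listed above, on a made-up throwaway repository that has no remotes or tags to worry about.

```shell
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.name "Example"
git config user.email "example@example.com"

echo "keep" > keep.txt
head -c 50000 /dev/urandom > file1.txt
git add keep.txt file1.txt
git commit -q -m "add files"
echo "more" >> keep.txt
git commit -q -am "edit keep.txt"
blob=$(git rev-parse HEAD:file1.txt)

# 1. Rewrite every commit so that file1.txt never appears in it.
FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch \
  --index-filter 'git rm -q --cached --ignore-unmatch file1.txt' \
  -- --all > /dev/null

# 2. Drop the backup refs that filter-branch leaves under refs/original/.
git for-each-ref --format='%(refname)' refs/original/ |
  while read -r ref; do git update-ref -d "$ref"; done

# 3. Expire the reflog and collect the now-unreferenced objects.
git reflog expire --expire=now --all
git gc -q --prune=now

if git cat-file -e "$blob" 2>/dev/null; then result=present; else result=removed; fi
echo "file1.txt blob: $result"
```

In a real repository, remote-tracking branches, tags, and stashes can also keep the blob alive, which is the kind of leftover reference the answer warns about.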