Original question: http://stackoverflow.com/questions/11050265/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Remove large .pack file created by git
Asked by user1116573
I checked a load of files in to a branch and merged and then had to remove them and now I'm left with a large .pack file that I don't know how to get rid of.
I deleted all the files using git rm -rf xxxxxx and I also ran it with the --cached option.
Can someone tell me how I can remove a large .pack file that is currently in the following directory:
.git/objects/pack/pack-xxxxxxxxxxxxxxxxx.pack
Do I just need to remove the branch that I still have but am no longer using? Or is there something else I need to run?
I'm not sure how much difference it makes but it shows a padlock against the file.
Thanks
EDIT
Here are some excerpts from my bash_history that should give an idea how I managed to get into this state (assume at this point I'm working on a git branch called 'my-branch' and I've got a folder containing more folders/files):
git add .
git commit -m "Adding my branch changes to master"
git checkout master
git merge my-branch
git rm -rf unwanted_folder/
rm -rf unwanted_folder/  # not sure why I ran this as well, but I did
I thought I also ran the following but it doesn't appear in the bash_history with the others :
git rm -rf --cached unwanted_folder/
I also thought I ran some git commands (like git gc) to try to tidy up the pack file, but they don't appear in the .bash_history file either.
Answered by loganfsmyth
The issue is that, even though you removed the files, they are still present in previous revisions. That's the whole point of git: even if you delete something, you can still get it back by accessing the history.
What you are looking to do is called rewriting history, and it involves the git filter-branch command.
GitHub has a good explanation of the issue on their site. https://help.github.com/articles/remove-sensitive-data
To answer your question more directly, what you basically need to run is this command, with unwanted_filename_or_folder replaced accordingly:
git filter-branch --index-filter 'git rm -r --cached --ignore-unmatch unwanted_filename_or_folder' --prune-empty
This will remove all references to the files from the active history of the repo.
The next step is to perform a GC cycle to force all references to the file to be expired and purged from the packfile. Nothing needs to be replaced in these commands.
git for-each-ref --format='delete %(refname)' refs/original | git update-ref --stdin
# or, for older git versions (e.g. 1.8.3.1) which don't support --stdin
# git update-ref $(git for-each-ref --format='delete %(refname)' refs/original)
git reflog expire --expire=now --all
git gc --aggressive --prune=now
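Before and after rewriting history, it can help to see which blobs are actually taking up the space. The following is a minimal sketch, not part of the original answer: the function name largest_blobs is made up for illustration, and it assumes it is run from inside the repository. It only reads from the object database and changes nothing.

```shell
#!/bin/sh
# Hypothetical helper: print the N largest blobs reachable from any ref,
# smallest first, as "blob <object id> <size in bytes> <path>".
largest_blobs() {
  # $1: number of entries to show (defaults to 10)
  git rev-list --objects --all |
    git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)' |
    awk '$1 == "blob"' |
    sort -k3,3n |
    tail -n "${1:-10}"
}
```

Calling largest_blobs 5 inside the repository lists the five biggest files in its history, and git count-objects -vH reports the total packed size, so the two together give a before/after comparison for the cleanup.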
Answered by onlynone
Scenario A: If your large files were only added to a branch, you don't need to run git filter-branch. You just need to delete the branch and run garbage collection:
git branch -D mybranch
git reflog expire --expire-unreachable=all --all
git gc --prune=all
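To see scenario A end to end, here is a small sketch that plays it out in a throwaway repository. Everything in it (the temporary directory, the file big.bin, the demo identity) is invented for illustration; only the three commands above come from the answer.

```shell
#!/bin/sh
# Sketch of scenario A in a throwaway repository (all names are made up).
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
# Pass an identity per command so the sketch does not rely on global config.
commit() { git -c user.name=demo -c user.email=demo@example.com commit -q "$@"; }

echo "hello" > README
git add README
commit -m "initial commit"
main=$(git symbolic-ref --short HEAD)   # master or main, depending on config

# Add a large file on a side branch only.
git checkout -q -b mybranch
head -c 1000000 /dev/urandom > big.bin
git add big.bin
commit -m "add big file"
big=$(git rev-parse HEAD:big.bin)       # blob id of the large file

# Back on the main branch: delete the side branch, expire reflogs, collect.
git checkout -q "$main"
git branch -q -D mybranch
git reflog expire --expire-unreachable=all --all
git gc -q --prune=all

# Check whether the blob still exists in the object database.
if git cat-file -e "$big" 2>/dev/null; then
  result="still present"
else
  result="removed"
fi
echo "big.bin blob: $result"
```

The reflog step matters here: without it, the HEAD reflog would still point at the deleted branch's commit and git gc would keep the blob.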
Scenario B: However, based on your bash history, it looks like you did merge the changes into master. If you haven't shared the changes with anyone (no git push yet), the easiest thing would be to reset master back to before the merge with the branch that had the big files. This will eliminate all commits from your branch and all commits made to master after the merge. So you might lose changes, in addition to the big files, that you may have actually wanted:
git checkout master
git log # Find the commit hash just before the merge
git reset --hard <commit hash>
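One way to find "the commit just before the merge" without reading the log by eye is to take the first parent of the merge commit. The sketch below tries this in a throwaway repository; the names are hypothetical, and it assumes the merge created a real merge commit (hence --no-ff). If the merge was a fast-forward there is no merge commit to find, and git reflog show master is a better place to look for the earlier position of the branch.

```shell
#!/bin/sh
# Sketch of scenario B in a throwaway repository (names are hypothetical).
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
g() { git -c user.name=demo -c user.email=demo@example.com "$@"; }

echo "hello" > README
git add README
g commit -q -m "initial commit"
main=$(git symbolic-ref --short HEAD)
before=$(git rev-parse HEAD)            # main branch position before the merge

# Commit a large file on a side branch, then merge it with a merge commit.
git checkout -q -b my-branch
head -c 100000 /dev/urandom > big.bin
git add big.bin
g commit -q -m "add big file"
git checkout -q "$main"
g merge -q --no-ff -m "merge my-branch" my-branch

# Newest merge commit on this branch, and its first parent.
merge=$(git rev-list --merges -n 1 HEAD)
target=$(git rev-parse "$merge^1")

# Move the branch back to where it was before the merge.
git reset -q --hard "$target"
```

After the reset, the working tree no longer contains big.bin, but the objects stay in the repository until the branch is deleted and the reflog and garbage-collection steps are run.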
Then run the steps from scenario A.
Scenario C: If there were other changes from the branch or changes on master after the merge that you want to keep, it would be best to rebase master and selectively include the commits that you want:
git checkout master
git log # Find the commit hash just before the merge
git rebase -i <commit hash>
In your editor, remove the lines that correspond to the commits that added the large files, but leave everything else as is. Save and quit. Your master branch should then contain only what you want, and no large files. Note that git rebase without -p will eliminate merge commits, so you'll be left with a linear history for master after <commit hash>. This is probably okay for you, but if not, you could try with -p, although git help rebase says "combining -p with the -i option explicitly is generally not a good idea unless you know what you are doing".
Then run the commands from scenario A.
Answered by Timo
As loganfsmyth already stated in his answer, you need to purge git history because the files continue to exist there even after deleting them from the repo. Official GitHub docs recommend BFG, which I find easier to use than filter-branch:
Deleting files from history
Download BFG from their website. Make sure you have Java installed, then create a mirror clone and purge history. Make sure to replace YOUR_FILE_NAME with the name of the file you'd like to delete:
git clone --mirror git://example.com/some-big-repo.git
java -jar bfg.jar --delete-files YOUR_FILE_NAME some-big-repo.git
cd some-big-repo.git
git reflog expire --expire=now --all && git gc --prune=now --aggressive
git push
Delete a folder
Same as above, but use --delete-folders:
java -jar bfg.jar --delete-folders YOUR_FOLDER_NAME some-big-repo.git
Other options
BFG also allows for even fancier options (see docs) like these:
Remove all files bigger than 100M from history:
java -jar bfg.jar --strip-blobs-bigger-than 100M some-big-repo.git
Important!
When running BFG, be careful that both YOUR_FILE_NAME and YOUR_FOLDER_NAME are indeed just file/folder names. They're not paths, so something like foo/bar.jpg will not work! Instead, all files/folders with the specified name will be removed from repo history, no matter which path or branch they existed in.
Answered by Michael Durrant
One option:
Run git gc manually to condense a number of pack files into one or a few pack files. This operation is persistent (i.e. the large pack file will retain its compression behavior), so it may be beneficial to compress a repository periodically with git gc --aggressive.
Another option is to save the code and .git somewhere, then delete the .git and start again using this existing code, creating a new git repository (git init).
Answered by Benjamin Wasula
Run the following command, replacing PATH-TO-YOUR-FILE-WITH-SENSITIVE-DATA with the path to the file you want to remove, not just its filename. These arguments will:
- Force Git to process, but not check out, the entire history of every branch and tag
- Remove the specified file, as well as any empty commits generated as a result
- Overwrite your existing tags
git filter-branch --force --index-filter "git rm --cached --ignore-unmatch PATH-TO-YOUR-FILE-WITH-SENSITIVE-DATA" --prune-empty --tag-name-filter cat -- --all
This will forcefully remove all references to the files from the active history of the repo.
The next step is to perform a GC cycle to force all references to the file to be expired and purged from the pack file. Nothing needs to be replaced in these commands.
git update-ref -d refs/original/refs/remotes/origin/master
git for-each-ref --format='delete %(refname)' refs/original | git update-ref --stdin
git reflog expire --expire=now --all
git gc --aggressive --prune=now
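To check that the rewrite and the GC cycle really drop the data, the whole sequence can be tried on a throwaway repository first. This is a sketch under stated assumptions: the repository, the file big.bin, and the demo identity are all invented, and FILTER_BRANCH_SQUELCH_WARNING is set only because newer Git versions otherwise print a warning and pause before filter-branch starts.

```shell
#!/bin/sh
# Sketch: purge one path from all history in a throwaway repository.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
g() { git -c user.name=demo -c user.email=demo@example.com "$@"; }

# History with a large file that is added and later removed.
echo "hello" > README
head -c 100000 /dev/urandom > big.bin
git add README big.bin
g commit -q -m "add README and big file"
big=$(git rev-parse HEAD:big.bin)       # blob id of the large file
git rm -q big.bin
g commit -q -m "remove big file"

# Rewrite every branch and tag, dropping big.bin from each commit.
export FILTER_BRANCH_SQUELCH_WARNING=1
g filter-branch --force --index-filter \
  "git rm --cached --ignore-unmatch big.bin" \
  --prune-empty --tag-name-filter cat -- --all >/dev/null

# Drop the backup refs, expire reflogs, and collect garbage.
git for-each-ref --format='delete %(refname)' refs/original | git update-ref --stdin
git reflog expire --expire=now --all
git gc -q --aggressive --prune=now

# Check whether the blob still exists in the object database.
if git cat-file -e "$big" 2>/dev/null; then
  result="still present"
else
  result="removed"
fi
echo "big.bin blob: $result"
```

If the blob is still reported as present in a real repository, something is usually still referencing the old history, such as a remaining refs/original entry, a tag, or a stash.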
Answered by Rishabh Kumar
I am a little late to the show, but in case the above answers didn't solve the problem, I found another way: simply remove the specific large file from the .pack. I had this issue when I accidentally checked in a large 2GB file. I followed the steps explained in this link: http://www.ducea.com/2012/02/07/howto-completely-remove-a-file-from-git-history/
Answered by shreya10
This is more of a handy solution than a coding one. Zip the file. Open the zip in file view format (different from unzipping). Delete the .pack file. Unzip and replace the folder. Works like a charm!