Note: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/801577/

Date: 2020-09-10 06:23:06  Source: igfitidea

How to recover Git objects damaged by hard disk failure?

Tags: git, corruption, data-recovery

Asked by Christian

I have had a hard disk failure which resulted in some files of a Git repository getting damaged. When running git fsck --full I get the following output:

error: .git/objects/pack/pack-6863e0a0e4b4ded6090fac5d12eba6ca7346b19c.pack SHA1 checksum mismatch
error: index CRC mismatch for object 6c8cae4994b5ec7891ccb1527d30634997a978ee from .git/objects/pack/pack-6863e0a0e4b4ded6090fac5d12eba6ca7346b19c.pack at offset 97824129
error: inflate: data stream error (invalid code lengths set)
error: cannot unpack 6c8cae4994b5ec7891ccb1527d30634997a978ee from .git/objects/pack/pack-6863e0a0e4b4ded6090fac5d12eba6ca7346b19c.pack at offset 97824129
error: inflate: data stream error (invalid stored block lengths)
error: failed to read object 0dcf6723cc69cc7f91d4a7432d0f1a1f05e77eaa at offset 276988017 from .git/objects/pack/pack-6863e0a0e4b4ded6090fac5d12eba6ca7346b19c.pack
fatal: object 0dcf6723cc69cc7f91d4a7432d0f1a1f05e77eaa is corrupted

I have backups of the repository, but the only backup that includes the pack file has it already damaged. So I think that I have to find out a way to retrieve the single objects from different backups and somehow instruct Git to produce a new pack with only correct objects.

Can you please give me hints how to fix my repository?

Accepted answer by Daniel Fanjul

In some previous backups, your bad objects may have been packed in different files, or may still be loose objects. So your objects may be recoverable.

It seems there are a few bad objects in your database, so you could recover them manually.

Because git hash-object, git mktree and git commit-tree do not write objects that are already found in a pack, start by doing this:

mv .git/objects/pack/* <somewhere>
for i in <somewhere>/*.pack; do
  git unpack-objects -r < "$i"
done
rm <somewhere>/*

(Your packs are moved out from the repository, and unpacked again in it; only the good objects are now in the database)

You can do:

git cat-file -t 6c8cae4994b5ec7891ccb1527d30634997a978ee

and check the type of the object.
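
If git fsck reports several damaged objects, it may help to collect their IDs first. The following is only a sketch: list_bad_object_ids is a hypothetical helper name, and it assumes the fsck output was saved to a file and that the IDs are 40-character SHA-1 hashes following the words "object" or "unpack", as in the output quoted in the question.

```shell
# Hypothetical helper: read saved `git fsck --full` output on stdin and print
# each object ID that follows the word "object" or "unpack", once per ID.
# Pack checksums (pack-<hash>.pack) are not matched, because no space
# precedes them.
list_bad_object_ids() {
    grep -oE '(object|unpack) [0-9a-f]{40}' | awk '{print $2}' | sort -u
}
```

Each printed ID can then be passed to git cat-file -t, for example: list_bad_object_ids < fsck.log | while read -r id; do git cat-file -t "$id"; done (fsck.log is an assumed file name).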

If the type is blob: retrieve the contents of the file from previous backups (with git show, git cat-file or git unpack-file); then you may use git hash-object -w to rewrite the object in your current repository.

If the type is tree: you could use git ls-tree to recover the tree from previous backups; then git mktree to write it again in your current repository.

If the type is commit: do the same with git show, git cat-file and git commit-tree.

Of course, I would back up your original working copy before starting this process.

Also, take a look at How to Recover Corrupted Blob Object.
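
As an illustration of the blob case described in this answer, here is a minimal sketch. It assumes an older backup of the repository whose copy of the object is still readable; recover_blob and its two arguments (the backup's git directory and the object ID) are hypothetical names, and the function must be run from inside the damaged repository.

```shell
# Hypothetical helper: copy one blob from a backup's git directory into the
# current repository's object database.
# $1 = path to the backup's .git directory (assumption), $2 = blob ID.
recover_blob() {
    backup_git_dir=$1
    sha=$2
    # Stop early if the backup cannot provide the object either.
    git --git-dir="$backup_git_dir" cat-file -e "$sha" || return 1
    # Read the blob contents from the backup and write them here.
    new=$(git --git-dir="$backup_git_dir" cat-file blob "$sha" \
          | git hash-object -w --stdin) || return 1
    # The rewritten object must get the same ID as the missing one.
    [ "$new" = "$sha" ]
}
```

A call such as recover_blob /path/to/backup/.git 6c8cae4994b5ec7891ccb1527d30634997a978ee (the backup path is an assumption) returns success only if the blob was rewritten under the same ID.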

Answer by Christian

Banengusk put me on the right track. For further reference, I want to post the steps I took to fix my repository corruption. I was lucky enough to find all the needed objects either in older packs or in repository backups.

# Unpack last non-corrupted pack
$ mv .git/objects/pack .git/objects/pack.old
$ git unpack-objects -r < .git/objects/pack.old/pack-012066c998b2d171913aeb5bf0719fd4655fa7d0.pack
$ git log
fatal: bad object HEAD

$ cat .git/HEAD 
ref: refs/heads/master

$ ls .git/refs/heads/

$ cat .git/packed-refs 
# pack-refs with: peeled 
aa268a069add6d71e162c4e2455c1b690079c8c1 refs/heads/master

$ git fsck --full 
error: HEAD: invalid sha1 pointer aa268a069add6d71e162c4e2455c1b690079c8c1
error: refs/heads/master does not point to a valid object!
missing blob 75405ef0e6f66e48c1ff836786ff110efa33a919
missing blob 27c4611ffbc3c32712a395910a96052a3de67c9b
dangling tree 30473f109d87f4bcde612a2b9a204c3e322cb0dc

# Copy HEAD object from backup of repository
$ cp repobackup/.git/objects/aa/268a069add6d71e162c4e2455c1b690079c8c1 .git/objects/aa
# Now copy all missing objects from backup of repository and run "git fsck --full" afterwards
# Repeat until git fsck --full only reports dangling objects

# Now garbage collect repo
$ git gc
warning: reflog of 'HEAD' references pruned commits
warning: reflog of 'refs/heads/master' references pruned commits
Counting objects: 3992, done.
Delta compression using 2 threads.
fatal: object bf1c4953c0ea4a045bf0975a916b53d247e7ca94 inconsistent object length (6093 vs 415232)
error: failed to run repack

# Check reflogs...
$ git reflog

# ...then clean
$ git reflog expire --expire=0 --all

# Now garbage collect again
$ git gc       
Counting objects: 3992, done.
Delta compression using 2 threads.
Compressing objects: 100% (3970/3970), done.
Writing objects: 100% (3992/3992), done.
Total 3992 (delta 2060), reused 0 (delta 0)
Removing duplicate objects: 100% (256/256), done.
# Done!
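
The "copy all missing objects from backup" step in the transcript above could be sketched as a small helper. This is only an illustration: copy_loose_object is a hypothetical name, and it assumes the object is stored loose (not inside a pack) in the backup and that the current directory is the top of the damaged repository.

```shell
# Hypothetical helper: copy one loose object file from a backup's .git
# directory into the current repository.
# $1 = backup's .git directory (assumption), $2 = object ID.
copy_loose_object() {
    backup_git_dir=$1
    sha=$2
    dir=$(printf '%s' "$sha" | cut -c1-2)     # first two hex digits: directory
    file=$(printf '%s' "$sha" | cut -c3-)     # remaining digits: file name
    src="$backup_git_dir/objects/$dir/$file"
    [ -f "$src" ] || { echo "not a loose object in backup: $sha" >&2; return 1; }
    mkdir -p ".git/objects/$dir" && cp "$src" ".git/objects/$dir/$file"
}
```

For example, copy_loose_object repobackup/.git aa268a069add6d71e162c4e2455c1b690079c8c1 mirrors the cp command shown in the transcript; running git fsck --full afterwards shows which objects are still missing.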

Answer by kenorb

Try the following commands first (re-run them if needed):

$ git fsck --full
$ git gc
$ git gc --prune=today
$ git fetch --all
$ git pull --rebase

If you still have problems after that, you can try the following:

  • remove all the corrupt objects, e.g.

    fatal: loose object 91c5...51e5 (stored in .git/objects/06/91c5...51e5) is corrupt
    $ rm -v .git/objects/06/91c5...51e5
    
  • remove all the empty objects, e.g.

    error: object file .git/objects/06/91c5...51e5 is empty
    $ find .git/objects/ -size 0 -exec rm -vf "{}" \;
    
  • check a "broken link" message by:

    git ls-tree 2d9263c6d23595e7cb2a21e5ebbb53655278dff8
    

    This will tell you what file the corrupt blob came from!

  • to recover file, you might be really lucky, and it may be the version that you already have checked out in your working tree:

    git hash-object -w my-magic-file
    

    again, and if it outputs the missing SHA1 (4b945..) you're now all done!

  • assuming that it was some older version that was broken, the easiest way to do it is to do:

    git log --raw --all --full-history -- subdirectory/my-magic-file
    

    and that will show you the whole log for that file (please realize that the tree you had may not be the top-level tree, so you need to figure out which subdirectory it was in on your own), then you can now recreate the missing object with hash-object again.

  • to get a list of all refs with missing commits, trees or blobs:

    $ git for-each-ref --format='%(refname)' | while read ref; do git rev-list --objects $ref >/dev/null || echo "in $ref"; done
    

    It may not be possible to remove some of those refs using the regular branch -d or tag -d commands, since they will die if git notices the corruption. So use the plumbing command git update-ref -d $ref instead. Note that in case of local branches, this command may leave stale branch configuration behind in .git/config. It can be deleted manually (look for the [branch "$ref"] section).

  • After all refs are clean, there may still be broken commits in the reflog. You can clear all reflogs using git reflog expire --expire=now --all. If you do not want to lose all of your reflogs, you can search the individual refs for broken reflogs:

    $ (echo HEAD; git for-each-ref --format='%(refname)') | while read ref; do git rev-list -g --objects $ref >/dev/null || echo "in $ref"; done
    

    (Note the added -g option to git rev-list.) Then, use git reflog expire --expire=now $ref on each of those. When all broken refs and reflogs are gone, run git fsck --full in order to check that the repository is clean. Dangling objects are Ok.
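
For the empty-object step in the list above, it may be safer to list the affected files before deleting anything. A minimal dry-run sketch, where list_empty_objects is a hypothetical helper name and the objects directory defaults to .git/objects:

```shell
# Hypothetical dry-run helper: print zero-length files under an objects
# directory (default: .git/objects) without deleting anything.
list_empty_objects() {
    find "${1:-.git/objects}" -type f -size 0 -print
}
```

Once the printed list looks right, the find ... -exec rm command from the list can be run to remove those files.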



Below you can find advanced usage of commands that can cause data loss in your git repository if not used wisely, so make a backup before you accidentally do further damage. Try them at your own risk, and only if you know what you're doing.



To pull the current branch on top of the upstream branch after fetching:

$ git pull --rebase

You may also try to check out a new branch and delete the old one:

$ git checkout -b new_master origin/master


To find the corrupted object in git for removal, try the following command:

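while [ true ]; do f=`git fsck --full 2>&1|awk '{print $3}'|sed -r 's/(^..)(.*)/objects\/\1\/\2/'`; if [ ! -f "$f" ]; then break; fi; echo delete $f; rm -f "$f"; done

(The awk field number and the sed back-references did not survive on this page; $3, \1 and \2 are a reconstruction and should be treated as an assumption. Adjust the field to whichever column of your git fsck output holds the object hash. The objects/... path is relative, so the loop presumably has to be run from inside the .git directory.)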

For OSX, use sed -E instead of sed -r.



Another idea is to unpack all objects from the pack files to regenerate all objects inside .git/objects, so try running the following commands within your repository:

# Move (not copy) the packs: git unpack-objects skips objects the repository already has
$ mv .git/objects/pack .git/objects/pack.bak
$ for i in .git/objects/pack.bak/*.pack; do git unpack-objects -r < "$i"; done
$ rm -frv .git/objects/pack.bak


If the above doesn't help, you may try to rsync or copy the git objects from another repo, e.g.

$ rsync -varu git_server:/path/to/git/.git local_git_repo/
$ rsync -varu /local/path/to/other-working/git/.git local_git_repo/
$ cp -frv ../other_repo/.git/objects/. .git/objects/


To fix a broken branch when checkout fails like this:

$ git checkout -f master
fatal: unable to read tree 5ace24d474a9535ddd5e6a6c6a1ef480aecf2625

Try to remove the branch and check it out from upstream again:

$ git branch -D master
$ git checkout -b master github/master

If git puts you into a detached HEAD state, check out master and merge the detached branch into it.



Another idea is to rebase the existing master recursively:

$ git reset HEAD --hard
$ git rebase -s recursive -X theirs origin/master


Answer by Jonathan Maim

Here are the steps I followed to recover from a corrupt blob object.

1) Identify corrupt blob

git fsck --full
  error: inflate: data stream error (incorrect data check)
  error: sha1 mismatch 241091723c324aed77b2d35f97a05e856b319efd
  error: 241091723c324aed77b2d35f97a05e856b319efd: object corrupt or missing
  ...

Corrupt blob is 241091723c324aed77b2d35f97a05e856b319efd

2) Move corrupt blob to a safe place (just in case)

mv .git/objects/24/1091723c324aed77b2d35f97a05e856b319efd ../24/

3) Get parent of corrupt blob

git fsck --full
  Checking object directories: 100% (256/256), done.
  Checking objects: 100% (70321/70321), done.
  broken link from    tree 0716831e1a6c8d3e6b2b541d21c4748cc0ce7180
              to    blob 241091723c324aed77b2d35f97a05e856b319efd

Parent hash is 0716831e1a6c8d3e6b2b541d21c4748cc0ce7180.

4) Get file name corresponding to corrupt blob

git ls-tree 0716831e1a6c8d3e6b2b541d21c4748cc0ce7180
  ...
  100644 blob 241091723c324aed77b2d35f97a05e856b319efd    dump.tar.gz
  ...

Find this particular file in a backup or in the upstream git repository (in my case it is dump.tar.gz). Then copy it somewhere inside your local repository.

5) Add previously corrupted file in the git object database

git hash-object -w dump.tar.gz

6) Celebrate!

git gc
  Counting objects: 75197, done.
  Compressing objects: 100% (21805/21805), done.
  Writing objects: 100% (75197/75197), done.
  Total 75197 (delta 52999), reused 69857 (delta 49296)
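
Before writing a candidate file into the object database in step 5, it can be checked whether the file would actually get the missing blob ID. The following is a minimal sketch, assuming a repository that uses the default SHA-1 object format and a system with sha1sum installed; blob_id is a hypothetical helper name. A blob's ID is the SHA-1 of the header "blob <size>", a NUL byte, and then the file contents:

```shell
# Hypothetical helper: compute the ID Git would assign to a file as a blob,
# without touching any repository.
# Assumes the SHA-1 object format and the sha1sum tool (GNU coreutils).
blob_id() {
    size=$(($(wc -c < "$1")))                  # byte count, whitespace stripped
    # Header "blob <size>\0" followed by the raw file contents.
    { printf 'blob %s\0' "$size"; cat "$1"; } | sha1sum | cut -d' ' -f1
}
```

If blob_id dump.tar.gz prints 241091723c324aed77b2d35f97a05e856b319efd, the file is the right version. Where git itself is usable, git hash-object dump.tar.gz (without -w) reports the same ID.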

Answer by go2null

Here are two functions that may help if your backup is corrupted, or you have a few partially corrupted backups as well (this may happen if you backup the corrupted objects).

Run both in the repo you're trying to recover.

Standard warning: only use if you're really desperate and you have backed up your (corrupted) repo. This might not resolve anything, but at least should highlight the level of corruption.

fsck_rm_corrupted() {
    corrupted='a'
    while [ "$corrupted" ]; do
        corrupted=$(                                  \
        git fsck --full --no-dangling 2>&1 >/dev/null \
            | grep 'stored in'                          \
            | sed -r 's:.*(\.git/.*)\).*:\1:'           \
        )
        echo "$corrupted"
        rm -f "$corrupted"
    done
}

# Assumption: $1 is the directory of the git repo (the positional parameter
# and the sed back-reference \1 are reconstructed)
if [ -z "$1" ]  || [ ! -d "$1" ]; then
    echo "'$1' is not a directory. Please provide the directory of the git repo"
    exit 1
fi

pushd "$1" >/dev/null
fsck_rm_corrupted
popd >/dev/null

and

unpack_rm_corrupted() {
    corrupted='a'
    while [ "$corrupted" ]; do
        corrupted=$(                                  \
        git unpack-objects -r < "$1" 2>&1 >/dev/null \
            | grep 'stored in'                          \
            | sed -r 's:.*(\.git/.*)\).*:\1:'           \
        )
        echo "$corrupted"
        rm -f "$corrupted"
    done
}

# Assumption: $1 is the repository's git directory, the one containing
# objects/ (the positional parameter and \1 in sed are reconstructed)
if [ -z "$1" ]  || [ ! -d "$1" ]; then
    echo "'$1' is not a directory. Please provide the directory of the git repo"
    exit 1
fi

for p in "$1"/objects/pack/pack-*.pack; do
    echo "$p"
    unpack_rm_corrupted "$p"
done
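
Both functions rely on the same extraction step: pulling the object file path out of a "stored in" error message. A minimal sketch of that step in isolation, using sed -E so that it also runs on OSX; corrupt_object_paths is a hypothetical helper name:

```shell
# Hypothetical helper: read git error messages on stdin and print the object
# file path that follows "stored in", one per matching line.
corrupt_object_paths() {
    grep 'stored in' | sed -E 's:.*(\.git/.*)\).*:\1:'
}
```

Fed the message "fatal: loose object 91c5...51e5 (stored in .git/objects/06/91c5...51e5) is corrupt", it prints .git/objects/06/91c5...51e5, which is the file the functions then delete.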

Answer by Tim Lin

Git checkout can actually pick out individual files from a revision. Just give it the commit hash and the file name. More detailed info here.

I guess the easiest way to fix this safely is to revert to the newest uncommitted backup and then selectively pick out uncorrupted files from newer commits. Good luck!

Answer by Dmitriy S

I resolved this problem by making a change and then running git add -A and git commit again.