Notice: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/47401843/

Date: 2020-09-19 12:59:31  Source: igfitidea

git copy file, as opposed to `git mv`

Tags: git, cp, git-mv

Asked by Alexander Mills

I realize that git works by diff'ing the contents of files. I have some files that I want to copy. To absolutely prevent git from ever getting confused, is there some git command that can be used to copy the files to a different directory (not mv, but cp), and stage the files as well?

Answered by torek

The short answer is just "no". But there is more to know; it just requires some background. (And as JDB suggests in a comment, I'll mention why `git mv` exists as a convenience.)

Slightly longer: you're right that Git will diff files, but you may be wrong about when Git does these file-diffs.

Git's internal storage model proposes that each commit is an independent snapshot of all the files in that commit. The version of each file that goes into the new commit, i.e., the data in the snapshot for that path, is whatever is in the index under that path at the time you run `git commit`.[1]

The actual implementation, to the first level, is that each snapshotted file is captured in compressed form as a blob object in the Git database. The blob object is quite independent of every previous and subsequent version of that file, except for one special case: if you make a new commit in which no data have changed, you will re-use the old blob. So when you make two commits in a row, each of which holds 100 files, and only one file is changed, the second commit re-uses 99 previous blobs, and need only snapshot one actual file into a new blob.[2]
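
As a rough illustration of the blob re-use described above, the following sketch builds a throwaway repository and compares blob IDs. The temporary directory, the identity settings, and the file names `a.txt` and `b.txt` are assumptions made for the example, not part of the original answer.

```shell
set -e
# Throwaway repository; the path and identity are illustrative only.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "Example User"

printf 'same contents\n' > a.txt
cp a.txt b.txt                       # an ordinary copy, no Git command involved
git add a.txt b.txt
git commit -q -m "first"
blob_a=$(git rev-parse HEAD:a.txt)   # blob ID stored for a.txt
blob_b=$(git rev-parse HEAD:b.txt)   # blob ID stored for b.txt

printf 'changed\n' > b.txt
git commit -q -am "second"
blob_a_again=$(git rev-parse HEAD:a.txt)
blob_b_again=$(git rev-parse HEAD:b.txt)

echo "a.txt: $blob_a then $blob_a_again"
echo "b.txt: $blob_b then $blob_b_again"
```

If this behaves as expected, the two identical files share one blob ID in the first commit, and the untouched `a.txt` keeps the same blob ID in the second commit while `b.txt` gets a new one.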

Hence the fact that Git will diff files doesn't enter into making commits at all. No commit depends on a previous commit, other than to store the previous commit's hash ID (and perhaps to re-use exactly-matching blobs, but that's a side effect of them exactly matching, rather than a fancy computation at the time you run git commit).

Now, all these independent blob objects do eventually take up an exorbitant amount of space. At this point, Git can "pack" objects into a `.pack` file. It will compare each object to some selected set of other objects (they may be earlier or later in history, and have the same file name or different file names, and in theory Git could even compress a commit object against a blob object or vice versa, though in practice it doesn't) and try to find some way to represent many blobs using less disk space. But the result is still, at least logically, a series of independent objects, retrieved completely intact in their original form using their hash IDs. So even though the amount of disk space used goes down (we hope!) at this point, all of the objects are exactly the same as before.
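
A small sketch of the point that packing changes storage but not the objects themselves. It assumes a throwaway repository with a hypothetical file `f.txt`, and uses `git gc` to trigger packing by hand.

```shell
set -e
# Throwaway repository; the path and identity are illustrative only.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "Example User"

printf 'pack me\n' > f.txt
git add f.txt
git commit -q -m "one commit"

id=$(git rev-parse HEAD:f.txt)
before=$(git cat-file -p "$id")      # contents read from a loose object
git gc -q                            # packs the loose objects
after=$(git cat-file -p "$id")       # same hash ID, now read from a pack

loose_after=$(git count-objects -v | sed -n 's/^count: //p')
packed_after=$(git count-objects -v | sed -n 's/^in-pack: //p')
echo "loose objects: $loose_after, packed objects: $packed_after"
```

The hash ID used to look up the blob is the same before and after `git gc`, and so is the content that comes back.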

So when does Git compare files? The answer is: only when you ask it to. The "ask time" is when you run `git diff`, either directly:

git diff commit1 commit2

or indirectly:

git show commit  # roughly, `git diff commit^@ commit`
git log -p       # runs `git show commit`, more or less, on each commit

There are a bunch of subtleties about this. In particular, `git show` will produce what Git calls combined diffs when run on merge commits, while `git log -p` normally just skips right over the diffs for merge commits. But these, along with some other important cases, are when Git runs `git diff`.

It's when Git runs `git diff` that you can (sometimes) ask it to find, or not to find, copies. The `-C` flag, also spelled `--find-copies=<number>`, asks Git to find copies. The `--find-copies-harder` flag (which the Git documentation calls "computationally expensive") looks harder for copies than the plain `-C` flag. The `-B` (break inappropriate pairings) option affects `-C`. The `-M` aka `--find-renames=<number>` option also affects `-C`. The `git merge` command can be told to adjust its level of rename detection, but (at least currently) cannot be told to find copies, nor to break inappropriate pairings.
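
To make the copy-detection options concrete, here is a minimal sketch that copies a file with plain `cp`, commits it, and then asks `git show` about that commit twice. The repository location and the file names `original.txt` and `copy.txt` are assumptions for the example.

```shell
set -e
# Throwaway repository; the path and identity are illustrative only.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "Example User"

printf 'line 1\nline 2\nline 3\n' > original.txt
git add original.txt
git commit -q -m "add original"

cp original.txt copy.txt             # plain cp, then stage the result
git add copy.txt
git commit -q -m "copy the file"

# Same commit, two different questions asked at diff time.
plain=$(git show --name-status --format= HEAD)
harder=$(git show --name-status --format= -C --find-copies-harder HEAD)
echo "default view:      $plain"
echo "with copy finding: $harder"
```

The commit itself is identical in both cases. The default view should list `copy.txt` as an added file, while the second invocation should report it as a copy of `original.txt`. Because `original.txt` was not modified in that commit, `--find-copies-harder` is used here so that unmodified files are considered as copy sources.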

(One command, git blame, does somewhat different copy-finding and the above does not entirely apply to it.)



[1] If you run `git commit --include <paths>` or `git commit --only <paths>` or `git commit <paths>` or `git commit -a`, think of these as modifying the index before running `git commit`. In the special case of `--only`, Git uses a temporary index, which is a little bit complicated, but it still commits from an index: it just uses the special temporary one instead of the normal one. To make the temporary index, Git copies all the files from the `HEAD` commit, then overlays those with the `--only` files you listed. For the other cases, Git just copies the work-tree files into the regular index, then goes on to make the commit from the index as usual.
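
A sketch of the `--only` case from this footnote, under the assumption of a throwaway repository with two hypothetical files. One file is staged in the regular index, and the other is committed with `--only`.

```shell
set -e
# Throwaway repository; the path and identity are illustrative only.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "Example User"

printf 'a1\n' > a.txt
printf 'b1\n' > b.txt
git add a.txt b.txt
git commit -q -m "base"

printf 'a2\n' > a.txt
printf 'b2\n' > b.txt
git add b.txt                        # b.txt is staged in the regular index

# Commit only a.txt; the staged change to b.txt is left out of this commit.
git commit -q --only -m "only a.txt" -- a.txt

committed=$(git show --name-only --format= HEAD)
still_staged=$(git diff --cached --name-only)
echo "in the new commit: $committed"
echo "still staged:      $still_staged"
```

The new commit should contain only the change to `a.txt`, and the change to `b.txt` should still be waiting in the regular index afterwards.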

[2] In fact, the actual snapshotting, storing the blob into the repository, happens during `git add`. This secretly makes `git commit` much faster, since you don't normally notice the extra time it takes to run `git add` before you fire up `git commit`.
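
One way to observe this, sketched with a hypothetical file `f.txt` in a throwaway repository: compute the blob ID without storing it, then check whether the object exists before and after `git add`.

```shell
set -e
# Throwaway repository; the path is illustrative only.
repo=$(mktemp -d)
cd "$repo"
git init -q

printf 'snapshot me\n' > f.txt
id=$(git hash-object f.txt)          # computes the ID, does not store the blob

# cat-file -e exits non-zero when the object is not in the repository.
if git cat-file -e "$id" 2>/dev/null; then before=present; else before=absent; fi
git add f.txt
if git cat-file -e "$id" 2>/dev/null; then after=present; else after=absent; fi

echo "before git add: $before"
echo "after git add:  $after"
```

No commit is made at any point here, yet the blob should already be in the object database once `git add` has run.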



Why `git mv` exists

What `git mv old new` does is, very roughly:

mv old new
git add new
git add old

The first step is obvious enough: we need to rename the work-tree version of the file. The second step is similar: we need to put the index version of the file into place. The third, though, is weird: why should we "add" a file we just removed? Well, `git add` doesn't always add a file: instead, in this case it detects that the file was in the index and isn't anymore.

We could also spell that third step as:

git rm --cached old

All we're really doing is taking the old name out of the index.
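
The rough three-step equivalent can be tried out as follows. This is a sketch in a throwaway repository, with `old.txt` and `new.txt` as hypothetical names and the `git rm --cached` spelling used for the third step.

```shell
set -e
# Throwaway repository; the path and identity are illustrative only.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "Example User"

printf 'some contents\nthat stay the same\n' > old.txt
git add old.txt
git commit -q -m "add old.txt"

mv old.txt new.txt                   # step 1: rename in the work-tree
git add new.txt                      # step 2: put the new name into the index
git rm -q --cached old.txt           # step 3: take the old name out of the index

status=$(git status --porcelain)
echo "$status"
```

Since the contents are unchanged, `git status` should report the result as a rename from `old.txt` to `new.txt`, even though no `git mv` was run. That report is itself produced by a diff at the time of asking.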

But there's an issue here, which is why I said "very roughly". The index has a copy of each file that will be committed the next time you run `git commit`. That copy might not match the one in the work-tree. In fact, it might not even match the one in `HEAD`, if there is one in `HEAD` at all.

For instance, after:

echo I am a foo > foo
git add foo

the file `foo` exists in the work-tree and in the index. The work-tree contents and the index contents match. But now let's change the work-tree version:

echo I am a bar > foo

Now the index and work-tree differ. Suppose we want to move the underlying file from `foo` to `bar`, but (for some strange reason[3]) we want to keep the index contents unchanged. If we run:

mv foo bar
git add bar

we'll get `I am a bar` inside the new index file. If we then remove the old version of `foo` from the index, we lose the `I am a foo` version entirely.

So, `git mv foo bar` doesn't really move-and-add-twice, or move-add-and-remove. Instead, it renames the work-tree file and renames the in-index copy. If the index copy of the original file differs from the work-tree file, the renamed index copy still differs from the renamed work-tree copy.
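
The `foo`/`bar` scenario can be checked directly. This sketch re-runs the steps from the text in a throwaway repository (the temporary directory is an assumption) and then reads back both copies after `git mv`.

```shell
set -e
# Throwaway repository; the path is illustrative only.
repo=$(mktemp -d)
cd "$repo"
git init -q

echo I am a foo > foo
git add foo                          # index copy says "I am a foo"
echo I am a bar > foo                # work-tree copy now differs from the index

git mv foo bar                       # renames both the file and the index entry

in_index=$(git show :bar)            # contents of the index entry for bar
in_worktree=$(cat bar)               # contents of the work-tree file
echo "index:     $in_index"
echo "work-tree: $in_worktree"
```

The index entry for `bar` should still hold the staged `I am a foo` text, while the work-tree file `bar` holds `I am a bar`.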

It's very difficult to do this without a front end command like `git mv`.[4] Of course, if you plan to `git add` everything, you don't need all of this stuff in the first place. And, it's worth noting that if `git cp` existed, it probably should also copy the index version, not the work-tree version, when making the index copy. So `git cp` really should exist. There also should be a `git mv --after` option, a la Mercurial's `hg mv --after`. Both should exist, but currently don't. (There's less call for either of these, though, than there is for straight `git mv`, in my opinion.)



[3] For this example, it's kind of silly and pointless. But if you use `git add -p` to carefully prepare a patch for an intermediate commit, and then decide that along with the patch, you would like to rename the file, it's definitely handy to be able to do that without messing up your carefully-patched-together intermediate version.

[4] It's not impossible: `git ls-files --stage` will get you the information you need from the index as it is right now, and `git update-index` allows you to make arbitrary changes to the index. You can combine these two, and some complex shell scripting or programming in a nicer language, to build something that implements `git mv --after` and `git cp`.
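
As a very rough sketch of what such a script might look like for the copy case, the hypothetical helper `git_cp_index` below copies the work-tree file with `cp` and copies the index entry separately with `git ls-files --stage` and `git update-index --cacheinfo`. The function name, its argument handling, and the assumption that paths contain no spaces or newlines are all illustrative. It is not an existing Git command and it ignores many edge cases (directories, conflicted entries, existing destinations).

```shell
set -e
# Hypothetical helper: copy SRC to DST in the work-tree, and give DST the
# same index entry (mode and blob) that SRC currently has in the index.
git_cp_index() {
    src=$1
    dst=$2
    # ls-files --stage prints: <mode> <object> <stage><TAB><path>
    entry=$(git ls-files --stage -- "$src")
    mode=$(printf '%s\n' "$entry" | cut -d' ' -f1)
    object=$(printf '%s\n' "$entry" | cut -d' ' -f2)
    cp -- "$src" "$dst"
    git update-index --add --cacheinfo "$mode,$object,$dst"
}

# Demonstration in a throwaway repository (path is illustrative only).
repo=$(mktemp -d)
cd "$repo"
git init -q
echo I am a foo > foo
git add foo
echo I am a bar > foo                # work-tree and index now differ

git_cp_index foo bar

index_copy=$(git show :bar)
worktree_copy=$(cat bar)
echo "index copy of bar:     $index_copy"
echo "work-tree copy of bar: $worktree_copy"
```

With this approach the staged version of `foo` and the work-tree version of `foo` are copied independently, so the new `bar` should differ between index and work-tree in the same way `foo` does.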
