Partial clone with Git and Mercurial

Note: this page is a Chinese-English parallel translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Stack Overflow original: http://stackoverflow.com/questions/2586824/

Date: 2020-09-10 08:12:03  Source: igfitidea

git mercurial

Asked by pablo

Is it possible to clone only one branch (or from a given commit) in Git and Mercurial? I mean, I want to clone a central repo but since it's huge I'd like to only get part of it and still be able to contribute back my changes. Is it possible? Like, I only want from Tag 130 onwards or something like that?

If so, how?

Answered by Jakub Narębski

In Git land you are talking about three different types of partial clones:

  • shallow clones: I want history from revision point X onward.

    Use git clone --depth <n> <url> for that, but please remember that shallow clones are somewhat limited in interacting with other repositories. You would be able to generate patches and send them via email.

  • partial clone by filepath: I want all revision history in some directory /path.

    Not possible in Git. With modern Git, though, you can have a sparse checkout, i.e. you have the whole history but you check out (have in the working area) only a subset of all files.

  • cloning only selected branch: I want to clone only one branch (or a selected subset of branches).

    Possible, and

    before git 1.7.10 not simple: you would need to do manually what clone does, i.e. git init [<directory>], then git remote add origin <url>, edit .git/config replacing * in remote.origin.fetch by the requested branch (probably 'master'), then git fetch.

    as of git 1.7.10, git clone offers the --single-branch option, which seems to have been added just for this purpose and seems pretty easy.

    Note however that because branches usually share most of their history, the gain from cloning only a subset of branches might be smaller than you think.
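As a rough sketch of the shallow and single-branch cases in the list above, the script below builds a throwaway repository and clones it three ways. The temporary directory, the branch names main and topic, and the commit messages are all invented for illustration; only the git options themselves come from the answer.

```shell
#!/bin/sh
set -eu

# Throwaway "central" repository (all names here are hypothetical).
work=$(mktemp -d)
git init -q "$work/central"
cd "$work/central"
git config user.name "Example"
git config user.email "example@example.invalid"
git symbolic-ref HEAD refs/heads/main      # call the first branch "main"
for msg in one two three; do
    git commit -q --allow-empty -m "$msg"
done
git checkout -q -b topic
git commit -q --allow-empty -m "topic work"
git checkout -q main
central="file://$work/central"             # file:// URL so that --depth is honoured

# 1. Shallow clone: only the two most recent commits of the default branch.
git clone -q --depth 2 "$central" "$work/shallow"
git -C "$work/shallow" rev-list --count HEAD

# 2. Single-branch clone (git 1.7.10 and later).
git clone -q --single-branch --branch topic "$central" "$work/single"
git -C "$work/single" config remote.origin.fetch

# 3. The manual route described for older versions of git:
#    init, add the remote, narrow the fetch refspec, then fetch.
git init -q "$work/manual"
cd "$work/manual"
git remote add origin "$central"
git config remote.origin.fetch "+refs/heads/topic:refs/remotes/origin/topic"
git fetch -q origin
git checkout -q -b topic origin/topic
git rev-list --count HEAD
```

Setting remote.origin.fetch with git config has the same effect as editing .git/config by hand; either way the wildcard refspec is replaced by one that names a single branch.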

You can also do a shallow clone of only a selected subset of branches.
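One way this might look, offered as a sketch and not as the answerer's own recipe: add the remote with git remote add -t so that only the chosen branches are tracked, then fetch with --depth. The repository layout and the branch names main, topic and other are made up for the example.

```shell
#!/bin/sh
set -eu

# Throwaway "central" repository with three branches (hypothetical names).
work=$(mktemp -d)
git init -q "$work/central"
cd "$work/central"
git config user.name "Example"
git config user.email "example@example.invalid"
git symbolic-ref HEAD refs/heads/main
for msg in one two three; do
    git commit -q --allow-empty -m "$msg"
done
for branch in topic other; do
    git checkout -q -b "$branch" main
    git commit -q --allow-empty -m "$branch work"
done
git checkout -q main

# Track only main and topic, then fetch just the tip commit of each.
git init -q "$work/subset"
cd "$work/subset"
git remote add -t main -t topic origin "file://$work/central"
git fetch -q --depth 1 origin
git branch -r
```

The branch named other is never downloaded, and each fetched branch carries a single commit of history.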

If you know how people will want to break things down by filepath (multiple projects in the same repository) you can use submodules (sort of like svn:externals) to pre-split the repo into separately cloneable portions.

Answered by Ry4an Brase

In mercurial land you're talking about three different types of partial clones:

  • shallow clones: I want the history from revision point X onward: use the remotefilelog extension.
  • partial clones by filepath: I want all revision history in directory /path: use the experimental narrowhg extension. Or, I want only files in directory /path to be in my working directory: use the experimental sparse extension (shipped since version 4.3, see hg help sparse).
  • partial clones by branch: I want all revision history on branch Y: use clone -r.

If you know how people will want to break things down by filepath (multiple projects in the same repo (shame on you)), you can use subrepositories (sort of like svn externals) to pre-split the repo into separately cloneable portions.

Also, as to the "so huge I'd like to only get a part of it": you really only have to do that one time ever. Just clone it while you have lunch, and then you have it forevermore. Subsequently you can pull and get deltas efficiently going forward. And if you want another clone of it, just clone your first clone. Where you got a clone doesn't matter (and local clones take up no additional disk space since they're hard links under the covers).

Answered by nobar

The selected answer provides a good overview, but lacks a complete example.

Minimize your download and checkout footprint (a), (b):

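# Shallow, single-branch clone; --no-checkout leaves the working tree empty for now
git clone --no-checkout --depth 1 --single-branch --branch (name) (repo) (folder)
cd (folder)
# Enable sparse checkout and list the paths to materialize
git config core.sparseCheckout true
echo "target/path/1" >>.git/info/sparse-checkout
echo "target/path/2" >>.git/info/sparse-checkout
# Populate the working tree with only the listed paths
git checkout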

Periodically optimize your local repository footprint (c) (optional, use with care):

git clean --dry-run # consider and tweak results then switch to --force
git gc
git repack -Ad
git prune

See also: How to handle big repositories with git

Answered by rossmic

This method creates an unversioned archive without subrepositories:

hg clone -U ssh://machine//directory/path/to/repo/project projecttemp

cd projecttemp

hg archive -r tip ../project-no-subrepos

The unversioned source code without the subrepositories is in the project-no-subrepos directory.

Answered by user7610

Regarding Git, it might be of historical significance that Linus Torvalds answered this question from a conceptual perspective back in 2007, in a talk that was recorded and is available online.

The question is whether it is possible to check out only some files out of a Git repository.

Tech Talk: Linus Torvalds on git t=43:10

To summarize, he said that one of the design decisions that sets Git apart from other source management systems (he cites BitKeeper and SVN) is that Git manages content, not files. One implication is that, for example, a diff of a subset of files between two revisions is computed by first taking the whole diff and then pruning it down to the files that were requested. Another is that you have to check out the whole history, in an all-or-nothing fashion. For this reason, he suggests splitting loosely related components among multiple repositories, and mentions a then-ongoing effort to implement a user interface for managing a repository that is structured as a super-project holding smaller repositories.

As far as I know, this fundamental design decision still applies today. The super-project thing probably became what are now submodules.

Answered by Dan Christian

In mercurial, you should be able to do some of this using:

hg convert --branchmap FILE SOURCE DEST REVMAP

You may also want:

--config convert.hg.startrev=REV

The source can be git, mercurial, or a variety of other systems.

I haven't tried it, but convert is quite rich.
