Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license and attribute it to the original authors (not me). Original StackOverflow post: http://stackoverflow.com/questions/12161541/

Date: 2020-09-10 14:28:00  Source: igfitidea

Work-around for failing "git svn clone" (requiring full history)

Tags: git, svn, git-svn

Asked by FooF

I want to convert a Subversion repository sub-directory (denoted by module here) into a git repository with full history. There are many svn copy operations (Subversion people call them branches) in the history of my Subversion repository. The release policy has been that after each release or other branch is created, the old URL is left unused and the new URL replaces it as the place where work continues.

Optimally, by my reading, it seems like this should do the trick:

$ git svn clone --username=mysvnusername --authors-file=authors.txt \
    --follow-parent \
    http://svnserver/svn/src/branches/x/y/apps/module module

(where branches/x/y/ denotes the newest branch). But I got an error that looks something like this:

W: Ignoring error from SVN, path probably does not exist: (160013): Filesystem has no item: '/svn/src/!svn/bc/100/branches/x/y/apps/module' path not found
W: Do not be alarmed at the above message git-svn is just searching aggressively for old history.

(Update: adding the option --no-minimize-url to the above does not remove the error message.)

The directory module gets created and populated, but the Subversion history from before the newest svn copy commit is not imported (the resulting git repository ends up with just two commits where I expected hundreds).

The question is, how to export the full Subversion history in the presence of this situation?

Possible Cause

  1. Searching for the error message, I found this: "git-svn anonymous checkout fails with -s", which links to this Subversion issue: http://subversion.tigris.org/issues/show_bug.cgi?id=3242

    As I understand it from my reading, Subversion 1.5 changed how the client accesses the repository. With newer Subversion, if there is no read access to some parent directory of the URL path (true for me: svn ls http://svnserver/svn fails with 403 Forbidden), then some Subversion operations fail.

  2. Jeff Fairley in his answer points out that spaces in the Subversion URL might also cause this error message (confirmed by user Owen). Have a look at his solution to see how he solved the case if your git svn clone is failing for the same reason.

  3. Dejay Clayton in his answer reveals that if the deepest subdirectory components in branch and tag svn URLs are identically named (e.g. .../tags/release/1.0.0 and .../branches/release-candidates/1.0.0), then this error can occur.
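
To check the first cause for a given URL, one option is to list every ancestor of the URL and probe each one with svn ls. The sketch below only builds that list. The function name ancestor_urls is made up for this illustration, and the URL is the placeholder used in the question.

```shell
#!/bin/sh
# Print a URL followed by each of its ancestors, deepest first,
# stopping at the bare host.
ancestor_urls() {
    url=${1%/}                # drop one trailing slash, if any
    scheme=${url%%://*}       # e.g. "http"
    rest=${url#*://}          # e.g. "svnserver/svn/src/..."
    while :; do
        printf '%s://%s\n' "$scheme" "$rest"
        case $rest in
            */*) rest=${rest%/*} ;;   # strip the last path component
            *)   break ;;             # only the host is left
        esac
    done
}

ancestor_urls "http://svnserver/svn/src/branches/x/y/apps/module"
```

Each printed URL can then be passed to svn ls by hand. A 403 Forbidden on one of the ancestors would be consistent with the read-access explanation, though it does not prove it.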

Accepted answer by Dejay Clayton

I ran into this problem when I had identically-named subdirectories within branches or tags.

For example, I had tags candidates/1.0.0 and releases/1.0.0, and this caused the documented error because subdirectory 1.0.0 appears within both candidates and releases.

Per git-svn docs:

When using multiple --branches or --tags, git svn does not automatically handle name collisions (for example, if two branches from different paths have the same name, or if a branch and a tag have the same name). In these cases, use init to set up your Git repository then, before your first fetch, edit the $GIT_DIR/config file so that the branches and tags are associated with different name spaces.

So while the following command failed due to similarly named candidates and releases tags:

git svn clone --authors-file=../authors.txt --no-metadata \
    --trunk=/trunk --branches=/branches --tags=/candidates \
    --tags=/releases --tags=/tags -r 100:HEAD \
    --prefix=origin/ \
    svn://example.com:3692/my-repos/path/to/project/

the following sequence of commands did work:

git svn init --no-metadata \
    --trunk=/trunk --branches=/branches --tags=/tags \
    --prefix=origin/ \
    'svn://example.com:3692/my-repos/path/to/project/'

git config --add svn-remote.svn.tags \
    'path/to/project/candidates/*:refs/remotes/origin/tags/Candidates/*'

git config --add svn-remote.svn.tags \
    'path/to/project/releases/*:refs/remotes/origin/tags/Releases/*'

git svn fetch --authors-file=../authors.txt -r100:HEAD

Note that this only worked because there were no other conflicts within branches and tags. If there were, I would have had to resolve them similarly.
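
Before the first fetch it may help to look for such collisions up front. The following sketch, which is not part of the original answer, reads directory paths on standard input (for example collected by hand from svn ls output) and reports leaf names that occur under more than one parent. The function name find_colliding_names and the sample paths are assumptions for illustration.

```shell
#!/bin/sh
# Report leaf directory names that appear under more than one parent.
# Input: one "<parent>/<name>" path per line; a trailing "/" is tolerated.
find_colliding_names() {
    awk -F/ '
        {
            name = $NF
            if (name == "" && NF > 1) name = $(NF - 1)   # path ended in "/"
            paths[name] = paths[name] " " $0
            count[name]++
        }
        END {
            for (name in count)
                if (count[name] > 1) print name ":" paths[name]
        }'
}

# Sample layout from the answer: the same version under two tag directories.
printf '%s\n' candidates/1.0.0 releases/1.0.0 tags/0.9 | find_colliding_names
```

For the sample input this prints 1.0.0: candidates/1.0.0 releases/1.0.0, which marks the two directories that would need separate name spaces in the config.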

After successfully cloning the SVN repository, I then executed the following steps in order to: turn SVN tags into Git tags; turn trunk into master; turn other references into branches; and relocate remote paths:

# Make tags into true tags
cp -Rf .git/refs/remotes/origin/tags/* .git/refs/tags/
rm -Rf .git/refs/remotes/origin/tags

# Make other references into branches
cp -Rf .git/refs/remotes/origin/* .git/refs/heads/
rm -Rf .git/refs/remotes/origin
cp -Rf .git/refs/remotes/* .git/refs/heads/ # May be missing; that's okay
rm -Rf .git/refs/remotes

# Change 'trunk' to 'master'
git checkout trunk
git branch -d master
git branch -m trunk master
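
Copying files under .git/refs only sees refs that are stored as loose files. As an alternative sketch (not the answer's method), the same renaming can be done with git for-each-ref and git update-ref, which also work when refs have been packed. It assumes the refs/remotes/origin/ layout produced by --prefix=origin/ above; the function name promote_svn_refs is hypothetical.

```shell
#!/bin/sh
# Turn git-svn remote refs into real tags and branches using git plumbing.
# Meant to be run inside the cloned repository.
promote_svn_refs() {
    # refs/remotes/origin/tags/X  ->  refs/tags/X
    git for-each-ref --format='%(objectname) %(refname)' refs/remotes/origin/tags |
    while read -r oid ref; do
        git update-ref "refs/tags/${ref#refs/remotes/origin/tags/}" "$oid" &&
        git update-ref -d "$ref"
    done

    # Whatever is left under refs/remotes/origin/  ->  refs/heads/
    git for-each-ref --format='%(objectname) %(refname)' refs/remotes/origin |
    while read -r oid ref; do
        git update-ref "refs/heads/${ref#refs/remotes/origin/}" "$oid" &&
        git update-ref -d "$ref"
    done
}
```

Calling promote_svn_refs from the top of the clone would replace the cp and rm steps; the trunk to master rename is still done afterwards as shown above. The resulting tags are lightweight tags, as with the copy approach.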

Answer by mliebelt

Not a full answer, but perhaps the snippet you are missing (I am interested in migrating as well, so I have found that part of the puzzle).

When you look at the documentation of git-svn, you will find the following option:

--no-minimize-url 

When tracking multiple directories (using --stdlayout, --branches, or --tags options), git svn will attempt to connect to the root (or highest allowed level) of the Subversion repository. This default allows better tracking of history if entire projects are moved within a repository, but may cause issues on repositories where read access restrictions are in place. Passing --no-minimize-url will allow git svn to accept URLs as-is without attempting to connect to a higher level directory. This option is off by default when only one URL/branch is tracked (it would do little good).

This fits your situation: with it, git svn does not try to read a higher level of the directory tree (which would be blocked).

At least you could give it a try ...

Answer by Jeff Fairley

I recently migrated a long list of SVN repositories into Git and towards the end ran into this problem. Our SVN structure was pretty sloppy, so I had to use --no-minimize-url quite a bit. Typically, I'd run a command like:

$ git svn clone http://[url]/svn/[repo]/[path-to-code] \
            -s --no-minimize-url \
            -A authors.txt

The last few migrations I ran had a space in the URL. I don't know if it was the space or something else, but I was getting the same error you were seeing. I didn't want to get into modifying config files if I didn't have to, and luckily I ended up finding a solution. I ended up skipping the -s --no-minimize-url options in favor of explicitly declaring the paths differently.

$ git svn clone http://[url]/svn/[repo]/ \
            --trunk="/[path-to-code]/trunk" \
            --branches="/[path-to-code]/branches" \
            --tags="/[path-to-code]/tags" \
            -A authors.txt \
            --follow-parent

  • Note that I added --follow-parent from your example, but I'm not sure that it made any difference.
  • Remember that these repos had spaces in them, hence the "" around the trunk/branches/tags paths.

Answer by Owen

[I realize this should be a comment on Jeff Fairley's answer but I don't have the reputation to post it as such. Since the original poster did ask for confirmation the approach worked I'm providing it as an answer.]

I can confirm that his solution works for the problem he (and I) ran into caused by spaces in the path. I had the same requirements (clone a single module from an SVN repo with history) except that I had no branches or tags to worry about whatsoever.

I tried several permutations of providing the full path to the module in the URL (e.g. using --no-minimize-url, specifying --trunk or --stdlayout) with no success. For me the result was usually a git repo with a full history log but no files whatsoever. This may or may not be the same problem FooF encountered (no read access in SVN), but it was certainly caused by having a space in the path to my module.

Trying again with only the SVN repo base as the URL and the path to my module in --trunk worked flawlessly. Afterwards my .git/config looks like this:

[core]
        repositoryformatversion = 0
        filemode = false
        bare = false
        logallrefupdates = true
        symlinks = false
        ignorecase = true
        hideDotFiles = dotGitOnly
[svn-remote "svn"]
        url = https://[url]/svn/[repo]
        fetch = trunk/[path-to-code]:refs/remotes/trunk
[svn]
        authorsfile = ~/working/authors-transform.txt

and subsequent git and git svn commands are throwing no errors at all. Thanks Jeff!

Answer by FooF

[This is the original poster writing. The text below used to be an update to the question, but as it solved the case, albeit unsatisfactorily to my taste, I will post it as an answer for lack of a better solution.]

I do not like this, but I ended up splitting clone into init and fetch, with some editing of .git/config in between (repopath=apps/module, gitreponame=module):

$ git svn init --username=mysvnusername \
            --branches=/src/branches/ \
            --trunk=/src/trunk/${repopath} \
            --tags=/src/tags/ \
            http://svnserver/svn/src ${gitreponame}
$ cd ${gitreponame}
$ sed -i.bak "s|*:|*/${repopath}:|" .git/config
$ git svn fetch --authors-file=../authors.txt --follow-parent

I could not find how to specify the branches for a subdirectory migration with git svn, hence the editing of the .git/config file. The following unified diff illustrates the effect of the editing with sed:

 [svn-remote "svn"]
        url = http://svnserver/svn/src
        fetch = trunk/apps/module:refs/remotes/trunk
-       branches = branches/*:refs/remotes/*
-       tags = tags/*:refs/remotes/tags/*
+       branches = branches/*/apps/module:refs/remotes/*
+       tags = tags/*/apps/module:refs/remotes/tags/*
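
The effect of that sed expression can be checked on a scratch copy before touching the real .git/config. The sketch below writes the generated section to a temporary file and applies the same substitution. The asterisk is escaped here so that it is unambiguously literal, which is a small deviation from the command above; the temporary file and the variable names cfg and rewritten are assumptions for illustration.

```shell
#!/bin/sh
# Reproduce the sed edit on a throwaway copy of the generated section.
repopath=apps/module
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
[svn-remote "svn"]
        url = http://svnserver/svn/src
        fetch = trunk/apps/module:refs/remotes/trunk
        branches = branches/*:refs/remotes/*
        tags = tags/*:refs/remotes/tags/*
EOF

# Insert the sub-directory after the "*" that is directly followed by ":".
# Only the left-hand (Subversion) side of each mapping matches that pattern.
rewritten=$(sed "s|\*:|*/${repopath}:|" "$cfg")
printf '%s\n' "$rewritten"
rm -f "$cfg"
```

The fetch line has no asterisk and is left alone, while the branches and tags lines gain /apps/module on their Subversion side, matching the diff above.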

As the actual desired HEAD was at another URL, I ended up just adding another [svn-remote] section to .git/config:

+ [svn-remote "svn-newest"]
+       url = http://svnserver/svn/src
+       fetch = branches/x/y/apps/module:refs/remotes/trunk
+       branches = branches/*/apps/module:refs/remotes/*
+       tags = tags/*/apps/module:refs/remotes/tags/*

(in the real-life experiment I also added here some branches that were not picked up by the first fetch), and fetching again:

$ git svn fetch --authors-file=../authors.txt --follow-parent svn-newest

This way I ended up with the full Subversion history migrated to the newly generated git repository.

Note-1: I probably could have just declared my "trunk" to be branches/x/y/apps/module, as "trunk" for git-svn seems to basically mean git HEAD (the Subversion concepts of trunk, branches and tags have no deep technical basis; they are a matter of socially agreed convention).

Note-2: --follow-parent is probably not required for git svn fetch, but I have no way of checking or experimenting now.

Note-3: When I earlier read about svn2git, which seems to be a wrapper around git-svn, I failed to see the motivation, but having seen the messy presentation of tags I kind of get it now. I would try svn2git next time if I had to do this again.

P.S. This is a rather awkward way of doing the operation. The secondary problem here (why external editing of .git/config was required) seems to be that

  1. Subversion branches do not have any essential technical meaning (branches and tags in Subversion are just socially agreed labels for a versioned file system copy, together with a "standard" or otherwise socially agreed convention for where the copies are placed; trunk also has no technical meaning), and
  2. the git svn implementation assumes, to a degree, that the social Subversion conventions are followed (which is not possible if you just want to migrate a subdirectory and not the whole Subversion repository).

TODO: It would be helpful to have the format of the .git/config file explained here as it relates to git svn. For example, I now (a year and a half after writing the original answer) have no idea what the [svn-remote "svn-newest"] above means. The approach could also be automated by writing a script, but this is beyond my current interest in the problem, and I no longer have access to the original Subversion repository or a reproduction of the issue.
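
As far as the git-svn documentation describes it, each [svn-remote "<name>"] section describes one Subversion location to track, and the section name can be given to git svn fetch to choose which one to fetch. The annotated fragment below repeats the svn-newest section from above with explanatory comments added. Treat the comments as a sketch of the format under that reading, not as a full reference.

```ini
# One [svn-remote "<name>"] section per tracked Subversion location.
# "svn" is the name git svn init creates by default; "svn-newest" is
# simply a second, hand-written remote.
[svn-remote "svn-newest"]
        # Base URL that git svn connects to.
        url = http://svnserver/svn/src
        # <path below url>:<git ref that receives that path's history>
        fetch = branches/x/y/apps/module:refs/remotes/trunk
        # Same kind of mapping with a glob: every directory matched by *
        # on the left becomes one ref, with * filled in on the right.
        branches = branches/*/apps/module:refs/remotes/*
        tags = tags/*/apps/module:refs/remotes/tags/*
```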
