Migrate from CVS to Git without losing history

Disclaimer: this page is based on a Chinese-English translation of a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and authors, and attribute it to the original authors (not me). Original StackOverflow post: http://stackoverflow.com/questions/20869710/

Time: 2020-09-10 17:30:38  Source: igfitidea

Tags: git, version-control, cvs

Asked by Ahmed Alaa

I need to know if there is a way to migrate my code from CVS source control to Git?

If yes, what about my history of commits?

Accepted answer by John Szakmeister

I've not personally done a conversion from CVS to Git, but I believe Eric Raymond's cvs-fast-export is the tool to use. He has the man page posted here. cvsps is another tool maintained by Eric, but it has recently been deprecated in favor of cvs-fast-export. cvs2git is another tool, built on some of the same machinery as cvs2svn. The latter was extremely adept, so I have high hopes that cvs2git is equally good.

One thing to note: CVS is a pretty broken RCS. It's possible for it to hold content that can't be reflected exactly in Git. IOW, there is some impedance mismatch there, but the tools try very hard to preserve as much as possible. Make sure to check your conversion and that you're happy with the results. You may need to fix up part of the Git history to get something more acceptable, but I doubt you'll need to.
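
One way to act on that advice is to run a few sanity checks on the converted repository and compare the numbers with what you know about the CVS side. The following is a minimal sketch rather than a complete audit: the helper name check_conversion and its single path argument are my own invention, and it relies only on stock Git commands.

```shell
#!/bin/sh
# check_conversion REPO: print basic facts about a converted repository so
# they can be compared with the CVS history by hand. (Hypothetical helper.)
check_conversion() {
    repo=$1
    # Object-level integrity check; a non-zero status means the import is damaged.
    git -C "$repo" fsck --no-progress || return 1
    echo "commits:  $(git -C "$repo" rev-list --all --count)"
    echo "branches: $(git -C "$repo" for-each-ref refs/heads | wc -l | tr -d ' ')"
    echo "tags:     $(git -C "$repo" for-each-ref refs/tags | wc -l | tr -d ' ')"
    echo "authors:"
    # Unique author identities; raw CVS logins here hint at a missing author mapping.
    git -C "$repo" log --all --format='  %an <%ae>' | sort -u
}
```

Calling check_conversion gitexport after an import prints the counts and the author list; unexpected zeros or raw CVS user names are the usual signs that something needs fixing.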

Answer by gaborous

Here is the process I used to migrate a SourceForge CVS repo to Git using cvs2git (latest stable release is here, but IIRC I used the GitHub dev version), which works on both Windows and Linux without any compilation required since it's just Python:

How to import from sourceforge CVS to git.
First, you need to download/checkout the cvs repo with the whole history (not just checkout the HEAD/Trunk):

rsync -av rsync://PROJECT.cvs.sourceforge.net/cvsroot/PROJECT/\* cvs  

then use cvs2git (python script, works on all platforms, no compilation needed):

python cvs2git --blobfile="blob.dat" --dumpfile="dump.dat" --username="username_to_access_repo" --options=cvs2git.options --fallback-encoding utf-8 cvs  

This should have generated two files, blob.dat and dump.dat, containing all your CVS history. You can open them in a text editor to check that the content seems correct.

then initialize your git repo inside another folder:

mkdir gitexport/
cd gitexport
git init  

then load up the exported cvs history onto git:

cat ../{blob,dump}.dat | git fast-import  
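
To get a feel for what git fast-import consumes, the sketch below builds a tiny hand-written stream and loads it the same way. It assumes the split used above (file contents in one file, commits that refer to them by mark in the other); the author, file name, and temporary paths are made up, and real cvs2git output is far larger.

```shell
#!/bin/sh
set -e
work=$(mktemp -d)

# Blob part: file contents, each labelled with a mark (":1").
printf 'blob\nmark :1\ndata 6\nhello\n\n' > "$work/blob.dat"

# Dump part: a commit that refers to the blob through its mark.
cat > "$work/dump.dat" <<'EOF'
commit refs/heads/master
mark :2
committer Jane Doe <jane@example.com> 1388534400 +0000
data 15
initial import
M 100644 :1 hello.txt

EOF

mkdir "$work/gitexport"
cd "$work/gitexport"
git init -q
# Make HEAD follow the imported branch even if the local default branch differs.
git symbolic-ref HEAD refs/heads/master
cat ../blob.dat ../dump.dat | git fast-import --quiet
git reset -q --hard          # fast-import does not touch the working tree
git log --format='%an: %s'   # lists the single imported commit
```

The byte counts after each data keyword must match the payload exactly, which is why such streams are normally generated by a tool rather than written by hand.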

and then check out the tip of the imported history into the working tree (git fast-import updates the refs but leaves the working tree untouched):

git reset --hard  

finally and optionally, you can push to your remote git repository:

git push -u origin master  

Of course, before that you need to run git remote add origin https://your_repo_url
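
A CVS conversion usually produces several branches plus tags, while git push -u origin master uploads only master. The sketch below pushes everything. It is an illustration under assumptions: a bare repository in a temporary directory stands in for https://your_repo_url, and one empty commit with a made-up tag stands in for the imported history.

```shell
#!/bin/sh
set -e
work=$(mktemp -d)

# Stand-in for the hosted repository (normally https://your_repo_url).
git init -q --bare "$work/remote.git"

# Stand-in for the freshly imported repository: one commit and one tag.
git init -q "$work/gitexport"
cd "$work/gitexport"
git symbolic-ref HEAD refs/heads/master
git fast-import --quiet <<'EOF'
commit refs/heads/master
mark :1
committer Jane Doe <jane@example.com> 1388534400 +0000
data 15
initial import

reset refs/tags/release-1
from :1

EOF

git remote add origin "$work/remote.git"
git push -q -u origin --all    # every local branch, with upstream tracking
git push -q origin --tags      # every tag
```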

Note: cvs2git.options is a configuration file for cvs2git, written in Python syntax, where you can specify transforms for various things like author names (so that their nicknames will be automagically transformed to their full name after import). See the documentation here or the included example options file.

Also, you don't need to own the repo with this method: you can migrate SourceForge projects that you don't own (you just need the right to check out, so this works on any public repo).

Answer by Chris

You can use git-cvsimport to import your CVS repository into Git. By default, this will check out every revision, giving you a relatively complete history.

Depending on your operating system, you may need to install support for this separately. For example, on an Ubuntu machine you would need the git-cvs package.

This answer goes into more detail.
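
A typical invocation might look like the sketch below. The CVSROOT, module name, target directory, and author mapping are placeholders I made up; -d, -C, -A, and -v are options from the git-cvsimport manual. The script only prints the command unless RUN_IMPORT=1 is set, because a real run needs cvs and cvsps installed and access to the CVS server.

```shell
#!/bin/sh
# All values below are hypothetical placeholders; adjust them to your project.
cvsroot=':pserver:anonymous@cvs.example.org:/cvsroot/project'
module='project'
target='project-git'
authors=$(mktemp)

# Author map for -A: one "cvs_login=Full Name <email>" entry per line.
printf '%s\n' 'jdoe=Jane Doe <jane@example.com>' > "$authors"

# -d: CVSROOT, -C: directory of the Git repository to create or update,
# -A: author conversion file, -v: verbose progress output.
set -- git cvsimport -v -d "$cvsroot" -A "$authors" -C "$target" "$module"
echo "$*"

# Dry run by default; set RUN_IMPORT=1 to execute the command for real.
if [ "${RUN_IMPORT:-0}" = 1 ]; then
    "$@"
fi
```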

Answer by crististm

I recently (2016) used Eric Raymond's reposurgeon to import a CVS repo from SourceForge to Git. I was very pleasantly surprised by how well it worked. After past experiences with cvs2svn and other tools, I recommend reposurgeon without hesitation for this kind of task.

Eric has posted a straightforward migration guide here

Answer by VonC

gaborous's answer uses git fast-import, which could fail on log messages not encoded in UTF-8.

That will work better with Git 2.23 (Q3 2019): the "git fast-export/import" pair has been taught to better handle commits with log messages in an encoding other than UTF-8.

See commit e80001f, commit 57a8be2, commit ccbfc96, commit 3edfcc6, commit 32615ce (14 May 2019) by Elijah Newren (newren).
(Merged by Junio C Hamano -- gitster -- in commit 66dc7b6, 13 Jun 2019)

fast-export: do automatic reencoding of commit messages only if requested

Automatic re-encoding of commit messages (and dropping of the encoding header) hurts attempts to do reversible history rewrites (e.g. sha1sum <-> sha256sum transitions, some subtree rewrites), and seems inconsistent with the general principle followed elsewhere in fast-export of requiring explicit user requests to modify the output (e.g. --signed-tags=strip, --tag-of-filtered-object=rewrite).
Add a --reencode flag that the user can use to specify, and like other fast-export flags, default it to 'abort'.

That means the Documentation/git-fast-export now includes:

 --reencode=(yes|no|abort)::

Specify how to handle the encoding header in commit objects.

  • When asking to 'abort' (which is the default), this program will die when encountering such a commit object.
  • With 'yes', the commit message will be reencoded into UTF-8.
  • With 'no', the original encoding will be preserved.

fast-export: avoid stripping encoding header if we cannot reencode

When fast-export encounters a commit with an 'encoding' header, it tries to reencode in UTF-8 and then drops the encoding header.
However, if it fails to reencode in UTF-8 because e.g. one of the characters in the commit message was invalid in the old encoding, then we need to retain the original encoding or otherwise we lose information needed to understand all the other (valid) characters in the original commit message.

fast-import: support 'encoding' commit header

Since git supports commit messages with an encoding other than UTF-8, allow fast-import to import such commits.
This may be useful for folks who do not want to reencode commit messages from an external system, and may also be useful to achieve reversible history rewrites (e.g. sha1sum <-> sha256sum transitions or subtree work) with Git repositories that have used specialized encodings in their commit history.

The Documentation/git-fast-import now includes:

encoding

The optional encoding command indicates the encoding of the commit message.
Most commits are UTF-8 and the encoding is omitted, but this allows importing commit messages into git without first reencoding them.
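
Both halves can be tried on a throwaway repository. The sketch below assumes Git 2.23 or later; the committer identity, the message, and the temporary paths are invented. It imports a commit whose message is declared as iso-8859-7 and then exports it under each --reencode setting.

```shell
#!/bin/sh
set -e
work=$(mktemp -d)
repo="$work/repo.git"
git init -q --bare "$repo"

# The message "Pi: <0xF0>" plus newline is 6 bytes in iso-8859-7;
# printf emits the raw byte from the octal escape \360.
printf 'commit refs/heads/master\ncommitter Jane Doe <jane@example.com> 1557792000 +0000\nencoding iso-8859-7\ndata 6\nPi: \360\n\n' |
    git -C "$repo" fast-import --quiet

# 'no' keeps the encoding header and the original bytes.
git -C "$repo" fast-export --reencode=no master > "$work/keep.txt"
# 'yes' asks for a UTF-8 message without the header.
git -C "$repo" fast-export --reencode=yes master > "$work/utf8.txt"
# The default, 'abort', refuses to export such a commit.
if git -C "$repo" fast-export master > /dev/null 2>&1; then
    default_result=exported
else
    default_result=aborted
fi
echo "default: $default_result"

# Count encoding headers in each export; the second count is expected
# to be 0 when Git is able to convert the message.
LC_ALL=C grep -c '^encoding ' "$work/keep.txt"
LC_ALL=C grep -c '^encoding ' "$work/utf8.txt" || true
```

For a CVS migration, the practical consequence is that a converter can emit an encoding line instead of guessing at a conversion, and the choice to reencode can be deferred to a later export.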

See this test, which uses an author with non-ASCII characters in the name, but no special commit message.
It checks that the reencoding into UTF-8 worked by checking the size of the commit object:

The commit object, if not re-encoded, would be 240 bytes.

  • Removing the "encoding iso-8859-7\n" header drops 20 bytes.
  • Re-encoding the Pi character π from \xF0 (\360) in iso-8859-7 to \xCF\x80 (\317\200) in UTF-8 adds a byte.

Check for the expected size.
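
The arithmetic in that comment can be reproduced with printf and wc. This sketch only counts bytes and does not need Git; the variable names are mine, and 240 is the size quoted above.

```shell
#!/bin/sh
# Size of the header that disappears when the message is re-encoded.
header_bytes=$(( $(printf 'encoding iso-8859-7\n' | wc -c) ))
# Pi is one byte in iso-8859-7 (\360) and two bytes in UTF-8 (\317\200).
pi_old=$(( $(printf '\360' | wc -c) ))
pi_new=$(( $(printf '\317\200' | wc -c) ))

original_size=240
expected_size=$(( original_size - header_bytes + (pi_new - pi_old) ))
# prints: header=20 growth=1 expected=221
echo "header=$header_bytes growth=$((pi_new - pi_old)) expected=$expected_size"
```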

Answer by Mugeesh Husain

Migration from CVS to Git using cvs2svn

Sharing all the steps for migrating from CVS to Git:

1. Create a directory cvsProject in anyDir and rsync your CVS repo into it:
   rsync -av CVSUserName@CVSipAdrress:/CVS_Path/ProjectName/* ~/anyDir/ProjectName
2. cd ../cvs2svn-x.x.0 && ./cvs2git --options=cvs2git-example.options
3. ./cvs2git --blobfile=cvs2git-tmp/git-blob.dat \
       --dumpfile=cvs2git-tmp/git-dump.dat \
       --username=CVS_YOUR_USER_NAME \
       /path_of_step(1)/cvsProject
   Note: if you get an encoding error, add this to the command above: "--encoding=ascii --encoding=utf8 --encoding=utf16 --encoding=latin"
4. mkdir newGitRepo && cd newGitRepo
5. git init --bare
6. git fast-import --export-marks=/x.x.x/cvs2svn-2.5.0/cvs2git-tmp/git-marks.dat \

Now you are done and can push your repo to Git.
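
The --export-marks option matters when the blob file and the dump file are loaded in separate git fast-import runs: the first run saves the table of marks, and the second reads it back with --import-marks so that commits can resolve the blobs they refer to. The sketch below shows that mechanism on two tiny hand-written files. The file names follow the ones used in the steps above, but the contents, the author, and the temporary paths are invented, and this is not necessarily the exact command line the answer intended.

```shell
#!/bin/sh
set -e
work=$(mktemp -d)
tmp="$work/cvs2git-tmp"
mkdir "$tmp"

# Stand-ins for cvs2git output: one blob, and one commit that uses it.
printf 'blob\nmark :1\ndata 6\nhello\n\n' > "$tmp/git-blob.dat"
cat > "$tmp/git-dump.dat" <<'EOF'
commit refs/heads/master
mark :2
committer Jane Doe <jane@example.com> 1388534400 +0000
data 15
initial import
M 100644 :1 hello.txt

EOF

git init -q --bare "$work/newGitRepo"
cd "$work/newGitRepo"
# Run 1: load file contents and save the mark-to-object table.
git fast-import --quiet --export-marks="$tmp/git-marks.dat" < "$tmp/git-blob.dat"
# Run 2: reload the table so ":1" resolves, then load the history.
git fast-import --quiet --import-marks="$tmp/git-marks.dat" < "$tmp/git-dump.dat"
git log --format='%an: %s' master
```

Piping both files through a single git fast-import process, as in the earlier answer, avoids the marks file altogether; the two-run form is mainly useful when the files are too large or are produced at different times.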

Answer by rubo77

In order to clone a project from sourceforge to github I performed the following steps.

PROJECT=some_sourceforge_project_name
GITUSER=rubo77
rsync -av rsync://a.cvs.sourceforge.net/cvsroot/$PROJECT/\* cvs
svn export --username=guest http://cvs2svn.tigris.org/svn/cvs2svn/trunk cvs2svn-trunk
cp ./cvs2svn-trunk/cvs2git-example.options ./cvs2git.options
vim cvs2git.options # edit run_options.set_project
cvs2svn-trunk/cvs2git --options=cvs2git.options --fallback-encoding utf-8

Create an empty Git repository at https://github.com/$GITUSER/$PROJECT.git

git clone git@github.com:$GITUSER/$PROJECT.git $PROJECT-github
cd $PROJECT-github
cat ../cvs2git-tmp/git-{blob,dump}.dat | git fast-import
git log
git reset --hard
git push