Using Git for writing a thesis
Notice: this page is based on a Chinese-English translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow
Original question: http://stackoverflow.com/questions/7775881/
Asked by nixnotwin
I am planning to use Git for writing my thesis in LaTeX. As Git is designed specifically for software development, would it suit my requirements? If it is a good choice, which of Git's features are particularly useful for writing a thesis? I also want to know what precautions I should take before getting into the Git workflow. I am a complete beginner with Git, so where should I start?
Accepted answer by gpoo
There are technical considerations and there are best practices. I'll focus on the latter, specifically for writing your thesis and/or papers. For the technical side, you can check any git tutorial.
1. Define the directory structure for your thesis. You can change it later and use git to track the changes, but a good structure will make your life easier.
2. Work with multiple files (use `\include` and/or `\input` in LaTeX). You can split them by chapter or section. This makes it easier to track changes that involve specific parts of your thesis (e.g. `git log content/introduction.tex`).
3. Track only the files you are going to edit, not the auto-generated ones. A proper `.gitignore` file will help you a lot (LaTeX generates plenty of working files).
4. As in programming, make micro-commits, that is: one commit per idea/feature/fix/activity.
5. Every time you commit, write a meaningful, high-level message that explains what you were trying to achieve with the change. After a week you might not remember what you were trying to accomplish.
6. Keeping track of every activity/idea/fix [see (4) and (5)] can be very helpful for knowing how much you have done (using `git log`). You can write the progress report for your supervisor(s) based on `git log`. Moreover, you can share the repository with your supervisors (using a web interface), and they can check what you have been doing in your thesis. At the next meeting, they will know what to expect (it will depend on how fond your supervisors are of following an RSS feed).
7. Using git will help keep you in a good mood (sometimes you will feel you have not done much, but having a record of every change will help you keep things in perspective).
8. For every progress report you send, create a tag. For the next report, you can check out both versions and run latexdiff on them. This is useful for tracking changes between the versions you submit for review, and it helps you check whether you addressed the feedback you received on the previous report.
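The advice on ignoring generated files, making small commits and tagging each report could be sketched roughly as follows. This is only an illustration, not part of the original answer: the file names (`main.tex`, `content/introduction.tex`), the tag names (`report-1`, `report-2`), the author identity and the `.gitignore` entries are all assumptions, and the latexdiff step runs only if latexdiff happens to be installed.

```shell
#!/bin/sh
set -e

# Work in a throwaway directory so the sketch does not touch real files.
repo=$(mktemp -d)
cd "$repo"
git init -q
# Placeholder identity, stored in this repository only.
git config user.name "Thesis Author"
git config user.email "author@example.com"

# Ignore typical LaTeX working files (an illustrative, incomplete list).
cat > .gitignore <<'EOF'
*.aux
*.log
*.out
*.toc
*.bbl
*.blg
*.synctex.gz
EOF

# One file per chapter, pulled in from a main file.
mkdir content
printf '%s\n' '\documentclass{report}' '\begin{document}' \
  '\input{content/introduction}' '\end{document}' > main.tex
printf '%s\n' 'Version control helps when writing a thesis.' \
  > content/introduction.tex

# One commit per idea, each with a message that explains the intent.
git add .gitignore main.tex content/introduction.tex
git commit -q -m "Set up thesis skeleton with an introduction stub"
git tag report-1        # snapshot sent with the first progress report

printf '%s\n' 'Version control helps a lot when writing a thesis.' \
  > content/introduction.tex
git commit -q -am "Introduction: strengthen the motivation sentence"
git tag report-2        # snapshot sent with the second progress report

# History of a single chapter.
git log --oneline -- content/introduction.tex

# Compare the two reports with latexdiff, if it is available.
if command -v latexdiff >/dev/null 2>&1; then
  git show report-1:content/introduction.tex > old-introduction.tex
  latexdiff old-introduction.tex content/introduction.tex \
    > diff-introduction.tex || echo "latexdiff could not process the files" >&2
fi
```

Tags are cheap, so tagging every version you send out costs nothing and gives you a stable name to diff against later.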
Last but not least, I recommend reading "A successful Git branching model". It is a very short article on a git workflow, and you can apply the same concepts when you write your thesis. For instance, if you are writing up an experiment, you can create a branch for it and merge it once it is "ready". If you have to revisit it later, it will be easier to see which changes were involved and why.
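A minimal sketch of that branch-per-experiment idea, under stated assumptions: the branch names (`main`, `experiment-1`), the file names and the author identity are made up for illustration, and the article describes a fuller model than this.

```shell
#!/bin/sh
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main   # name the initial branch "main"
# Placeholder identity, stored in this repository only.
git config user.name "Thesis Author"
git config user.email "author@example.com"

printf '%s\n' 'Chapter outline.' > thesis.tex
git add thesis.tex
git commit -q -m "Add chapter outline"

# Start the experiment on its own branch.
git checkout -q -b experiment-1
printf '%s\n' 'Setup and results of experiment 1.' > experiment-1.tex
git add experiment-1.tex
git commit -q -m "Experiment 1: describe setup and results"

# Once the write-up is "ready", merge it back. --no-ff keeps a merge
# commit, so the whole experiment stays visible as one unit in the history.
git checkout -q main
git merge -q --no-ff -m "Merge experiment 1 write-up" experiment-1

git log --oneline --graph
```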
Answer by Mark Longair
When I was writing my PhD thesis [1], I used git to manage the document and all its figures, and I'm very glad that I did so, not least because it makes it easy to write a script that graphs your progress as you're going along ;) The chief advantages I found were:
- Since git is a distributed version control system, it's easy to work on multiple machines. If you need the latest version from your laptop on your desktop machine, you can just `pull` directly from the laptop and work there. When you leave, you go to your laptop and pull from the desktop machine.
- If you work on multiple machines, you effectively have a recent backup of your work (including its complete history), and if you want to create further backups you can just push to a new bare repository elsewhere (as VonC's answer points out).
- You can make large changes to your document knowing that the previous version is securely stored, and that if you want to retrieve the old version, that's easy to do.
- Being able to commit to your repository when you're offline is very useful, particularly since not having internet access makes it much easier to write ;) I also kept PDFs of all the papers I cited in the same repository to make it easier to work offline, although this vastly inflated the repository, so some might advise against that.
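The multi-machine workflow could look roughly like this. As a sketch only: the two "machines" are simulated as the directories `laptop` and `desktop`, and `backup.git` stands in for a bare repository somewhere else; in practice these would usually be ssh or network paths. All names and the author identity are assumptions.

```shell
#!/bin/sh
set -e

work=$(mktemp -d)
cd "$work"

# "laptop" holds the thesis repository.
git init -q laptop
cd laptop
# Placeholder identity, stored in this repository only.
git config user.name "Thesis Author"
git config user.email "author@example.com"
printf '%s\n' 'Draft of chapter 1.' > chapter1.tex
git add chapter1.tex
git commit -q -m "Draft chapter 1"
cd ..

# "desktop" starts as a clone of the laptop repository.
git clone -q laptop desktop

# More work happens on the laptop...
cd laptop
printf '%s\n' 'Draft of chapter 2.' > chapter2.tex
git add chapter2.tex
git commit -q -m "Draft chapter 2"
cd ..

# ...and the desktop pulls it directly from the laptop.
cd desktop
git pull -q
cd ..

# Extra backup: push every branch to a new bare repository elsewhere.
git init -q --bare backup.git
cd laptop
git push -q ../backup.git --all
```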
The chief advice that I'd give:
- Commit frequently, and always make sure that you keep the output of `git status` clean, either by adding the files you need or by listing them in `.gitignore`. You don't want to risk having important files untracked.
- Never use history rewriting commands (e.g. `git rebase`), just to be safe, and never use git's dangerous commands like `git reset --hard` and `git checkout -f`. No one will ever see your complete repository, so you don't need to care what the history looks like. It's much more important that you don't do anything that might lose (or make it more difficult to retrieve) your work.
- When you're looking at differences between your versions, use the `--color-words` option to `git diff`. Otherwise, your diffs will be line-based, and if you reformat a paragraph in LaTeX, it'll be hard to see what the real changes are. `git diff --color-words` ignores the line breaks and just shows the old words in red and the new words in green.
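To see the difference between line-based and word-based diffs, here is a small sketch. The file name `para.tex`, its contents and the author identity are invented for the example; `--word-diff=plain` is included as a variant that marks changes with text markers instead of colours, which is easier to read in a log or a pipe.

```shell
#!/bin/sh
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
# Placeholder identity, stored in this repository only.
git config user.name "Thesis Author"
git config user.email "author@example.com"

# A paragraph wrapped over two lines.
printf '%s\n' 'The quick brown fox jumps' 'over the lazy dog.' > para.tex
git add para.tex
git commit -q -m "Add example paragraph"

# Re-wrap the paragraph and change one word ("quick" -> "fast").
printf '%s\n' 'The fast brown' 'fox jumps over the' 'lazy dog.' > para.tex

# Short status: lists modified and untracked files, one per line.
git status --short

# A line-based diff marks every re-wrapped line as changed...
git diff
# ...while a word-based diff highlights only the word that changed.
git diff --color-words
# The same comparison with plain-text markers instead of colours.
git diff --word-diff=plain
```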
[1] ... with LyX rather than directly in LaTeX, but the issues are essentially the same.
Answer by tripleee
This is mainly just meant as a comment, but it turned out a bit too long, so I am posting it as an answer.
I used darcs for my Master's thesis, and have been using RCS, CVS, SVN, and Git for lots of documentation / writing projects in the past. All of these tools provide the basic feature I want -- ability to review my changes, go back in history, check in "undo points" when I start writing something new.
There are tried and tested recommendations for writing documentation under version control. Using a text-only source format is important for getting sane diffs. In addition, a useful tip I picked up (IIRC from Kernighan, writing about keeping Troff source in version control) is to make sure all lines are reasonably short. I tend to whack enter every few lines, with an eye towards keeping one particular clause or idiom on one line, so that the diff will be minimal if I decide to revise that particular detail later.
Answer by imichaelmiers
Git will work. LaTeX is effectively source code, so it should be perfectly fine.
That said, Git, while awesome, has a slightly steep learning curve because it supports a lot of things for collaborating with multiple people, handling diverging histories, etc. Its really big advantage is in merging conflicting changes (what happens if I change a file and someone else changes it too and we both try to upload/commit it to some server?).
If you just want to version your thesis, you are unlikely to even hit the conflicting merge case (since you are the only one editing it), let alone the multiple histories case.
I'd use something simpler like SVN, which while worse for doing the two things I described, fits your needs and is easier to learn.
Also, git stores everything in a .git directory inside the folder you are working in. If you delete that folder, your data is gone.
Answer by VonC
In a DVCS, a "workflow" means:
- merge workflow (which you shouldn't need that much in your case)
- publication workflow (push to a remote repo)
With your local .git repo, you will be able to compare with previous versions (which can come in handy).
But the benefit of a DVCS is when:
- you save your work through a push to a remote repo (or, for backup purposes, a bundle)
- you synchronize your work between two different PCs (as in "How to push a local git repository to another computer?" or in "git server between laptop and PC (MS Windows 7)").
Then, once the sync is done (through a `git push`), you can take your second environment completely off-line, and still benefit from the full history of your repo.
That is where a DVCS matters in your case.
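The bundle option mentioned above might be used like this. A sketch under assumptions: the repository name `thesis`, the file `thesis.bundle`, the `restored` directory and the author identity are all placeholders.

```shell
#!/bin/sh
set -e

work=$(mktemp -d)
cd "$work"
git init -q thesis
cd thesis
# Placeholder identity, stored in this repository only.
git config user.name "Thesis Author"
git config user.email "author@example.com"
printf '%s\n' 'Abstract.' > abstract.tex
git add abstract.tex
git commit -q -m "Add abstract"

# Pack every branch and tag, with full history, into a single file.
git bundle create "$work/thesis.bundle" --all

# Check that the bundle is valid and self-contained.
git bundle verify "$work/thesis.bundle"

# A bundle can be cloned like any other repository, e.g. on the second PC.
git clone -q "$work/thesis.bundle" "$work/restored"
```

Because a bundle is a single ordinary file, it can be copied to a USB stick or any storage that cannot host a git repository.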