Managing documents using Git
Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow
Original: http://stackoverflow.com/questions/4655533/
Asked by Anush Shetty
I am working on a website where I will be able to create projects and upload data to each of my products. The data will mostly be spreadsheets, images, PDFs, etc. Ideally, I would like a VCS-style setup (preferably git) where, each time I update a particular document, I can just commit that document to a repo. Any ideas on how I could go about implementing this would be helpful.
Answered by cezio
You can call git in a subshell after each upload.
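A minimal sketch of what "calling git after each upload" could look like, assuming the upload handler has already written the file into an existing repository working tree. The function name `commit_upload` and its parameters are hypothetical. Passing the uploader's name and e-mail through `-c user.name=... -c user.email=...` is one way to avoid every commit being attributed to the single system account.

```python
import subprocess


def commit_upload(repo_dir, relative_path, uploader_name, uploader_email):
    """Stage one uploaded file and commit it; return the new commit hash,
    or None when the upload did not change the file."""

    def git(*args):
        # Run git inside repo_dir and fail loudly on any error.
        result = subprocess.run(
            ["git", "-C", str(repo_dir), *args],
            check=True,
            capture_output=True,
            text=True,
        )
        return result.stdout

    git("add", "--", relative_path)

    # If nothing is staged for this path, the upload was byte-identical.
    staged = git("diff", "--cached", "--name-only", "--", relative_path)
    if not staged.strip():
        return None

    # Assumption: the web application knows who uploaded the file.
    git(
        "-c", f"user.name={uploader_name}",
        "-c", f"user.email={uploader_email}",
        "commit", "-q", "-m", f"Update {relative_path}",
    )
    return git("rev-parse", "HEAD").strip()
```

The web application would call this once per upload, for example `commit_upload("/srv/docs", "budget.xlsx", "Alice", "alice@example.com")`, where the path and identity are placeholders. Concurrent uploads would still need some form of locking around the call.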
But I don't think any VCS is a good solution for document versioning, especially in a web application. With office-like documents you will mostly be handling binary data, and VCSs handle binary data badly (no exceptions). You will not be able to produce any diff, and the metadata management is not suited to this use: the author of a commit is mostly bound to a particular account (and you will probably be using one system account for git), and no additional information is stored beyond basic file information (size, permissions, ctime), so you will have to store the rest (authorship, permissions for web application users, additional metadata) somewhere nearby yourself. Also note that several users can commit data at the same time, so there will be branches in your versioning. And once you have a huge dataset (with binary office files that can happen sooner than you think), you will not be able to partition such a repository.
IMO, using a VCS here gives you very little gain and introduces additional problems.
I'd advise keeping metadata in a database (file name, revisions, additional data) and keeping file revisions on disk. Keep each file, together with its revisions, in a separate, unique directory. One tip here: don't use the file names that come from the upload. Use a hash function to compute a unique name based on content and metadata.
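The layout described above can be sketched roughly as follows, assuming SQLite for the metadata and SHA-256 for the naming. The table schema, the function names `open_metadata_db` and `store_revision`, and the choice of which metadata goes into the hash are all illustrative assumptions, not something the answer specifies.

```python
import hashlib
import sqlite3
from pathlib import Path


def open_metadata_db(db_path):
    """Open (or create) the metadata database. The schema is an assumption."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS revisions (
               document_id   TEXT    NOT NULL,
               revision      INTEGER NOT NULL,
               original_name TEXT    NOT NULL,
               uploaded_by   TEXT    NOT NULL,
               stored_name   TEXT    NOT NULL,
               PRIMARY KEY (document_id, revision)
           )"""
    )
    return conn


def store_revision(conn, storage_root, document_id, original_name,
                   uploaded_by, content):
    """Write one revision to disk under a hashed name and record its metadata."""
    row = conn.execute(
        "SELECT COALESCE(MAX(revision), 0) + 1 FROM revisions"
        " WHERE document_id = ?",
        (document_id,),
    ).fetchone()
    revision = row[0]

    # The stored name is derived from metadata plus content, never from the
    # uploaded file name, so it is safe to use as a path component.
    digest = hashlib.sha256()
    digest.update("|".join((document_id, str(revision), uploaded_by)).encode("utf-8"))
    digest.update(content)
    stored_name = digest.hexdigest()

    # One directory per document, also named by a hash.
    doc_dir = Path(storage_root) / hashlib.sha256(
        document_id.encode("utf-8")).hexdigest()[:16]
    doc_dir.mkdir(parents=True, exist_ok=True)
    path = doc_dir / stored_name
    path.write_bytes(content)

    with conn:  # commits the insert, or rolls back on error
        conn.execute(
            "INSERT INTO revisions VALUES (?, ?, ?, ?, ?)",
            (document_id, revision, original_name, uploaded_by, stored_name),
        )
    return revision, path
```

Because the revision number is part of the hashed input, re-uploading identical content still produces a distinct stored file, which keeps the history explicit at the cost of some disk space. Hashing the content alone would deduplicate instead.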
Answered by VonC
There isn't a universal "commit on save" feature (at least not one integrated with all the editors associated with the document types you mention).
The easiest way would be a background job which would commit (or 'git add -A && git commit -m "xxx"' in the case of Git) every 5 minutes, for instance.
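As a rough sketch of such a background job, assuming an existing Git repository and a plain Python loop rather than cron: the names `snapshot` and `run_periodically`, the `autocommit` identity, and the commit message format are all hypothetical.

```python
import subprocess
import time

# Hypothetical identity for the background job, so the commit does not
# depend on a global git configuration being present.
AUTO_IDENTITY = ["-c", "user.name=autocommit",
                 "-c", "user.email=autocommit@localhost"]


def git(repo_dir, *args):
    """Run git inside repo_dir and return its standard output."""
    result = subprocess.run(
        ["git", "-C", str(repo_dir), *args],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout


def snapshot(repo_dir, message="Automatic snapshot"):
    """Equivalent of 'git add -A && git commit -m ...'.
    Returns False when there was nothing to commit."""
    git(repo_dir, "add", "-A")
    if not git(repo_dir, "status", "--porcelain").strip():
        return False
    git(repo_dir, *AUTO_IDENTITY, "commit", "-q", "-m", message)
    return True


def run_periodically(repo_dir, interval_seconds=300, max_runs=None):
    """Take a snapshot every interval_seconds (300 s = 5 minutes).
    max_runs=None loops forever; a number makes the loop finite."""
    runs = 0
    while max_runs is None or runs < max_runs:
        snapshot(repo_dir, time.strftime("Automatic snapshot %Y-%m-%d %H:%M:%S"))
        runs += 1
        if max_runs is None or runs < max_runs:
            time.sleep(interval_seconds)
```

Calling `run_periodically("/path/to/repo")` from a long-running process would commit any pending changes every 5 minutes. Running `snapshot` from a cron entry instead would avoid keeping a process alive.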
Actually, Mark Longair comments:
flashbake is designed to be run from cron to do what you describe in the second paragraph, with some kind of reasonable commit message.
I'm not sure that that's what the original poster is after, though.
- Automated backup is nice unless you have files for which you want to view an incremental history.
- Source control is great for that history but most tools expect the author to manually commit their changes along the way.
- => A seamless source control solution combines the convenience of automated back up with the power of source version control.
Answered by Sean Allred
As a branch off of cezio's answer, if you would really like to use a VCS for version control, consider LaTeX. Since it is essentially source code that is compiled into a document (usually a PDF via pdflatex), it's a reasonable candidate for version control.
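To illustrate why plain-text LaTeX source suits version control, here is a small sketch that diffs two revisions of a made-up document line by line, which is what a VCS can do for text but not for binary office files. It uses Python's standard `difflib`; the document content and file labels are invented for the example.

```python
import difflib

# Two hypothetical revisions of the same LaTeX source file.
old_revision = r"""\documentclass{article}
\begin{document}
Quarterly revenue grew by 4\%.
\end{document}
""".splitlines(keepends=True)

new_revision = r"""\documentclass{article}
\begin{document}
Quarterly revenue grew by 5\%.
\end{document}
""".splitlines(keepends=True)

# A unified diff shows exactly which source line changed between revisions.
diff_text = "".join(
    difflib.unified_diff(
        old_revision,
        new_revision,
        fromfile="report.tex (rev 1)",
        tofile="report.tex (rev 2)",
    )
)
print(diff_text)
```

The diff marks the old sentence with a leading `-` and the new one with a leading `+`, so a reviewer can see the change without opening the compiled PDF.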