Notice: this page reproduces a popular Stack Overflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license, cite the original URL and authors, and attribute it to the original authors (not me): Stack Overflow
Original source: http://stackoverflow.com/questions/4649015/
Java jar files into a repository (CVS, SVN..)
Asked by Neel
Why is it a bad idea to commit Java jar files into a repository (CVS, SVN, ...)?
Answer by Darin Dimitrov
Because you can rebuild them from the source. On the other hand, if you are talking about third-party JAR files which are required by your project, then it is a good idea to commit them into the repository so that the project is self-contained.
Answer by Riduidel
So, you have a project that uses some external dependencies. These dependencies are well known. They all have:
- A group (typically, the organization/forge creating them)
- An identifier (their name)
- A version
In Maven terminology, this information is called the artifact (your JAR) coordinates.
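The group/identifier/version triple described above can be sketched as a small value type. This is only an illustration: the `ArtifactCoordinates` name and its `parse` helper are hypothetical and not part of Maven's API, and the `group:artifact:version` shorthand is just the common way these coordinates are written in text.

```java
/** Hypothetical value type for the group / identifier / version triple. */
record ArtifactCoordinates(String groupId, String artifactId, String version) {

    /** Parses the "group:artifact:version" shorthand into its three parts. */
    static ArtifactCoordinates parse(String gav) {
        String[] parts = gav.split(":");
        if (parts.length != 3) {
            throw new IllegalArgumentException(
                    "Expected group:artifact:version, got: " + gav);
        }
        return new ArtifactCoordinates(parts[0], parts[1], parts[2]);
    }

    public static void main(String[] args) {
        // log4j is one of the external dependencies named in the answer.
        ArtifactCoordinates log4j = ArtifactCoordinates.parse("log4j:log4j:1.2.17");
        System.out.println("group    = " + log4j.groupId());
        System.out.println("artifact = " + log4j.artifactId());
        System.out.println("version  = " + log4j.version());
    }
}
```

Two artifacts are the same dependency only if all three parts match, which is why a record (with its generated `equals`) is a reasonable fit here.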
The dependencies I was talking about are either internal (for a web application, it can be your service/domain layer) or external (log4j, JDBC driver, Java EE framework, you name it...). All those dependencies (also called artifacts) are in fact, at their lowest level, binary files (JAR/WAR/EAR) that your CVS/SVN/Git won't be able to store efficiently. Indeed, SCMs work on the assumption that versioned content (the kind for which diff operations are the most efficient) is text only. As a consequence, when binary data is stored, there is rarely any storage optimization (contrary to text, where only the differences between versions are stored).
As a consequence, what I would recommend is to use a dependency management build system, like Maven, Ivy, or Gradle. Using such a tool, you declare all your dependencies (in fact, you declare your dependencies' artifact coordinates) in a text (or maybe XML) file, which will be in your SCM. BUT your dependencies won't be in SCM. Rather, each developer will download them onto their own dev machine.
This transfers some network load from the SCM server to the internet (whose bandwidth is often more limited than that of the internal enterprise network), and raises the question of the long-term availability of artifacts. Both of these problems are solved (at least in the Maven world, but I believe both Ivy and Gradle are able to connect to such tools, and it seems some questions have been asked on this very subject) by using enterprise repository proxies, like Nexus, Artifactory and others.
The beauty of these tools is that they make a view of all required artifacts available on the internal network, going as far as allowing you to deploy your own artifacts to these repositories, which makes sharing your code both easy and independent from the source (which may be an advantage).
To sum up this long reply: use Ivy/Maven/Gradle instead of a simple Ant build. These tools will allow you to define your dependencies, and will do all the work of downloading these dependencies and ensuring you use the declared version.
On a personal note, the day I discovered those tools, my view of dependency handling in Java went from nightmare to heaven: I now only have to say that I use this exact version of this tool, and Maven (in my case) does all the background work of downloading it and storing it at the right location on my computer.
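To make "the right location on my computer" a little more concrete: with default settings, Maven keeps downloaded artifacts in a local repository under `~/.m2/repository`, laid out by group, artifact and version. The sketch below only rebuilds that path from a set of coordinates; the `LocalRepositoryLayout` class and `jarPath` method are hypothetical names, and the commons-lang3 coordinates are just sample input.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper that mirrors the default Maven local repository layout. */
final class LocalRepositoryLayout {

    /** Builds <root>/<group as directories>/<artifact>/<version>/<artifact>-<version>.jar. */
    static Path jarPath(Path repositoryRoot, String groupId, String artifactId, String version) {
        Path dir = repositoryRoot;
        // Each dot-separated segment of the group becomes one directory level.
        for (String segment : groupId.split("\\.")) {
            dir = dir.resolve(segment);
        }
        return dir.resolve(artifactId)
                  .resolve(version)
                  .resolve(artifactId + "-" + version + ".jar");
    }

    public static void main(String[] args) {
        // Default location of the local repository when no custom settings are in place.
        Path root = Paths.get(System.getProperty("user.home"), ".m2", "repository");
        System.out.println(jarPath(root, "org.apache.commons", "commons-lang3", "3.12.0"));
    }
}
```

Because the path is derived entirely from the coordinates, every project on the machine that declares the same version resolves to the same single file.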
Answer by Mark
Source control systems are designed for holding text source code. They can hold binary files, but that isn't really what they are designed for. In some cases it makes sense to put a binary file in source control, but Java dependencies are generally better managed in a different way.
The ideal setup is one that lets you manage your dependencies outside of source control and simply "point" to the desired dependency from within the source. This has several advantages:
- You can have a number of projects dependent on the same binaries without keeping a separate copy of each binary. It is common for a medium sized project to have hundreds of binaries it depends on. This can result in a great deal of duplication which wastes local and backup resources.
- Versions of binaries can be managed centrally within your local environment or within the corporate entity.
- In many situations, the source control server is not a local resource. Adding a bunch of binary files will slow things down because it increases the amount of data that needs to be sent across a slower connection.
- If you are creating a war, there may be some jars you need for development, but not deployment and vice versa. A good dependency management tool lets you handle these types of issues easily and efficiently.
- If you are depending on a binary file that comes from another one of your projects, it may change frequently. This means you could be constantly overwriting the binary with a new version. Since version control is going to keep every copy, it could quickly grow to an unmanageable size--particularly if you have any type of continuous integration or automated build scripts creating these binaries.
- A dependency management system offers a certain level of flexibility in how you depend on binaries. For example, on your local machine, you may want to depend on the latest version of a dependency as it sits on your file system. However, when you deploy your application you want the dependency packaged as a jar and included in your file.
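The development-versus-deployment point in the list above is what Maven expresses through dependency scopes. As a minimal sketch, assuming only four of those scopes and entirely made-up jar names, the filter below keeps what would be packaged into a WAR: `compile` and `runtime` dependencies are included, while `provided` and `test` ones are needed only during development. The `WarPackaging`, `Dependency` and `jarsToPackage` names are hypothetical.

```java
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical model of how a scope decides whether a jar is deployed. */
final class WarPackaging {

    /** A subset of Maven's dependency scopes. */
    enum Scope { COMPILE, PROVIDED, RUNTIME, TEST }

    /** A jar name plus the scope it was declared with (illustrative only). */
    record Dependency(String jar, Scope scope) {}

    /** Keeps only the jars that belong in the deployed archive. */
    static List<String> jarsToPackage(List<Dependency> dependencies) {
        return dependencies.stream()
                .filter(d -> d.scope() == Scope.COMPILE || d.scope() == Scope.RUNTIME)
                .map(Dependency::jar)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Dependency> declared = List.of(
                new Dependency("logging.jar", Scope.COMPILE),
                new Dependency("servlet-api.jar", Scope.PROVIDED), // supplied by the container
                new Dependency("jdbc-driver.jar", Scope.RUNTIME),
                new Dependency("unit-testing.jar", Scope.TEST));   // development only
        System.out.println(jarsToPackage(declared));
    }
}
```

With jars committed directly to source control, this distinction has to be maintained by hand in the build script instead.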
Maven's dependency management features solve these issues for you and can help you locate and retrieve binary dependencies as needed. Ivy is another tool that does this, but for Ant.
Answer by vdboor
They are binary files:
- It's better to reference the source, since that's what you're using source control for.
- The system can't tell you what the differences between the files are.
- They become a source of merge conflicts if they are compiled from the source in the same repository.
Some systems (e.g. SVN) don't deal very well with large binary files.
In other words, it is better to reference the source, and adjust your build scripts to make everything work.
Answer by Kevin Stembridge
The decision to commit jar files to SCM is usually influenced by the build tool being used. If using Maven in a conventional manner then you don't really have a choice. But if your build system allows you the choice, I think it is a good idea to commit your dependencies to SCM alongside the source code that depends on them.
This applies to third-party jars and to in-house jars that are on a separate release cycle from your project. For example, if you have an in-house jar file containing common utility classes, I would commit that to SCM under each project that uses it.
If using CVS, be aware that it does not handle binary files efficiently. An SVN repository makes no distinction between binary and text files.
http://svnbook.red-bean.com/en/1.5/svn.forcvs.binary-and-trans.html
Update in response to the answer posted by Mark:
Bullet Point 1: I would say it is not very common for even a large project to have hundreds of dependencies. In any case, disk usage (from keeping a separate copy of a dependency in each project that uses it) should not be your major concern. Disk space is cheap compared with the amount of time lost dealing with the complexities of a Maven repository. Besides, a local Maven repository will consume far more disk space than just the dependencies you actually use.
Bullet Point 3: Maven will not save you time waiting for network traffic. The opposite is true. With your dependencies in source control, you do a checkout, then you switch from one branch to another. You will very rarely need to check out the same jars again. If you do, it will take only minutes. The main reason Maven is a slow build tool is all the network access it does even when there is no need.
Bullet Point 4: Your point here is not an argument against storing jars in SCM. Also, Maven is only easy once you have learned it, and it is only efficient up to the point when something goes wrong. Then it becomes difficult and your efficiency gains can disappear quickly. In terms of efficiency, Maven has a small upside when things work correctly and a big downside when they don't.
Bullet Point 5: Version control systems like SVN do not keep a separate copy of every version of every file. They store them efficiently as deltas. It is very unlikely that your SVN repository will grow to an 'unmanageable' size.
Bullet Point 6: Your point here is not an argument against storing files in SCM. The use case you mention can be handled just as easily by a custom Ant build.