Why doesn't Git use more modern SHA?
Notice: this page reproduces a popular Stack Overflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep it under the same CC BY-SA license, cite the original URL and authors, and attribute it to the original authors (not me): Stack Overflow
Original URL: http://stackoverflow.com/questions/28159071/
Asked by qazwsx
I read that Git uses a SHA-1 digest as the ID for a revision. Why does it not use a more modern version of SHA?
Accepted answer by VonC
Why does it not use a more modern version of SHA?
Dec. 2017: It will. And Git 2.16 (Q1 2018) is the first release to illustrate and implement that intent.
Note: see Git 2.19 below: it will be SHA-256.
Git 2.16 will propose an infrastructure to define what hash function is used in Git, and will start an effort to plumb that throughout various codepaths.
See commit c250e02 (28 Nov 2017) by Ramsay Jones.
See commit eb0ccfd, commit 78a6766, commit f50e766, commit abade65 (12 Nov 2017) by brian m. carlson (bk2204).
(Merged by Junio C Hamano -- gitster -- in commit 721cc43, 13 Dec 2017)
Add structure representing hash algorithm
Since in the future we want to support an additional hash algorithm, add a structure that represents a hash algorithm and all the data that must go along with it.
Add a constant to allow easy enumeration of hash algorithms.
Implement function typedefs to create an abstract API that can be used by any hash algorithm, and wrappers for the existing SHA1 functions that conform to this API.
Expose a value for hex size as well as binary size.
While one will always be twice the other, the two values are both used extremely commonly throughout the codebase and providing both leads to improved readability.
Don't include an entry in the hash algorithm structure for the null object ID.
As this value is all zeros, any suitably sized all-zero object ID can be used, and there's no need to store a given one on a per-hash basis.
The current hash function transition plan envisions a time when we will accept input from the user that might be in SHA-1 or in the NewHash format.
Since we cannot know which the user has provided, add a constant representing the unknown algorithm to allow us to indicate that we must look the correct value up.
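The structure described in this commit message can be pictured with a small sketch. This is not Git's C code: the class name HashAlgo, the table HASH_ALGOS, the SHA-256 entry (Git only supported SHA-1 at that point) and the length-based detect_algo() lookup are all assumptions made for illustration.

```python
import hashlib
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class HashAlgo:
    """Hypothetical descriptor for one hash algorithm."""
    name: str
    rawsz: int  # size of the binary digest in bytes
    hexsz: int  # size of the hex form, always twice rawsz
    new: Optional[Callable[[], object]]  # factory returning a fresh hasher


# Index 0 is reserved for "unknown", so callers can signal that the
# algorithm still has to be looked up.
HASH_UNKNOWN = 0
HASH_SHA1 = 1
HASH_SHA256 = 2

HASH_ALGOS = (
    HashAlgo("unknown", 0, 0, None),
    HashAlgo("sha1", 20, 40, hashlib.sha1),
    HashAlgo("sha256", 32, 64, hashlib.sha256),  # assumed future entry
)


def null_oid(algo: HashAlgo) -> bytes:
    """No per-algorithm null ID is stored; any all-zero value of the right size works."""
    return bytes(algo.rawsz)


def detect_algo(hex_oid: str) -> int:
    """Assumed lookup: pick the algorithm whose hex size matches the input."""
    for index, algo in enumerate(HASH_ALGOS):
        if index != HASH_UNKNOWN and len(hex_oid) == algo.hexsz:
            return index
    return HASH_UNKNOWN
```

Keeping both rawsz and hexsz in the descriptor mirrors the readability argument made in the commit message, even though one value is derivable from the other.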
Integrate hash algorithm support with repo setup
In future versions of Git, we plan to support an additional hash algorithm.
Integrate the enumeration of hash algorithms with repository setup, and store a pointer to the enumerated data in struct repository.
Of course, we currently only support SHA-1, so hard-code this value in read_repository_format.
In the future, we'll enumerate this value from the configuration.
Add a constant, the_hash_algo, which points to the hash_algo structure pointer in the repository global.
Note that this is the hash which is used to serialize data to disk, not the hash which is used to display items to the user.
The transition plan anticipates that these may be different.
We can add an additional element in the future (say, ui_hash_algo) to provide for this case.
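A rough sketch of that wiring, assuming a Python stand-in for struct repository: the Repository and HashAlgo classes and the function form of the_hash_algo are illustrative only, and read_repository_format() here merely imitates the hard-coding described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HashAlgo:
    """Hypothetical descriptor; see the commit message for the real fields."""
    name: str
    rawsz: int
    hexsz: int


SHA1 = HashAlgo("sha1", 20, 40)


@dataclass
class Repository:
    gitdir: str
    hash_algo: HashAlgo  # hash used to serialize objects to disk


def read_repository_format(gitdir: str) -> Repository:
    # Only SHA-1 is supported at this stage, so the value is hard-coded.
    # A later version would read it from the repository configuration.
    return Repository(gitdir=gitdir, hash_algo=SHA1)


# Stand-in for the global repository instance.
the_repository = read_repository_format(".git")


def the_hash_algo() -> HashAlgo:
    """Shorthand for the on-disk hash of the global repository."""
    return the_repository.hash_algo
```

A separate ui_hash_algo field could be added to Repository later if the displayed hash ever differs from the stored one, which is the case the commit message anticipates.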
Update August 2018: with Git 2.19 (Q3 2018), Git seems to pick SHA-256 as NewHash.
See commit 0ed8d8d (04 Aug 2018) by Jonathan Nieder (artagnon).
See commit 13f5e09 (25 Jul 2018) by Ævar Arnfjörð Bjarmason (avar).
(Merged by Junio C Hamano -- gitster -- in commit 34f2297, 20 Aug 2018)
doc hash-function-transition: pick SHA-256 as NewHash
From a security perspective, it seems that SHA-256, BLAKE2, SHA3-256, K12, and so on are all believed to have similar security properties.
All are good options from a security point of view.
SHA-256 has a number of advantages:
- It has been around for a while, is widely used, and is supported by just about every single crypto library (OpenSSL, mbedTLS, CryptoNG, SecureTransport, etc).
- When you compare against SHA1DC, most vectorized SHA-256 implementations are indeed faster, even without acceleration.
- If we're doing signatures with OpenPGP (or even, I suppose, CMS), we're going to be using SHA-2, so it doesn't make sense to have our security depend on two separate algorithms when either one of them alone could break the security when we could just depend on one.
So SHA-256 it is.
Update the hash-function-transition design doc to say so.
After this patch, there are no remaining instances of the string "NewHash", except for an unrelated use from 2008 as a variable name in t/t9700/test.pl.
You can see this transition to SHA-256 in progress with Git 2.20 (Q4 2018):
See commit 0d7c419, commit dda6346, commit eccb5a5, commit 93eb00f, commit d8a3a69, commit fbd0e37, commit f690b6b, commit 49d1660, commit 268babd, commit fa13080, commit 7b5e614, commit 58ce21b, commit 2f0c9e9, commit 825544a (15 Oct 2018) by brian m. carlson (bk2204).
See commit 6afedba (15 Oct 2018) by SZEDER Gábor (szeder).
(Merged by Junio C Hamano -- gitster -- in commit d829d49, 30 Oct 2018)
replace hard-coded constants
Replace several 40-based constants with references to GIT_MAX_HEXSZ or the_hash_algo, as appropriate.
Convert all uses of the GIT_SHA1_HEXSZ to use the_hash_algo so that they are appropriate for any given hash length.
Instead of using a hard-coded constant for the size of a hex object ID, switch to use the computed pointer from parse_oid_hex that points after the parsed object ID.
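The last point can be illustrated with a sketch. The Python function below only borrows the name parse_oid_hex; its signature, the explicit hexsz parameter and the returned (oid, rest) pair are assumptions standing in for the C function's out-pointer.

```python
import string

HEX_DIGITS = set(string.hexdigits)


def parse_oid_hex(text: str, hexsz: int):
    """Parse hexsz hex digits from the start of text.

    Returns (oid_bytes, rest), where rest is everything after the object ID,
    playing the role of the "pointer after the parsed object ID".
    """
    candidate = text[:hexsz]
    if len(candidate) != hexsz or any(c not in HEX_DIGITS for c in candidate):
        raise ValueError("not a valid object ID")
    return bytes.fromhex(candidate), text[hexsz:]


line = "e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 refs/heads/main"

# Hard-coded style: silently wrong as soon as the ID is not 40 characters.
ref_fixed = line[41:]

# Hash-agnostic style: continue from wherever the parser stopped.
oid, rest = parse_oid_hex(line, 40)
ref_parsed = rest.lstrip(" ")
```

With a 64-character ID, only the second style keeps working, because the offset comes from the parser instead of a literal 40 or 41.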
GIT_SHA1_HEXSZ is further removed/replaced with Git 2.22 (Q2 2019) and commit d4e568b.
That transition continues with Git 2.21 (Q1 2019), which adds a SHA-256 hash and plugs it through the code to allow building Git with the "NewHash".
See commit 4b4e291, commit 27dc04c, commit 13eeedb, commit c166599, commit 37649b7, commit a2ce0a7, commit 50c817e, commit 9a3a0ff, commit 0dab712, commit 47edb64 (14 Nov 2018), and commit 2f90b9d, commit 1ccf07c (22 Oct 2018) by brian m. carlson (bk2204).
(Merged by Junio C Hamano -- gitster -- in commit 33e4ae9, 29 Jan 2019)
Add a base implementation of SHA-256 support (Feb. 2019)
SHA-1 is weak and we need to transition to a new hash function.
For some time, we have referred to this new function as NewHash.
Recently, we decided to pick SHA-256 as NewHash.
The reasons behind the choice of SHA-256 are outlined in this thread and in the commit history for the hash function transition document.
Add a basic implementation of SHA-256 based off libtomcrypt, which is in the public domain.
Optimize it and restructure it to meet our coding standards.
Pull in the update and final functions from the SHA-1 block implementation, as we know these function correctly with all compilers. This implementation is slower than SHA-1, but more performant implementations will be introduced in future commits.
Wire up SHA-256 in the list of hash algorithms, and add a test that the algorithm works correctly.
Note that with this patch, it is still not possible to switch to using SHA-256 in Git.
Additional patches are needed to prepare the code to handle a larger hash algorithm and further test fixes are needed.
hash: add an SHA-256 implementation using OpenSSL
We already have OpenSSL routines available for SHA-1, so add routines for SHA-256 as well.
On a Core i7-6600U, this SHA-256 implementation compares favorably to the SHA1DC SHA-1 implementation:
SHA-1: 157 MiB/s (64 byte chunks); 337 MiB/s (16 KiB chunks)
SHA-256: 165 MiB/s (64 byte chunks); 408 MiB/s (16 KiB chunks)
sha256: add an SHA-256 implementation using libgcrypt
Generally, one gets better performance out of cryptographic routines written in assembly than C, and this is also true for SHA-256.
In addition, most Linux distributions cannot distribute Git linked against OpenSSL for licensing reasons.
Most systems with GnuPG will also have libgcrypt, since it is a dependency of GnuPG.
libgcrypt is also faster than the SHA1DC implementation for messages of a few KiB and larger.
For comparison, on a Core i7-6600U, this implementation processes 16 KiB chunks at 355 MiB/s while SHA1DC processes equivalent chunks at 337 MiB/s.
In addition, libgcrypt is licensed under the LGPL 2.1, which is compatible with the GPL. Add an implementation of SHA-256 that uses libgcrypt.
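For a rough feel of how such throughput figures are obtained, here is a minimal timing sketch. It is an assumption-laden illustration: it measures whatever SHA-1 and SHA-256 implementations Python's hashlib happens to use, not SHA1DC or Git's own code, and the function name throughput_mib_per_s is made up for this example. Results will not match the numbers quoted above.

```python
import hashlib
import time


def throughput_mib_per_s(algo_name: str, chunk_size: int,
                         total_bytes: int = 4 * 1024 * 1024) -> float:
    """Feed total_bytes of zero bytes to the hash in chunk_size pieces; return MiB/s."""
    chunk = bytes(chunk_size)
    rounds = max(1, total_bytes // chunk_size)
    hasher = hashlib.new(algo_name)
    start = time.perf_counter()
    for _ in range(rounds):
        hasher.update(chunk)
    hasher.digest()
    # Guard against a zero interval on very coarse clocks.
    elapsed = max(time.perf_counter() - start, 1e-9)
    return rounds * chunk_size / (1024 * 1024) / elapsed


if __name__ == "__main__":
    for algo_name in ("sha1", "sha256"):
        for chunk_size in (64, 16 * 1024):
            rate = throughput_mib_per_s(algo_name, chunk_size)
            print(f"{algo_name}: {rate:.0f} MiB/s ({chunk_size} byte chunks)")
```

Read as ratios, the 16 KiB figures quoted in the commit messages put SHA-256 at about 1.21 times SHA1DC with OpenSSL (408 / 337) and about 1.05 times with libgcrypt (355 / 337).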
The upgrade effort goes on with Git 2.24 (Q4 2019).
See commit aaa95df, commit be8e172, commit 3f34d70, commit fc06be3, commit 69fa337, commit 3a4d7aa, commit e0cb7cd, commit 8d4d86b, commit f6ca67d, commit dd336a5, commit 894c0f6, commit 4439c7a, commit 95518fa, commit e84f357, commit fe9fec4, commit 976ff7e, commit 703d2d4, commit 9d958cc, commit 7962e04, commit fee4930 (18 Aug 2019) by brian m. carlson (bk2204).
(Merged by Junio C Hamano -- gitster -- in commit 676278f, 11 Oct 2019)
Instead of using GIT_SHA1_HEXSZ and hard-coded constants, switch to using the_hash_algo.
With Git 2.26 (Q1 2020), the test scripts are ready for the day when the object names will use SHA-256.
See commit 277eb5a, commit 44b6c05, commit 7a868c5, commit 1b8f39f, commit a8c17e3, commit 8320722, commit 74ad99b, commit ba1be1a, commit cba472d, commit 82d5aeb, commit 3c5e65c, commit 235d3cd, commit 1d86c8f, commit 525a7f1, commit 7a1bcb2, commit cb78f4f, commit 717c939, commit 08a9dd8, commit 215b60b, commit 194264c (21 Dec 2019) by brian m. carlson (bk2204).
(Merged by Junio C Hamano -- gitster -- in commit f52ab33, 05 Feb 2020)
Example:
t4204: make hash size independent
Signed-off-by: brian m. carlson
Use $OID_REGEX instead of a hard-coded regular expression.
So, instead of using:
grep "^[a-f0-9]\{40\} $(git rev-parse HEAD)$" output
Tests are using:
grep "^$OID_REGEX $(git rev-parse HEAD)$" output
And OID_REGEX comes from commit bdee9cd (13 May 2018) by brian m. carlson (bk2204).
(Merged by Junio C Hamano -- gitster -- in commit 9472b13, 30 May 2018, Git v2.18.0-rc0)
t/test-lib: introduce OID_REGEX
Signed-off-by: brian m. carlson
Currently we have a variable, $_x40, which contains a regex that matches a full 40-character hex constant.
However, with NewHash, we'll have object IDs that are longer than 40 characters.
In such a case, $_x40 will be a confusing name.
Create a $OID_REGEX variable which will always reflect a regex matching the appropriate object ID, regardless of the length of the current hash.
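The idea behind $OID_REGEX can be sketched outside the shell test suite as well. The helper names oid_regex() and matches_log_line() below are hypothetical, and the way the real test library derives the pattern may differ; the sketch only shows a regex whose length follows the hash in use.

```python
import re


def oid_regex(hexsz: int) -> str:
    """Regex matching a full hex object ID of the given length."""
    return "[0-9a-f]{%d}" % hexsz


def matches_log_line(line: str, head: str, hexsz: int) -> bool:
    """Mirror of the grep above: '<some oid> <HEAD oid>' on a single line."""
    pattern = "^" + oid_regex(hexsz) + " " + re.escape(head) + "$"
    return re.search(pattern, line) is not None


sha1_head = "e69de29bb2d1d6434b8b29ae775ad8c2e48c5391"
sha256_head = "a" * 64

# The same check works for both lengths once the size is a parameter.
sha1_ok = matches_log_line(f"{'0' * 40} {sha1_head}", sha1_head, 40)
sha256_ok = matches_log_line(f"{'0' * 64} {sha256_head}", sha256_head, 64)

# A pattern fixed at 40 characters rejects the longer IDs.
sha256_with_x40 = matches_log_line(f"{'0' * 64} {sha256_head}", sha256_head, 40)
```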
And, still for tests:
See commit f303765, commit edf0424, commit 5db24dc, commit d341e08, commit 88ed241, commit 48c10cc, commit f7ae8e6, commit e70649b, commit a30f93b, commit a79eec2, commit 796d138, commit 417e45e, commit dfa5f53, commit f743e8f, commit 72f936b, commit 5df0f11, commit 07877f3, commit 6025e89, commit 7b1a182, commit 94db7e3, commit db12505 (07 Feb 2020) by brian m. carlson (bk2204).
(Merged by Junio C Hamano -- gitster -- in commit 5af345a, 17 Feb 2020)
t5703: make test work with SHA-256
Signed-off-by: brian m. carlson
This test used an object ID which was 40 hex characters in length, causing the test not only not to pass, but to hang, when run with SHA-256 as the hash.
Change this value to a fixed dummy object ID using test_oid_init and test_oid.
Furthermore, ensure we extract an object ID of the appropriate length using cut with fields instead of a fixed length.
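The difference between cutting by fixed length and cutting by field is easy to show in isolation. The two helper functions below are hypothetical and stand in for the shell cut invocations the commit message refers to.

```python
def oid_by_fixed_width(line: str) -> str:
    """Fragile: assumes every object ID is exactly 40 hex characters."""
    return line[:40]


def oid_by_field(line: str) -> str:
    """Robust: take the first space-separated field, whatever its length."""
    return line.split(" ", 1)[0]


sha1_line = "e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 refs/heads/main"
sha256_line = "b" * 64 + " refs/heads/main"

# Both approaches agree for SHA-1 sized IDs.
same_for_sha1 = oid_by_fixed_width(sha1_line) == oid_by_field(sha1_line)

# With a 64-character ID the fixed-width cut returns a truncated value.
truncated = oid_by_fixed_width(sha256_line)
full = oid_by_field(sha256_line)
```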
Some codepaths were given a repository instance as a parameter to work in the repository, but passed the the_repository instance to their callees; this has been cleaned up (somewhat) with Git 2.26 (Q1 2020).
See commit b98d188, commit 2dcde20, commit 7ad5c44, commit c8123e7, commit 5ec9b8a, commit a651946, commit eb999b3 (30 Jan 2020) by Matheus Tavares (matheustavares).
(Merged by Junio C Hamano -- gitster -- in commit 78e67cd, 14 Feb 2020)
sha1-file: allow check_object_signature() to handle any repo
Signed-off-by: Matheus Tavares
Some callers of check_object_signature() can work on arbitrary repositories, but the repo does not get passed to this function. Instead, the_repository is always used internally.
To fix possible inconsistencies, allow the function to receive a struct repository and make those callers pass on the repo being handled.
Based on:
sha1-file: pass git_hash_algo to hash_object_file()
Signed-off-by: Matheus Tavares
Allow hash_object_file() to work on arbitrary repos by introducing a git_hash_algo parameter. Change callers which have a struct repository pointer in their scope to pass on the git_hash_algo from the said repo.
For all other callers, pass on the_hash_algo, which was already being used internally at hash_object_file().
This functionality will be used in the following patch to make check_object_signature() be able to work on arbitrary repos (which, in turn, will be used to fix an inconsistency at object.c:parse_object()).
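A minimal sketch of the same refactoring idea, assuming Python stand-ins for the C functions: the function names are borrowed from the commit messages, but the signatures, the string algorithm name and the boolean return value are illustrative choices. The only Git fact relied on is that an object ID is the hash of a "<type> <size>" header, a NUL byte, and the content.

```python
import hashlib


def hash_object_file(algo_name: str, data: bytes, obj_type: str = "blob") -> str:
    """Hash an object with the algorithm passed in by the caller, not a global."""
    header = f"{obj_type} {len(data)}".encode() + b"\0"
    hasher = hashlib.new(algo_name)
    hasher.update(header)
    hasher.update(data)
    return hasher.hexdigest()


def check_object_signature(repo_algo: str, expected_oid: str,
                           data: bytes, obj_type: str = "blob") -> bool:
    """Verify an object against the hash of the repository being handled."""
    return hash_object_file(repo_algo, data, obj_type) == expected_oid


# Two repositories with different on-disk hashes can be handled side by side,
# because nothing here reads a process-wide default.
sha1_oid = hash_object_file("sha1", b"hello\n")
sha256_oid = hash_object_file("sha256", b"hello\n")
```

Passing the algorithm explicitly is what lets a caller that is working on a second repository avoid silently picking up the hash of the global one.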
Answer by softwariness
UPDATE: The above question and this answer are from 2015. Since then Google have announced the first SHA-1 collision: https://security.googleblog.com/2017/02/announcing-first-sha1-collision.html
Obviously I can only speculate from the outside looking in about why Git continues to use SHA-1, but these may be among the reasons:
- Git was Linus Torvalds's creation, and Linus apparently does not want to substitute SHA-1 with another hashing algorithm at this time.
- He makes plausible claims that successful SHA-1 collision-based attacks against Git are a good deal harder than achieving the collisions themselves. Considering that SHA-1 is weaker than it should be, not completely broken, that puts a workable attack substantially out of reach, at least today. Moreover, he notes that a "successful" attack would achieve very little if the colliding object arrives later than the existing one, as the later one would just be assumed to be the same as the valid one and ignored (though others have pointed out that the reverse could occur).
- Changing software is time-consuming and error-prone, especially when there is existing infrastructure and data based around the existing protocols that will have to be migrated. Even those who produce software and hardware products where cryptographic security is the sole point of the system are still in the process of migrating away from SHA-1 and other weak algorithms in places. Just imagine all those hardcoded unsigned char[20] buffers all over the place ;-), it's a lot easier to program for cryptographic agility at the start, rather than retrofitting it later.
- Performance of SHA-1 is better than the various SHA-2 hashes (probably not by so much as to be a deal-breaker now, but maybe was a sticking point 10 years ago), and the storage size of SHA-2 is larger.
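The "later colliding object is ignored" argument from the second point can be sketched with a toy content-addressed store. Everything here is an assumption for illustration: ObjectStore is not Git's object database, and toy_hash() is deliberately truncated to one byte so that a collision can be found instantly.

```python
import hashlib


def toy_hash(data: bytes) -> str:
    """Deliberately weak hash: the first byte of SHA-1, so collisions are trivial."""
    return hashlib.sha1(data).hexdigest()[:2]


class ObjectStore:
    """Toy content-addressed store where the first object written under an ID wins."""

    def __init__(self, hash_func):
        self._hash = hash_func
        self._objects = {}

    def write(self, data: bytes) -> str:
        oid = self._hash(data)
        # An existing ID is assumed to refer to identical content, so the
        # new data is silently dropped.
        if oid not in self._objects:
            self._objects[oid] = data
        return oid

    def read(self, oid: str) -> bytes:
        return self._objects[oid]


def find_collision(hash_func):
    """Brute-force two different inputs with the same toy hash."""
    seen = {}
    counter = 0
    while True:
        data = b"object-%d" % counter
        oid = hash_func(data)
        if oid in seen:
            return seen[oid], data
        seen[oid] = data
        counter += 1


genuine, colliding = find_collision(toy_hash)
store = ObjectStore(toy_hash)
oid = store.write(genuine)
store.write(colliding)          # arrives later, same ID
survivor = store.read(oid)      # still the genuine content
```

The sketch also shows the caveat raised in the same point: if the colliding object were written first, it would be the one that survives.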
Some links:
- Stackoverflow question on what would happen if a collision did occur in Git
- Newsgroup post showing a brief comment from Linus on the subject a couple of months after the main SHA-1 weakness became known in 2005
- A thread discussing the weakness and possible move to sha-256 (with replies from Linus) in 2006
- NIST statement on SHA-1 deprecation and recommending "to transition rapidly to the stronger SHA-2 family of hash functions"
My personal view is that practical attacks are probably some time off, and even when they do occur people will probably first mitigate them by means other than changing the hash algorithm itself. Still, if you do care about security, you should err on the side of caution in your choice of algorithms and keep revising your security strengths upwards, because the capabilities of attackers only go in one direction. It would therefore be unwise to take Git as a role model, especially as its use of SHA-1 does not purport to be for cryptographic security.
Answer by Arne Babenhauserheide
This is a discussion of the urgency of migrating away from SHA1 for Mercurial, but it applies to Git as well: https://www.mercurial-scm.org/wiki/mpm/SHA1
In short: If you're not extremely diligent today, you have much worse vulnerabilities than sha1. But despite that, Mercurial started over 10 years ago to prepare for migrating away from sha1.
Work has been underway for years to retrofit Mercurial's data structures and protocols for SHA1's successors. Storage space was allocated for larger hashes in our revlog structure over 10 years ago in Mercurial 0.9 with the introduction of RevlogNG. The bundle2 format introduced more recently supports the exchange of different hash types over the network. The only remaining pieces are choice of a replacement function and choosing a backwards-compatibility strategy.
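Reserving room for a larger hash in a fixed-size record can be sketched as follows. The 32-byte field width and the helper names pack_node() and unpack_node() are assumptions for illustration; the quoted text only says that space was allocated, not how the record is laid out.

```python
import hashlib
import struct

NODE_FIELD_SIZE = 32  # assumed width of the hash field in each index record


def pack_node(digest: bytes) -> bytes:
    """Store a digest in the fixed-width field, padding short ones with NUL bytes."""
    if len(digest) > NODE_FIELD_SIZE:
        raise ValueError("digest does not fit in the reserved field")
    return struct.pack("%ds" % NODE_FIELD_SIZE, digest)


def unpack_node(field: bytes, rawsz: int) -> bytes:
    """Read back the first rawsz bytes, ignoring any padding."""
    return field[:rawsz]


sha1_digest = hashlib.sha1(b"revision data").digest()      # 20 bytes
sha256_digest = hashlib.sha256(b"revision data").digest()  # 32 bytes

sha1_field = pack_node(sha1_digest)      # 20 bytes of hash plus 12 bytes of padding
sha256_field = pack_node(sha256_digest)  # fills the field exactly
```

Because both digests occupy the same 32-byte slot, a record format like this can switch hashes without changing its on-disk record size.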
If git does not migrate away from sha1 before Mercurial does, you could always add another level of security by keeping a local Mercurial mirror with hg-git.
Answer by Paul Wagland
There is now a transition plan to a stronger hash, so it looks like in future it will use a more modern hash than SHA-1. From the current transition plan:
Some hashes under consideration are SHA-256, SHA-512/256, SHA-256x16, K12, and BLAKE2bp-256