Attribution: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original question: http://stackoverflow.com/questions/800463/

Time: 2020-08-05 01:41:45  Source: igfitidea

C# Create a hash for a byte array or image

c# .net image hash

Asked by johnc

Possible Duplicate:
How do I generate a hashcode from a byte array in c#

In C#, I need to create a hash of an image to ensure it is unique in storage.

I can easily convert it to a byte array, but I am unsure how to proceed from there.

Are there any classes in the .NET Framework that can assist me, or is anyone aware of some efficient algorithms to create such a unique hash?

Accepted answer by Rex M

There are plenty of hashsum providers in .NET that create cryptographic hashes, which satisfies your condition that they be unique (for most purposes, collision-proof). They are all extremely fast, and the hashing definitely won't be the bottleneck in your app unless you're doing it a trillion times over.

Personally I like SHA1:

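using System;
using System.Security.Cryptography;

// byteArray holds the raw bytes of the image
string hash;
using (SHA1CryptoServiceProvider sha1 = new SHA1CryptoServiceProvider())
{
    // Base64 turns the 20-byte SHA-1 digest into a storable string
    hash = Convert.ToBase64String(sha1.ComputeHash(byteArray));
}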

Even when people say one method might be slower than another, it's all in relative terms. A program dealing with images definitely won't notice the microsecond process of generating a hashsum.

And regarding collisions, for most purposes this is also irrelevant. Even "obsolete" methods like MD5 are still highly useful in most situations. I only recommend against using it when the security of your system relies on preventing collisions.

Answer by Adam Robinson

You can use any of the standard hashing algorithms, but hashing can't technically guarantee uniqueness. A hash is designed to be a relatively small token, computed relatively quickly, that lets you see whether one piece of data is likely the same as another. It's entirely possible for completely different sets of data to produce the same hash, though deliberately constructing such data is very hard.
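
To make the "likely the same" point concrete, here is a minimal sketch, not a definitive implementation. It assumes you keep a digest next to each stored image and only fall back to a byte-for-byte comparison when the digests match. The class and method names (ImageIdentity, ComputeDigest, IsDuplicate) are made up for illustration, and SHA-256 is just one possible choice of algorithm.

```csharp
using System;
using System.Linq;
using System.Security.Cryptography;

public static class ImageIdentity
{
    // Hypothetical helper: returns the 32-byte SHA-256 digest of the data.
    public static byte[] ComputeDigest(byte[] data)
    {
        using (SHA256 sha = SHA256.Create())
        {
            return sha.ComputeHash(data);
        }
    }

    // Hypothetical helper: the digest comparison is the cheap first pass.
    // Equal digests only mean "probably the same", so the final answer
    // comes from comparing the actual bytes.
    public static bool IsDuplicate(byte[] candidate, byte[] storedDigest, byte[] storedBytes)
    {
        if (!ComputeDigest(candidate).SequenceEqual(storedDigest))
        {
            return false; // different digests: definitely different data
        }
        return candidate.SequenceEqual(storedBytes);
    }
}
```

In practice the stored bytes would only be loaded when the digests match, which is what keeps the check cheap.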

All of that aside, for checking likely identity, MD5 is fairly fast. SHA is more reliable (MD5 has been broken, so it shouldn't be used for security), but it's also slower.

Answer by zvolkov

Creating a new instance of SHA1CryptoServiceProvider every time you need to compute a hash is NOT fast at all. Using the same instance is pretty fast.
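
A rough sketch of what reusing one instance could look like. The ReusableHasher wrapper is a hypothetical name, and it uses SHA1.Create() rather than SHA1CryptoServiceProvider, which is an assumption on my part (the factory method is the form newer .NET versions steer you toward). Hash algorithm instances are not guaranteed to be thread-safe, so this assumes one hasher per thread.

```csharp
using System;
using System.Security.Cryptography;

// Hypothetical wrapper that keeps a single SHA-1 instance alive
// so it can be reused for many hashes.
public sealed class ReusableHasher : IDisposable
{
    private readonly SHA1 _sha1 = SHA1.Create();

    // ComputeHash leaves the instance ready for the next call,
    // so the same object can hash any number of arrays in a row.
    // Not thread-safe: use one ReusableHasher per thread.
    public string HashToBase64(byte[] data)
    {
        return Convert.ToBase64String(_sha1.ComputeHash(data));
    }

    public void Dispose()
    {
        _sha1.Dispose();
    }
}
```

Typical use would be to create one ReusableHasher in a using block around the loop that processes the images, instead of creating a provider inside the loop.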

Still, I'd rather use one of the many CRC algorithms instead of a cryptographic hash, because hash functions designed for cryptography don't work too well at very small hash sizes (32 bits), which is what you want for your GetHashCode() override (assuming that's what you want).

Check this link out for one example of computing CRC in C#: http://sanity-free.org/134/standard_crc_16_in_csharp.html
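
For reference, here is a minimal table-driven sketch of one CRC variant. Note that this is the common CRC-32 (reflected polynomial 0xEDB88320, as used by zip and PNG), not the CRC-16 from the link above, and the Crc32 class name is only illustrative.

```csharp
using System;

public static class Crc32
{
    // Lookup table with the CRC of every possible byte value, built once.
    private static readonly uint[] Table = BuildTable();

    private static uint[] BuildTable()
    {
        uint[] table = new uint[256];
        for (uint i = 0; i < 256; i++)
        {
            uint value = i;
            for (int bit = 0; bit < 8; bit++)
            {
                // Shift right; XOR in the polynomial when the low bit was set.
                value = (value & 1) != 0 ? (value >> 1) ^ 0xEDB88320u : value >> 1;
            }
            table[i] = value;
        }
        return table;
    }

    // Returns the 32-bit CRC of the whole array.
    public static uint Compute(byte[] data)
    {
        uint crc = 0xFFFFFFFFu;
        foreach (byte b in data)
        {
            crc = (crc >> 8) ^ Table[(crc ^ b) & 0xFF];
        }
        return ~crc;
    }
}
```

If the goal really is a GetHashCode() override, the result can be narrowed with unchecked((int)Crc32.Compute(bytes)). A 32-bit value will collide far more often than a cryptographic digest, so Equals still has to compare the actual data.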

P.S. The reason you want your hash to be small (16 or 32 bits) is so you can compare them FAST (that was the whole point of having hashes, remember?). Having a hash represented by a 256-bit value encoded as a string is pretty insane in terms of performance.

Answer by Jonathan Rupp

The part of Rex M's answer about using SHA1 to generate a hash is a good one (MD5 is also a popular option). zvolkov's suggestion about not constantly creating new crypto providers is also a good one, as is the suggestion about using CRC if speed is more important than virtually guaranteed uniqueness.

However, do not use Encoding.UTF8.GetString() to convert a byte[] into a string (unless, of course, you know from context that it is valid UTF-8). For one, it will reject invalid surrogates. A method guaranteed to always give you a valid string from a byte[] is Convert.ToBase64String().
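
A small sketch of the difference, with hypothetical names (HashStringDemo and its two methods). It assumes the default Encoding.UTF8 behaviour, which substitutes the replacement character U+FFFD for bytes that are not valid UTF-8 instead of throwing, so the original bytes cannot be recovered from the string.

```csharp
using System;
using System.Linq;
using System.Text;

public static class HashStringDemo
{
    // Base64 is reversible for any byte sequence.
    public static bool RoundTripsViaBase64(byte[] data)
    {
        string text = Convert.ToBase64String(data);
        return Convert.FromBase64String(text).SequenceEqual(data);
    }

    // UTF-8 decoding replaces invalid bytes with U+FFFD,
    // so arbitrary hash bytes may not come back unchanged.
    public static bool RoundTripsViaUtf8(byte[] data)
    {
        string text = Encoding.UTF8.GetString(data);
        return Encoding.UTF8.GetBytes(text).SequenceEqual(data);
    }
}
```

Hash output is effectively random bytes, so invalid sequences such as 0xFF are to be expected. Two different hashes can then decode to the same string, which defeats the purpose of the hash.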
