Reversible hash function in Python?

Question

Asked by Stavros Korokithakis

I need a reversible hash function (obviously the input will be much smaller in size than the output) that maps the input to the output in a random-looking way. Basically, I want a way to transform a number like "123" to a larger number like "9874362483910978", but not in a way that will preserve comparisons, so it must not be always true that, if x1 > x2, f(x1) > f(x2) (but neither must it be always false).

The use case for this is that I need to find a way to transform small numbers into larger, random-looking ones. They don't actually need to be random (in fact, they need to be deterministic, so the same input always maps to the same output), but they do need to look random (at least when base64-encoded into strings, so shifting by Z bits won't work, as similar numbers will have similar MSBs).

Also, easy (fast) calculation and reversal is a plus, but not required.

I don't know if I'm being clear, or if such an algorithm exists, but I'd appreciate any and all help!

Answer 1

Accepted answer by Mike 'Pomax' Kamermans

None of the answers provided seemed particularly useful, given the question. I had the same problem, needing a simple, reversible hash for non-security purposes, and decided to go with bit relocation. It's simple, it's fast, and it doesn't require knowing anything about boolean maths or crypto algorithms or anything else that requires actual thinking.

The simplest would probably be to just move half the bits left, and the other half right:

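def hash(n):
  # Swap the low and high 16-bit halves of a 32-bit value.
  # Note: this name shadows Python's built-in hash().
  return ((0x0000FFFF & n) << 16) + ((0xFFFF0000 & n) >> 16)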

This is reversible, in that hash(hash(n)) = n, and has non-sequential pairs {n,m}, n < m, where hash(m) < hash(n).
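
A quick way to check both claims is the sketch below. It repeats the function under the name swap_halves (my own rename, to avoid shadowing the built-in hash) and only tries a handful of sample values, so treat it as a sanity check rather than a proof.

```python
def swap_halves(n):
    """Swap the low and high 16-bit halves of a 32-bit integer."""
    return ((0x0000FFFF & n) << 16) + ((0xFFFF0000 & n) >> 16)

# Applying the swap twice restores the input (for values below 2**32).
for n in (0, 1, 123, 65535, 65536, 0xDEADBEEF):
    assert swap_halves(swap_halves(n)) == n

# Order is not always preserved: 1 < 65536, yet the outputs are reversed.
print(swap_halves(1))      # 65536
print(swap_halves(65536))  # 1

# Order is not always reversed either: 1 < 2 and the outputs keep that order.
print(swap_halves(1) < swap_halves(2))  # True
```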

To get a less sequential looking implementation, you might also want to consider an interlace reordering from [msb,z,...,a,lsb] to [msb,lsb,z,a,...] or [lsb,msb,a,z,...] or any other relocation you feel gives an appropriately non-sequential sequence for the numbers you deal with.
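
As a rough sketch of the first reordering, [msb,z,...,a,lsb] to [msb,lsb,z,a,...], the code below alternates bits taken from the high end and the low end of a 32-bit word. The names interlace, deinterlace and WIDTH are made up for this example, and the 32-bit width is an assumption carried over from the function above.

```python
WIDTH = 32  # assumed word size, matching the 32-bit example above

def _to_bits(n):
    """Return the bits of n as a list, most significant bit first."""
    return [(n >> (WIDTH - 1 - i)) & 1 for i in range(WIDTH)]

def _from_bits(bits):
    """Rebuild an integer from a list of bits, most significant bit first."""
    result = 0
    for b in bits:
        result = (result << 1) | b
    return result

def interlace(n):
    """Reorder bits [msb, z, ..., a, lsb] into [msb, lsb, z, a, ...]."""
    bits = _to_bits(n)
    out = []
    lo, hi = 0, WIDTH - 1
    while lo < hi:
        out.append(bits[lo])  # next bit from the high end
        out.append(bits[hi])  # next bit from the low end
        lo += 1
        hi -= 1
    return _from_bits(out)

def deinterlace(h):
    """Undo interlace(): even positions held the high-end bits, odd the low-end bits."""
    bits = _to_bits(h)
    front = bits[0::2]       # original high-end bits, already in order
    back = bits[1::2][::-1]  # original low-end bits, reversed back into order
    return _from_bits(front + back)

print(interlace(1) > interlace(2))  # True: the low bits land near the top
print(deinterlace(interlace(123)))  # 123
```

Because the lowest input bits end up near the top of the output, consecutive inputs tend to land far apart, which is the effect the reordering is after.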

(The above function is safe for numbers that fit in 32 bits. Larger numbers are guaranteed to cause collisions, because the masks discard every bit above the 32nd, and would need wider bit masks to prevent problems. That said, 32 bits is usually enough for any non-security uid.)

Also have a look at the multiplicative inverse answer given by Andy Hayden, below.

Answer 2

Answer by Darbio

Basically, you are looking for two-way encryption, and probably one that uses a salt.

You have a number of choices:

TripleDES
AES

Here is an example: "Simple insecure two-way 'obfuscation' for C#"

What language are you looking at? If .NET then look at the encryption namespace for some ideas.

Answer 3

Answer by Flipster

Why not just XOR with a nice long number?

Easy. Fast. Reversible.
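
Something along these lines, as a minimal sketch. KEY is an arbitrary constant picked for the example and obfuscate is a made-up name; XOR-ing with the same key twice gives back the original number.

```python
KEY = 0x2B992DDFA23249D6  # hypothetical "nice long number"; any fixed integer works

def obfuscate(n):
    """XOR with a fixed key. The same call also reverses it."""
    return n ^ KEY

masked = obfuscate(123)
print(masked)             # a large number that depends on KEY
print(obfuscate(masked))  # 123
```

One caveat: two inputs that differ only in their low bits produce outputs that also differ only in those bits, so neighbouring numbers still look alike in their most significant bits.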

Or, if this doesn't need to be terribly secure, you could convert from base 10 to some smaller base (like base 8 or base 4, depending on how long you want the numbers to be).
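
For the base conversion idea, a small sketch (to_base is a hypothetical helper; Python's built-in int(s, base) handles the reverse direction):

```python
def to_base(n, base):
    """Render a non-negative integer as a digit string in the given base (2 to 10)."""
    if n == 0:
        return "0"
    digits = []
    while n > 0:
        n, r = divmod(n, base)
        digits.append(str(r))
    return "".join(reversed(digits))

print(to_base(123, 8))  # 173
print(to_base(123, 4))  # 1323
print(int("1323", 4))   # 123
```

Note that a plain base change keeps the ordering of the inputs, so on its own it only makes the numbers longer.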

Answer 4

Answer by caf

What you are asking for is encryption. A block cipher in its basic mode of operation, ECB, reversibly maps an input block onto an output block of the same size. The input and output blocks can be interpreted as numbers.

For example, AES is a 128 bit block cipher, so it maps an input 128 bit number onto an output 128 bit number. If 128 bits is good enough for your purposes, then you can simply pad your input number out to 128 bits, transform that single block with AES, then format the output as a 128 bit number.

If 128 bits is too large, you could use a 64 bit block cipher, like 3DES, IDEA or Blowfish.
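
The ciphers named above need a third-party library in Python, so the sketch below swaps in a toy Feistel network built on hashlib to illustrate the same idea: a keyed, reversible mapping from one 64-bit number to another. It is not AES, 3DES, IDEA or Blowfish, and it should not be relied on for security. KEY, ROUNDS and the function names are all assumptions made for the example.

```python
import hashlib

KEY = b"example-key"  # hypothetical key; changing it changes the whole mapping
ROUNDS = 4            # assumed round count for this toy example
HALF_BITS = 32
MASK = (1 << HALF_BITS) - 1

def _round(value, i):
    """Keyed round function: hash the key, the round number and one 32-bit half."""
    data = KEY + bytes([i]) + value.to_bytes(4, "big")
    return int.from_bytes(hashlib.sha256(data).digest()[:4], "big")

def encrypt(n):
    """Map a 64-bit number to another 64-bit number."""
    left, right = (n >> HALF_BITS) & MASK, n & MASK
    for i in range(ROUNDS):
        left, right = right, left ^ _round(right, i)
    return (left << HALF_BITS) | right

def decrypt(c):
    """Run the rounds backwards to recover the original number."""
    left, right = (c >> HALF_BITS) & MASK, c & MASK
    for i in reversed(range(ROUNDS)):
        left, right = right ^ _round(left, i), left
    return (left << HALF_BITS) | right

token = encrypt(123)
print(token)
print(decrypt(token))  # 123
```

A real block cipher from a maintained library would be used the same way: pad the number to the block size, encrypt the single block, and read the result back as an integer.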

ECB mode is considered weak, but its weakness is the constraint that you have postulated as a requirement (namely, that the mapping be "deterministic"). This is a weakness, because once an attacker has observed that 123 maps to 9874362483910978, from then on whenever she sees the latter number, she knows the plaintext was 123. An attacker can perform frequency analysis and/or build up a dictionary of known plaintext/ciphertext pairs.

Answer 5

Answer by Andy Hayden

Another simple solution is to use multiplicative inverses (see Eric Lippert's blog):

we showed how you can take any two coprime positive integers x and m and compute a third positive integer y with the property that (x * y) % m == 1, and therefore that (x * z * y) % m == z % m for any positive integer z. That is, there always exists a “multiplicative inverse”, that “undoes” the results of multiplying by x modulo m.

We take a large number e.g. 4000000000 and a large co-prime number e.g. 387420489:

def rhash(n):
    return n * 387420489 % 4000000000

>>> rhash(12)
649045868

We first calculate the multiplicative inverse with modinv, which turns out to be 3513180409:

>>> 3513180409 * 387420489 % 4000000000
1
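
In case modinv is not at hand, here is one possible version based on the extended Euclidean algorithm. This is a sketch of the standard technique, not necessarily the implementation referred to above.

```python
def modinv(x, m):
    """Return y with (x * y) % m == 1, or raise ValueError if x and m are not coprime."""
    # Invariant: old_r == old_s * x (mod m) and r == s * x (mod m).
    old_r, r = x % m, m
    old_s, s = 1, 0
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
    if old_r != 1:
        raise ValueError("x and m are not coprime")
    return old_s % m

print(modinv(387420489, 4000000000))  # 3513180409
```

On Python 3.8 or later, the built-in pow(387420489, -1, 4000000000) computes the same value.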

Now, we can define the inverse:

def un_rhash(h):
    return h * 3513180409 % 4000000000

>>> un_rhash(649045868)  # un_rhash(rhash(12))
12

Note: This answer is fast to compute and works for non-negative numbers below 4000000000. If you need to handle larger numbers, choose a sufficiently large modulus (and another co-prime multiplier).

You may want to do this with hexadecimal (to pack the int):

def rhash(n):
    return "%08x" % (n * 387420489 % 4000000000)

>>> rhash(12)
'26afa76c'

def un_rhash(h):
    return int(h, 16) * 3513180409 % 4000000000

>>> un_rhash('26afa76c')  # un_rhash(rhash(12))
12

If you choose a relatively large co-prime then this will seem random, be non-sequential and also be quick to calculate.
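
To cover larger inputs, as mentioned in the note above, the same scheme can be moved to a bigger modulus. The sketch below assumes a modulus of 2**64 and an arbitrary odd multiplier (any odd number is coprime with a power of two). MODULUS, MULTIPLIER, rhash64 and un_rhash64 are names invented for this example, and pow with a negative exponent needs Python 3.8 or later.

```python
MODULUS = 2 ** 64
MULTIPLIER = 0x9E3779B97F4A7C15         # hypothetical odd constant, so coprime with 2**64
INVERSE = pow(MULTIPLIER, -1, MODULUS)  # modular inverse, Python 3.8+

def rhash64(n):
    """Scramble a number below 2**64 by modular multiplication."""
    return n * MULTIPLIER % MODULUS

def un_rhash64(h):
    """Undo rhash64() by multiplying with the inverse."""
    return h * INVERSE % MODULUS

print(rhash64(12))
print(un_rhash64(rhash64(12)))  # 12
```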
