Notice: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/9809381/

Time: 2020-09-10 23:23:20  Source: igfitidea

Hashing a String to a Numeric Value in PostgreSQL

Tags: postgresql, plpgsql, postgresql-8.4

Asked by Salman A. Kagzi

I need to convert strings stored in my database to a numeric value. The result can be an integer (preferred) or a bigint. The conversion is to be done on the database side, in a PL/pgSQL function.

Can someone please point me to an algorithm or any APIs that can be used to achieve this?

I have been searching for this on Google for hours and have not found anything useful so far :(

Answered by Daniel Vérité

Just keep the first 32 bits or 64 bits of the MD5 hash. Of course, it voids the main property of md5 (=the probability of collision being infinitesimal) but you'll still get a wide dispersion of values which presumably is good enough for your problem.
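To put a rough number on how much collision resistance is lost by truncating, the standard birthday approximation can be evaluated directly in SQL. This is only a sketch: it assumes the truncated hash values behave like uniformly random numbers, and the sample row counts are arbitrary.

```sql
-- Approximate probability of at least one collision among n random values
-- drawn from a space of 2^bits values: p ~ 1 - exp(-n(n-1) / (2 * 2^bits)).
select n,
       1 - exp(-(n * (n - 1)) / (2 * power(2::float8, 32))) as p_collision_32bit,
       1 - exp(-(n * (n - 1)) / (2 * power(2::float8, 64))) as p_collision_64bit
from (values (1000::float8), (100000::float8), (10000000::float8)) as t(n);
```

With a 32-bit result the collision probability becomes noticeable well before a million rows, which is one reason to prefer the bigint variant when the table is large.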

SQL functions derived from the other answers:

For bigint:

create function h_bigint(text) returns bigint as $$
 select ('x'||substr(md5($1),1,16))::bit(64)::bigint;
$$ language sql;

For int:

create function h_int(text) returns int as $$
 select ('x'||substr(md5($1),1,8))::bit(32)::int;
$$ language sql;
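
As a usage sketch, the function can be called like any other SQL function. The variant below, under the hypothetical name h_int_immutable, performs the same computation but is additionally declared IMMUTABLE and STRICT, on the assumption that you want to use it in an index expression and want NULL input to yield NULL. The bucket count of 16 is an arbitrary example.

```sql
-- Hypothetical variant of h_int: same computation, declared IMMUTABLE STRICT.
create function h_int_immutable(text) returns int as $$
 select ('x'||substr(md5($1),1,8))::bit(32)::int;
$$ language sql immutable strict;

-- The result is a signed 32-bit value, so it can be negative.
select h_int_immutable('hello, world');

-- Map a string to one of 16 buckets (0..15). The inner modulo may be negative,
-- hence the "+ 16" before the outer modulo.
select ((h_int_immutable('hello, world') % 16) + 16) % 16 as bucket;
```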

Answered by a_horse_with_no_name

You can create an MD5 hash value without problems:

select md5('hello, world');

This returns the hash as a string of hexadecimal digits.

Unfortunately there is no built-in function to convert hex to integer but as you are doing that in PL/pgSQL anyway, this might help:

https://stackoverflow.com/a/8316731/330315
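
One possible shape for such a conversion, reusing the bit-string cast from the answer above, is sketched here. The function name hex_to_int is hypothetical, and the sketch assumes the input holds at most 8 hex digits: lpad left-pads shorter input with zeros and silently truncates longer input to its first 8 characters.

```sql
-- Hypothetical helper: converts up to 8 hex digits to a signed 32-bit integer.
create function hex_to_int(text) returns int as $$
 select ('x' || lpad($1, 8, '0'))::bit(32)::int;
$$ language sql immutable strict;

select hex_to_int('ff');        -- 255
select hex_to_int('ffffffff');  -- -1 (all 32 bits set, read as a signed value)
```

The left-padding matters because a bit string shorter than the target width is padded with zeros on the right when cast, which would shift the value instead of preserving it.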

Answered by dbenhur

Must it be an integer? The pgcrypto module provides a number of standard hash functions (md5, sha1, etc.). They all return bytea. I suppose you could throw away some bits and convert the bytea to an integer.

bigint is too small to store a cryptographic hash. The largest non-bytea binary type Pg supports is uuid. You could cast a digest to uuid like this:

select ('{'||encode( substring(digest('foobar','sha256') from 1 for 16), 'hex')||'}')::uuid;
                 uuid                 
--------------------------------------
 c3ab8ff1-3720-e8ad-9047-dd39466b3c89
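
If a bigint is wanted after all, "throwing away some bits" could look like the sketch below. It assumes the pgcrypto module is installed (on PostgreSQL 9.1 and later this can be done with create extension; on 8.4 it is loaded from the contrib SQL script) and keeps only the first 8 bytes of the SHA-256 digest.

```sql
-- Assumes pgcrypto is available; on PostgreSQL 9.1+ it can be installed with:
-- create extension if not exists pgcrypto;

-- Keep the first 8 bytes of the digest and reinterpret them as a signed 64-bit integer.
select ('x' || encode(substring(digest('foobar', 'sha256') from 1 for 8), 'hex'))::bit(64)::bigint;
```

The result is signed, so digests whose first bit is set come out as negative numbers.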

Answered by dvlcube

This is an implementation of Java's String.hashCode():

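CREATE OR REPLACE FUNCTION hashCode(_string text) RETURNS INTEGER AS $$
DECLARE
  val_ TEXT[];     -- text[] rather than char[], so that space characters are preserved
  h_ BIGINT := 0;  -- bigint accumulator, so 31 * h_ cannot raise "integer out of range"
BEGIN
  -- Split the string into single characters.
  val_ := regexp_split_to_array(_string, '');

  FOR i IN 1 .. coalesce(array_length(val_, 1), 0)
  LOOP
    -- Keep only the low 32 bits, as Java's int arithmetic does.
    -- ascii() returns the code point, which equals Java's char value for BMP characters only.
    h_ := (31 * h_ + ascii(val_[i])) & 4294967295;
  END LOOP;

  -- Reinterpret the unsigned 32-bit value as a signed integer.
  IF h_ >= 2147483648 THEN
    h_ := h_ - 4294967296;
  END IF;
  RETURN h_::integer;
END;
$$ LANGUAGE plpgsql;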
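
A quick known-answer check, assuming the hashCode function above has been created, is to compare short strings against the values Java produces:

```sql
-- Assumes the hashCode function above exists in the current schema.
select hashCode('a');      -- 97
select hashCode('hello');  -- 99162322, the same value as "hello".hashCode() in Java
```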