How do you create a random string that's suitable for a session ID in PostgreSQL?

Notice: this page is a translation of a popular StackOverFlow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverFlow. Original URL: http://stackoverflow.com/questions/3970795/

Time: 2020-09-10 22:42:18  Source: igfitidea

Tags: postgresql, random

Asked by gersh

I'd like to make a random string for use in session verification using PostgreSQL. I know I can get a random number with SELECT random(), so I tried SELECT md5(random()), but that doesn't work. How can I do this?

Accepted answer by Szymon Lipiński

I'd suggest this simple solution, a function that returns a random string of the given length:

Create or replace function random_string(length integer) returns text as
$$
declare
  -- Alphabet of 62 ASCII alphanumeric characters.
  chars text[] := '{0,1,2,3,4,5,6,7,8,9,A,B,C,D,E,F,G,H,I,J,K,L,M,N,O,P,Q,R,S,T,U,V,W,X,Y,Z,a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,z}';
  result text := '';
begin
  if length < 0 then
    raise exception 'Given length cannot be less than 0';
  end if;
  for i in 1..length loop
    -- floor() gives a uniform index in 1..array_length; a plain cast would
    -- round and make the first and last characters half as likely.
    result := result || chars[1 + floor(random() * array_length(chars, 1))::integer];
  end loop;
  return result;
end;
$$ language plpgsql;

And the usage:

select random_string(15);

Example output:

select random_string(15) from generate_series(1,15);

  random_string
-----------------
 5emZKMYUB9C2vT6
 3i4JfnKraWduR0J
 R5xEfIZEllNynJR
 tMAxfql0iMWMIxM
 aPSYd7pDLcyibl2
 3fPDd54P5llb84Z
 VeywDb53oQfn9GZ
 BJGaXtfaIkN4NV8
 w1mvxzX33NTiBby
 knI1Opt4QDonHCJ
 P9KC5IBcLE0owBQ
 vvEEwc4qfV4VJLg
 ckpwwuG8YbMYQJi
 rFf6TchXTO3XsLs
 axdQvaLBitm6SDP
(15 rows)

Answer by Peter Eisentraut

You can fix your initial attempt like this:

SELECT md5(random()::text);

Much simpler than some of the other suggestions. :-)
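
The original attempt fails because md5() accepts text (or bytea), while random() returns double precision, so an explicit cast is needed. A minimal sketch of the difference; the column aliases are only illustrative:

```sql
-- Fails with: function md5(double precision) does not exist
-- SELECT md5(random());

-- Works: cast the number to text before hashing.
SELECT md5(random()::text) AS token;

-- The result is always 32 hexadecimal characters.
SELECT length(md5(random()::text)) AS token_length;
```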

Answer by grourk

Building on Marcin's solution, you could do this to use an arbitrary alphabet (in this case, all 62 ASCII alphanumeric characters):

SELECT array_to_string(array(
         SELECT substr('abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789', trunc(random() * 62)::integer + 1, 1)
         FROM   generate_series(1, 12)), '');

Answer by Evan Carroll

You can get 122 bits of randomness from a version 4 UUID (the other 6 of its 128 bits are fixed version and variant markers). This is the way to get the job done in modern PostgreSQL.

CREATE EXTENSION pgcrypto;
SELECT gen_random_uuid();

           gen_random_uuid            
--------------------------------------
 202ed325-b8b1-477f-8494-02475973a28f

It may be worth reading the docs on UUID too:

The data type uuid stores Universally Unique Identifiers (UUID) as defined by RFC 4122, ISO/IEC 9834-8:2005, and related standards. (Some systems refer to this data type as a globally unique identifier, or GUID, instead.) This identifier is a 128-bit quantity that is generated by an algorithm chosen to make it very unlikely that the same identifier will be generated by anyone else in the known universe using the same algorithm. Therefore, for distributed systems, these identifiers provide a better uniqueness guarantee than sequence generators, which are only unique within a single database.

How rare is a UUID collision, and how guessable is a UUID? Assuming they're random:

About 100 trillion version 4 UUIDs would need to be generated to have a 1 in a billion chance of a single duplicate ("collision"). The chance of one collision rises to 50% only after 2^61 UUIDs (2.3 x 10^18 or 2.3 quintillion) have been generated. Relating these numbers to databases, and considering the issue of whether the probability of a Version 4 UUID collision is negligible, consider a file containing 2.3 quintillion Version 4 UUIDs, with a 50% chance of containing one UUID collision. It would be 36 exabytes in size, assuming no other data or overhead, thousands of times larger than the largest databases currently in existence, which are on the order of petabytes. At the rate of 1 billion UUIDs generated per second, it would take 73 years to generate the UUIDs for the file. It would also require about 3.6 million 10-terabyte hard drives or tape cartridges to store it, assuming no backups or redundancy. Reading the file at a typical "disk-to-buffer" transfer rate of 1 gigabit per second would require over 3000 years for a single processor. Since the unrecoverable read error rate of drives is 1 bit per 10^18 bits read, at best, while the file would contain about 10^20 bits, just reading the file once from end to end would result, at least, in about 100 times more mis-read UUIDs than duplicates. Storage, network, power, and other hardware and software errors would undoubtedly be thousands of times more frequent than UUID duplication problems.
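
The first figure in the quote can be checked with the usual birthday approximation, p ≈ n^2 / (2 * 2^122), where 122 is the number of random bits in a version 4 UUID. A rough sketch in SQL, using double precision arithmetic:

```sql
-- n = 100 trillion (1e14) version 4 UUIDs, 2^122 possible values.
-- Birthday approximation: p ~ n^2 / (2 * N), valid while p is small.
SELECT power(1e14::double precision, 2)
       / (2 * power(2::double precision, 122)) AS collision_probability;
```

The result is on the order of 1e-9, which agrees with the quoted 1 in a billion.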

source: wikipedia

In summary,

  • UUID is standardized.
  • gen_random_uuid() yields 122 random bits stored in 128 bits (2**122 combinations); only the 6 version and variant bits are fixed, so almost nothing is wasted.
  • random() only generates 52 bits of randomness in PostgreSQL (2**52 combinations).
  • md5() stored as UUID is 128 bits, but it can only be as random as its input (52 bits if using random()).
  • md5() stored as text is 288 bits, but it can only be as random as its input (52 bits if using random()): over twice the size of a UUID and a fraction of the randomness.
  • md5(), as a hash, can be so optimized that it doesn't effectively do much.
  • UUID is highly efficient for storage: PostgreSQL provides a type that is exactly 128 bits, unlike text, varchar, etc., which are stored as a varlena with overhead for the length of the string.
  • PostgreSQL's nifty UUID type comes with some default operators, casts, and features.
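
As a sketch of how this might be used for sessions, the table below lets PostgreSQL assign the ID itself. The table and column names (app_session, session_id, created_at) are hypothetical. On PostgreSQL 13 and later gen_random_uuid() is built in; on older versions it comes from the pgcrypto extension shown above.

```sql
-- Hypothetical session table; names are illustrative only.
CREATE TABLE app_session (
    session_id uuid PRIMARY KEY DEFAULT gen_random_uuid(),
    created_at timestamptz NOT NULL DEFAULT now()
);

-- Create a session and read back the generated ID.
INSERT INTO app_session DEFAULT VALUES
RETURNING session_id;
```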

Answer by Marcin Raczkowski

I was playing with PostgreSQL recently, and I think I've found a slightly better solution that uses only built-in PostgreSQL functions, with no PL/pgSQL. The only limitation is that it currently generates only uppercase strings, or only numbers, or only lowercase strings.

template1=> SELECT array_to_string(ARRAY(SELECT chr((65 + floor(random() * 26)) :: integer) FROM generate_series(1,12)), '');
 array_to_string
-----------------
 TFBEGODDVTDM

template1=> SELECT array_to_string(ARRAY(SELECT chr((48 + floor(random() * 10)) :: integer) FROM generate_series(1,12)), '');
 array_to_string
-----------------
 868778103681

The second argument to the generate_series function dictates the length of the string.
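
The same pattern covers the lowercase case mentioned above. A sketch, assuming ASCII code 97 for 'a' and a length of 20:

```sql
-- 20 random lowercase letters. floor(random() * 26) yields 0..25 with equal
-- probability; round() would make the first and last letter half as likely.
SELECT array_to_string(
         ARRAY(SELECT chr((97 + floor(random() * 26))::integer)
               FROM generate_series(1, 20)),
         '');
```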

Answer by Andrew Wolfe

Please use string_agg!

-- floor(...) + 1 always lands in 1..62; ceil() could return 0 when random() returns exactly 0.
SELECT string_agg(substr('abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789', floor(random() * 62)::integer + 1, 1), '')
FROM   generate_series(1, 45);

I'm also using this with MD5 to generate a UUID. I just want a random value with more bits than a random() integer.
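
The answer does not show the MD5 step, so the following is only one possible reading: hash the random string with md5() and cast the 32 hex digits to uuid. The result has the uuid type but is not a standards-compliant version 4 UUID, and it is only as random as the string that was hashed. The alias pseudo_uuid is illustrative.

```sql
SELECT md5(string_agg(
         substr('abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789',
                floor(random() * 62)::integer + 1, 1), ''))::uuid AS pseudo_uuid
FROM   generate_series(1, 45);
```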

Answer by Jefferey Cave

While not active by default, you could activate one of the core extensions:

CREATE EXTENSION IF NOT EXISTS pgcrypto;

Then your statement becomes a simple call to gen_salt() which generates a random string:

select gen_salt('md5') from generate_series(1,4);

   gen_salt
--------------
 $1$M.QRlF4U
 $1$cv7bNJDM
 $1$av34779p
 $1$ZQkrCXHD

The leading prefix is a hash identifier. Several algorithms are available, each with its own identifier:

  • md5: $1$
  • bf: $2a$06$
  • des: no identifier
  • xdes: _J9..

More information on extensions:



EDIT

As indicated by Evan Carroll, as of v9.4 you can use gen_random_uuid():

http://www.postgresql.org/docs/9.4/static/pgcrypto.html

Answer by Patrick

I do not think that you are looking for a random string per se. What you would need for session verification is a string that is guaranteed to be unique. Do you store session verification information for auditing? In that case you need the string to be unique between sessions. I know of two, rather simple approaches:

  1. Use a sequence. Good for use on a single database.
  2. Use a UUID. Universally unique, so good in distributed environments too.

UUIDs are guaranteed to be unique by virtue of their generation algorithm; effectively, it is extremely unlikely that you will generate two identical numbers on any machine, at any time, ever (note that this is a much stronger guarantee than for random strings, which have a far smaller periodicity than UUIDs).

You need to load the uuid-ossp extension to use UUIDs. Once installed, call any of the available uuid_generate_vXXX() functions in your SELECT, INSERT or UPDATE calls. The uuid type is a 16-byte numeral, but it also has a string representation.
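
For example, a minimal sketch using the version 4 (random) generator from uuid-ossp; the extension name needs double quotes because of the hyphen:

```sql
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";

-- Native 16-byte uuid value.
SELECT uuid_generate_v4();

-- The same kind of value in its 36-character string representation.
SELECT uuid_generate_v4()::text;
```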

Answer by Jared Beck

@Kavius recommended using pgcrypto, but instead of gen_salt, what about gen_random_bytes? And how about sha512 instead of md5?

create extension if not exists pgcrypto;
select digest(gen_random_bytes(1024), 'sha512');

Docs:

F.25.5. Random-Data Functions

gen_random_bytes(count integer) returns bytea

Returns count cryptographically strong random bytes. At most 1024 bytes can be extracted at a time. This is to avoid draining the randomness generator pool.
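
digest() returns bytea, so for a session ID you would probably want a text encoding. A sketch, assuming 32 random bytes are enough and hex output is acceptable; the session_id alias is illustrative:

```sql
CREATE EXTENSION IF NOT EXISTS pgcrypto;

-- 32 cryptographically strong random bytes as 64 hex characters.
SELECT encode(gen_random_bytes(32), 'hex') AS session_id;

-- The hashed variant from above, hex-encoded (128 hex characters).
SELECT encode(digest(gen_random_bytes(1024), 'sha512'), 'hex') AS session_id;
```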

Answer by user516487

select * from md5(to_char(random(), '0.9999999999999999'));
