Disclaimer: this page is a Chinese-English translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/19530736/

Time: 2020-09-11 00:27:41  Source: igfitidea

How can I generate a unique string per record in a table in Postgres?

postgresql

Asked by John Bachir

Say I have a table like posts, which has typical columns like id, body, created_at. I'd like to generate a unique string with the creation of each post, for use in something like a url shortener. So maybe a 10-character alphanumeric string. It needs to be unique within the table, just like a primary key.

Ideally there would be a way for Postgres to handle both of these concerns:

  1. generate the string
  2. ensure its uniqueness

And they must go hand-in-hand, because my goal is to not have to worry about any uniqueness-enforcing code in my application.

Answered by Dwayne Towell

I don't claim the following is efficient, but it is how we have done this sort of thing in the past.

CREATE FUNCTION make_uid() RETURNS text AS $$
DECLARE
    new_uid text;
    done bool;
BEGIN
    done := false;
    WHILE NOT done LOOP
        new_uid := md5(''||now()::text||random()::text);
        done := NOT exists(SELECT 1 FROM my_table WHERE uid=new_uid);
    END LOOP;
    RETURN new_uid;
END;
$$ LANGUAGE PLPGSQL VOLATILE;

make_uid() can be used as the default for a column in my_table. Something like:

ALTER TABLE my_table ADD COLUMN uid text NOT NULL DEFAULT make_uid();

md5(''||now()::text||random()::text) can be adjusted to taste. You could consider encode(...,'base64'), except that some of the characters used in base-64 are not URL friendly.
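
One way around the URL-unfriendly characters, offered as a sketch rather than as part of the original answer: decode the md5 hex digest to raw bytes, base64-encode those bytes, then use translate() to swap + and / for - and _ and to drop the = padding. The function name url_safe_uid is hypothetical.

```sql
-- Hypothetical helper (not from the original answer): URL-safe base64 of an md5 digest.
CREATE OR REPLACE FUNCTION url_safe_uid() RETURNS text AS $$
    SELECT translate(
        -- md5() returns 32 hex characters; decode them to 16 raw bytes first
        encode(decode(md5(now()::text || random()::text), 'hex'), 'base64'),
        '+/=',  -- characters to rewrite
        '-_'    -- '+' becomes '-', '/' becomes '_', '=' has no counterpart and is removed
    );
$$ LANGUAGE sql VOLATILE;

SELECT url_safe_uid();
```

The result is 22 characters drawn from A-Z, a-z, 0-9, - and _. This only changes the encoding: the value still needs the same uniqueness check (or a unique index) as make_uid().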

Answered by Daniel Vérité

Use a Feistel network. This technique works efficiently to generate unique random-looking strings in constant time without any collision.

For a version with about 2 billion possible strings (2^31) of 6 letters, see this answer.

For a 63-bit version based on bigint (9223372036854775808 distinct possible values), see this other answer.

You may change the round function as explained in the first answer to introduce a secret element, so that you get your own series of strings (not guessable).
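
Since the linked answers are not reproduced here, the following is only a minimal sketch of the idea: a 3-round Feistel permutation over 32-bit integers, modeled on the widely circulated pseudo_encrypt function. The function name and the round constants (1366, 150889, 714025) are assumptions for illustration and may differ from the linked answers. Changing those constants is how a secret element would be introduced.

```sql
-- Sketch of a Feistel permutation on int4 (assumed constants, illustrative only).
CREATE OR REPLACE FUNCTION pseudo_encrypt(value int) RETURNS int AS $$
DECLARE
    l1 int;
    l2 int;
    r1 int;
    r2 int;
    i  int := 0;
BEGIN
    -- split the input into two 16-bit halves
    l1 := (value >> 16) & 65535;
    r1 := value & 65535;
    WHILE i < 3 LOOP
        l2 := r1;
        -- round function: any deterministic function of r1 keeps the mapping one-to-one
        r2 := l1 # ((((1366 * r1 + 150889) % 714025) / 714025.0) * 32767)::int;
        l1 := l2;
        r1 := r2;
        i := i + 1;
    END LOOP;
    -- recombine the halves (swapped) into a single 32-bit value
    RETURN ((r1 << 16) + l1);
END;
$$ LANGUAGE plpgsql STRICT IMMUTABLE;

-- Feed it consecutive sequence values to get scattered but collision-free numbers.
SELECT i, pseudo_encrypt(i) FROM generate_series(1, 5) AS i;
```

Because a Feistel network is a bijection, distinct inputs always give distinct outputs, so no lookup against the table is needed. The integer result would still have to be converted to letters to obtain a string.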

Answered by BIOHAZARD

All existing answers are WRONG because they rely on a SELECT to check uniqueness while generating the unique code for each table record. Let us assume that we need a unique code per record at insert time. Imagine two concurrent INSERTs happening at the same moment (which happens more often than you think): both can generate the same code, because at the moment of each SELECT that code did not yet exist in the table. One INSERT will succeed and the other will fail.

First, let us create a table with a code field and add a unique index:

CREATE TABLE my_table
(
    code TEXT NOT NULL
);

CREATE UNIQUE INDEX ON my_table (lower(code));

Then we need a function or procedure (you can also use the same code inside a trigger) in which we 1. generate a new code, 2. try to insert a new record with the new code, and 3. if the insert fails, try again from step 1:

CREATE OR REPLACE PROCEDURE my_table_insert()
AS $$
DECLARE
    new_code TEXT;
BEGIN

    LOOP
        new_code := LOWER(SUBSTRING(MD5(''||NOW()::TEXT||RANDOM()::TEXT) FOR 8));
        BEGIN
            INSERT INTO my_table (code) VALUES (new_code);
            EXIT;
        EXCEPTION WHEN unique_violation THEN
            NULL;  -- collision: do nothing here, loop again with a fresh code
        END;
    END LOOP;

END;
$$ LANGUAGE PLPGSQL;

This is a guaranteed error-free solution, unlike the other solutions in this thread.
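
If the caller needs to know which code was stored, the same retry loop can be wrapped in a function that returns it. This is a sketch under assumptions: the name my_table_insert_returning and the index name my_table_code_key are hypothetical, and the table definition is repeated only so the example runs on its own.

```sql
-- Repeated here (with IF NOT EXISTS) so the example is self-contained.
CREATE TABLE IF NOT EXISTS my_table
(
    code TEXT NOT NULL
);
CREATE UNIQUE INDEX IF NOT EXISTS my_table_code_key ON my_table (lower(code));

-- Hypothetical variant of the procedure above that returns the inserted code.
CREATE OR REPLACE FUNCTION my_table_insert_returning() RETURNS text AS $$
DECLARE
    new_code text;
BEGIN
    LOOP
        new_code := lower(substring(md5(now()::text || random()::text) FOR 8));
        BEGIN
            INSERT INTO my_table (code) VALUES (new_code);
            RETURN new_code;  -- insert succeeded, hand the code back
        EXCEPTION WHEN unique_violation THEN
            NULL;  -- the unique index rejected a duplicate: retry with a new code
        END;
    END LOOP;
END;
$$ LANGUAGE plpgsql;

SELECT my_table_insert_returning();
```

The unique index, not a prior SELECT, decides whether a code is taken, which is why concurrent inserts cannot both keep the same value.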

Answered by holdfenytolvaj

The easiest way is probably to use a sequence to guarantee uniqueness (and append a fixed x-digit random number after the sequence value):

CREATE SEQUENCE test_seq;
CREATE TABLE test_table (
  id bigint NOT NULL DEFAULT (nextval('test_seq')::text || (LPAD(floor(random()*100000000)::text, 8, '0')))::bigint,
  txt TEXT
);
insert into test_table (txt) values ('1');
insert into test_table (txt) values ('2');
select id, txt from test_table;

However, this will waste a huge part of the available id range. (Note: the max bigint is 9223372036854775807, so with an 8-digit random number at the end you can only have about 92233720368 records, i.e. 9223372036854775807 / 10^8. Though 8 digits are probably not necessary. Also check the max number for your programming environment!)

Alternatively, you can use varchar for the id and even convert the above number with to_hex(), or change it to base36 as in the question below (but with base36, try not to expose it to customers, to avoid some funny strings showing up!):

PostgreSQL: Is there a function that will convert a base-10 int into a base-36 string?
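
As a rough illustration of the to_hex() variant mentioned above, the same sequence-plus-random number can be built and converted in one expression. The sequence name test_seq_hex is a hypothetical stand-in so this does not interfere with test_seq.

```sql
-- Hypothetical sequence used only for this sketch.
CREATE SEQUENCE IF NOT EXISTS test_seq_hex;

-- Sequence value followed by 8 zero-padded random digits, rendered as lowercase hex.
SELECT to_hex(
    (nextval('test_seq_hex')::text
     || lpad(floor(random() * 100000000)::text, 8, '0'))::bigint
) AS hex_id;
```

Uniqueness still comes from the sequence prefix. The hex form only shortens the string and makes it look less like a counter.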

Answered by Kuberchaun

Check out a blog by Bruce. This gets you part way there. You will have to make sure it doesn't already exist. Maybe concat the primary key to it?

Generating Random Data Via Sql

"Ever need to generate random data? You can easily do it in client applications and server-side functions, but it is possible to generate random data in sql. The following query generates five lines of 40-character-length lowercase alphabetic strings:"

SELECT
(
  SELECT string_agg(x, '')
  FROM (
    SELECT chr(ascii('a') + floor(random() * 26)::integer)
    FROM generate_series(1, 40 + b * 0)
  ) AS y(x)
)
FROM generate_series(1,5) as a(b);

Answered by Tomas Greif

Use the primary key in your data. If you really need an alphanumeric unique string, you can use base-36 encoding. In PostgreSQL you can use this function.

Example:

select base36_encode(generate_series(1000000000,1000000010));

GJDGXS
GJDGXT
GJDGXU
GJDGXV
GJDGXW
GJDGXX
GJDGXY
GJDGXZ
GJDGY0
GJDGY1
GJDGY2
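
The linked base36_encode function is not reproduced on this page. A minimal sketch of such a function, assuming an uppercase 0-9A-Z alphabet (which is consistent with the output above) and non-negative input, could look like this:

```sql
-- Sketch of a base-36 encoder; an assumption, not necessarily the linked function.
CREATE OR REPLACE FUNCTION base36_encode(n bigint) RETURNS text AS $$
DECLARE
    alphabet  constant text := '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ';
    remaining bigint := n;
    result    text := '';
BEGIN
    IF n < 0 THEN
        RAISE EXCEPTION 'base36_encode expects a non-negative value, got %', n;
    END IF;
    IF n = 0 THEN
        RETURN '0';
    END IF;
    WHILE remaining > 0 LOOP
        -- prepend the digit for the current remainder (substr is 1-based)
        result := substr(alphabet, (remaining % 36)::int + 1, 1) || result;
        remaining := remaining / 36;  -- integer division drops the digit just emitted
    END LOOP;
    RETURN result;
END;
$$ LANGUAGE plpgsql IMMUTABLE STRICT;

SELECT base36_encode(1000000000);  -- GJDGXS, matching the first row of the output above
```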