Notice: this page is a translated copy of a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/9543715/

Date: 2020-09-03 16:17:19  Source: igfitidea

Generating human-readable/usable, short but unique IDs

.net, database, identity

Asked by Kumar

  • Need to handle more than 1,000 but fewer than 10,000 new records per day

  • Cannot use GUIDs/UUIDs, auto-increment numbers, etc.

  • Ideally should be 5 or 6 characters long; can include letters, of course

  • Would like to reuse an existing, well-known algorithm, if available

Is there anything out there?

Answered by Paul Sasik

Base 62 is used by tinyurl and bit.ly for their abbreviated URLs. It's a well-understood method for creating "unique", human-readable IDs. Of course, you will have to store the created IDs and check for duplicates on creation to ensure uniqueness. (See code at bottom of answer.)

Base 62 uniqueness metrics

5 chars in base 62 will give you 62^5 unique IDs = 916,132,832 (~1 billion). At 10k IDs per day you will be OK for 91k+ days.

6 chars in base 62 will give you 62^6 unique IDs = 56,800,235,584 (56+ billion). At 10k IDs per day you will be OK for 5+ million days.

Base 36 uniqueness metrics

6 chars will give you 36^6 unique IDs = 2,176,782,336 (2+ billion)

7 chars will give you 36^7 unique IDs = 78,364,164,096 (78+ billion)
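
The figures above can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the class and method names (IdCapacity, Count, DaysUntilExhausted, CollisionProbability) are made up for this example. The last method uses the standard birthday-problem approximation, which is an addition not found in the answer, to estimate how likely at least one duplicate is when IDs are drawn at random.

```csharp
using System;

// Hypothetical helper for checking the capacity figures quoted above.
public static class IdCapacity
{
    // Number of distinct IDs of the given length over an alphabet of the given size.
    public static long Count(int alphabetSize, int length)
    {
        long total = 1;
        for (int i = 0; i < length; i++)
            total = checked(total * alphabetSize); // throws on overflow instead of wrapping
        return total;
    }

    // Whole days until the ID space is used up at a fixed daily rate.
    public static long DaysUntilExhausted(int alphabetSize, int length, long idsPerDay)
    {
        return Count(alphabetSize, length) / idsPerDay;
    }

    // Birthday-problem approximation: probability of at least one duplicate
    // after drawing idsGenerated random IDs from a space of spaceSize values.
    public static double CollisionProbability(long idsGenerated, long spaceSize)
    {
        double n = idsGenerated;
        return 1.0 - Math.Exp(-n * (n - 1) / (2.0 * spaceSize));
    }
}
```

For example, IdCapacity.Count(62, 5) gives 916,132,832 and IdCapacity.DaysUntilExhausted(62, 5, 10000) gives 91,613. With randomly generated IDs, duplicates become likely long before the space is exhausted, which is why the duplicate check mentioned in the answer matters.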

Code:

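using System;
using System.Text;

// Wrapper class added only so the test method compiles; the name is arbitrary.
public class RandomIdGeneratorTests
{
    public void TestRandomIdGenerator()
    {
        // create five IDs of six base 62 characters
        for (int i = 0; i < 5; i++) Console.WriteLine(RandomIdGenerator.GetBase62(6));

        // create five IDs of eight base 36 characters
        for (int i = 0; i < 5; i++) Console.WriteLine(RandomIdGenerator.GetBase36(8));
    }
}

public static class RandomIdGenerator
{
    // Digits, uppercase, then lowercase. The first 36 characters
    // (digits and uppercase) double as the base 36 alphabet.
    private static readonly char[] _base62chars =
        "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
        .ToCharArray();

    // System.Random is not thread-safe, so calls are serialized with a lock.
    private static readonly Random _random = new Random();
    private static readonly object _randomLock = new object();

    public static string GetBase62(int length)
    {
        return GetRandomString(length, 62);
    }

    public static string GetBase36(int length)
    {
        return GetRandomString(length, 36);
    }

    // Picks 'length' random characters from the first 'alphabetSize' entries of _base62chars.
    private static string GetRandomString(int length, int alphabetSize)
    {
        var sb = new StringBuilder(length);

        lock (_randomLock)
        {
            for (int i = 0; i < length; i++)
                sb.Append(_base62chars[_random.Next(alphabetSize)]);
        }

        return sb.ToString();
    }
}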

Output:

z5KyMg
wd4SUp
uSzQtH
UPrGAT
UIf2IS

QCF9GNM5
0UV3TFSS
3MG91VKP
7NTRF10T
AJK3AJU7
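
The answer says the created IDs must be stored and checked for duplicates, but the code above does not show that step. A minimal sketch of one way to do it follows. The UniqueIdIssuer class, its Next method and the maxAttempts parameter are hypothetical names for this example, and the in-memory HashSet merely stands in for whatever persistent store (for instance a database column with a unique index) a real application would use.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical sketch: random base 62 IDs with a duplicate check and retry.
public sealed class UniqueIdIssuer
{
    private const string Alphabet =
        "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";

    // Stand-in for persistent storage of IDs that were already handed out.
    private readonly HashSet<string> _issued = new HashSet<string>();
    private readonly Random _random = new Random();
    private readonly object _lock = new object();

    // Returns a random ID of the given length that this instance has not issued before.
    public string Next(int length, int maxAttempts = 10)
    {
        lock (_lock)
        {
            for (int attempt = 0; attempt < maxAttempts; attempt++)
            {
                var sb = new StringBuilder(length);
                for (int i = 0; i < length; i++)
                    sb.Append(Alphabet[_random.Next(Alphabet.Length)]);

                string candidate = sb.ToString();

                // HashSet.Add returns false when the value is already present.
                if (_issued.Add(candidate))
                    return candidate;
            }
        }

        throw new InvalidOperationException(
            "No unused ID found; the ID space may be nearly full.");
    }
}
```

Bounding the number of retries keeps the call from looping indefinitely once the ID space starts to fill up; at that point a longer ID length is probably the better fix.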

Answered by Slawa

I recommend http://hashids.org/ which converts any number (e.g. a DB ID) into a string (using a salt).

It allows decoding this string back to the number, so you don't need to store it in the database.

It has libraries for JavaScript, Ruby, Python, Java, Scala, PHP, Perl, Swift, Clojure, Objective-C, C, C++11, Go, Erlang, Lua, Elixir, ColdFusion, Groovy, Kotlin, Nim, VBA, CoffeeScript, Node.js and .NET.
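
To illustrate the idea of a string that decodes back to the number, here is a plain base 62 encoder and decoder. This is not the Hashids algorithm and does not use the Hashids library: there is no salt and no shuffling, so consecutive numbers produce visibly consecutive strings. The Base62Codec class and its method names are invented for this sketch.

```csharp
using System;
using System.Text;

// Hypothetical sketch of a reversible number-to-string mapping (no salt, unlike Hashids).
public static class Base62Codec
{
    private const string Alphabet =
        "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";

    // Converts a non-negative number (for example a database ID) to a base 62 string.
    public static string Encode(long value)
    {
        if (value < 0) throw new ArgumentOutOfRangeException(nameof(value));
        if (value == 0) return "0";

        var sb = new StringBuilder();
        while (value > 0)
        {
            // Prepend the least significant digit, then shift right by one base 62 digit.
            sb.Insert(0, Alphabet[(int)(value % 62)]);
            value /= 62;
        }
        return sb.ToString();
    }

    // Converts a string produced by Encode back to the original number.
    public static long Decode(string text)
    {
        if (string.IsNullOrEmpty(text)) throw new ArgumentException("Empty ID.", nameof(text));

        long value = 0;
        foreach (char c in text)
        {
            int digit = Alphabet.IndexOf(c);
            if (digit < 0) throw new FormatException("Invalid character: " + c);
            value = checked(value * 62 + digit); // throws if the ID is too long for a long
        }
        return value;
    }
}
```

With this sketch, Base62Codec.Encode(12345) yields "3D7" and Base62Codec.Decode("3D7") yields 12345. Because the mapping is a pure function of the number, only the number needs to be stored.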

Answered by Stijn de Witt

I had similar requirements to the OP. I looked into the available libraries, but most of them are based on randomness and I didn't want that. I could not really find anything that was not based on randomness and still very short, so I ended up rolling my own based on the technique Flickr uses, but modified to require less coordination and allow for longer periods offline.

In short:

  • A central server issues ID blocks consisting of 32 IDs each
  • The local ID generator maintains a pool of ID blocks to generate an ID every time one is requested. When the pool runs low it fetches more ID blocks from the server to fill it up again.
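
The two points above can be sketched roughly as follows. This is not the suid implementation; it is an in-process illustration under stated assumptions. The BlockServer and PooledIdGenerator classes, the lowWaterMark and maxBlocks parameters, and the rule that block b covers IDs b*32 through b*32+31 are all made up for this example. Only the block size of 32 comes from the answer.

```csharp
using System;
using System.Collections.Generic;

// Stand-in for the central server: hands out consecutive block numbers.
public sealed class BlockServer
{
    private long _nextBlock;
    private readonly object _lock = new object();

    public long NextBlock()
    {
        lock (_lock) { return _nextBlock++; }
    }
}

// Local generator (not thread-safe): draws IDs from a pool of blocks
// and asks the server for more blocks when the pool runs low.
public sealed class PooledIdGenerator
{
    public const int BlockSize = 32; // IDs per block, as described in the answer

    private readonly BlockServer _server;
    private readonly int _lowWaterMark;
    private readonly int _maxBlocks;
    private readonly Queue<long> _pool = new Queue<long>();
    private long _currentBlock;
    private int _offset = BlockSize; // forces a block to be taken on first use

    public PooledIdGenerator(BlockServer server, int lowWaterMark = 2, int maxBlocks = 8)
    {
        if (server == null) throw new ArgumentNullException(nameof(server));
        if (lowWaterMark < 0 || maxBlocks <= lowWaterMark)
            throw new ArgumentOutOfRangeException(nameof(maxBlocks));

        _server = server;
        _lowWaterMark = lowWaterMark;
        _maxBlocks = maxBlocks;
    }

    public long NextId()
    {
        if (_offset == BlockSize) // current block is used up
        {
            if (_pool.Count <= _lowWaterMark) Refill();
            _currentBlock = _pool.Dequeue();
            _offset = 0;
        }
        return _currentBlock * BlockSize + _offset++;
    }

    // Tops the pool up to its maximum size with fresh blocks from the server.
    private void Refill()
    {
        while (_pool.Count < _maxBlocks)
            _pool.Enqueue(_server.NextBlock());
    }
}
```

Because every block number is issued only once, two generators sharing a server never produce the same ID, and a generator only needs to reach the server when its pool runs low.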

Disadvantages:

  • Requires central coordination
  • IDs are more or less predictable (less so than regular DB IDs, but they aren't random)

Advantages:

  • Stays within 53 bits (JavaScript / PHP max size for integer numbers)
  • Very short IDs
  • Base 36 encoded, so very easy for humans to read, write and pronounce
  • IDs can be generated locally for a very long time before needing contact with the server again (depending on pool settings)
  • Theoretically no chance of collisions

I have published both a JavaScript library for the client side and a Java EE server implementation. Implementing servers in other languages should be easy as well.

Here are the projects:

suid - Distributed Service-Unique IDs that are short and sweet

suid-server-java - Suid-server implementation for the Java EE technology stack.

Both libraries are available under a liberal Creative Commons open source license. Hoping this may help someone else looking for short unique IDs.

Answered by Warren Smith

I used base 36 when I solved this problem for an application I was developing a couple of years back. I needed to generate a human-readable, reasonably unique number (within the current calendar year, anyway). I chose to use the time in milliseconds from midnight on Jan 1st of the current year (so the timestamps could repeat from one year to the next) and convert it to a base 36 number. If the system being developed ran into a fatal issue, it generated the base 36 number (7 chars), which was displayed to the end user via the web interface. The user could then relay the issue encountered (and the number) to a tech support person, who could use it to find the point in the logs where the stack trace started. A number like 56af42g7 is infinitely easier for a user to read and relay than a timestamp like 2016-01-21T15:34:29.933-08:00 or a random UUID like 5f0d3e0c-da96-11e5-b5d2-0a1d41d68578.
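
The scheme described above could look something like the sketch below. The answer gives no code, so the YearRelativeId class, its method names, the lowercase alphabet and the zero-padding to seven characters are all assumptions made for this illustration.

```csharp
using System;
using System.Text;

// Hypothetical sketch: milliseconds since the start of the year, written in base 36.
public static class YearRelativeId
{
    private const string Alphabet = "0123456789abcdefghijklmnopqrstuvwxyz";

    // Returns the number of milliseconds elapsed since midnight on Jan 1st of the
    // timestamp's year as a base 36 string, left-padded with zeros to 7 characters.
    public static string FromTimestamp(DateTime timestamp)
    {
        var startOfYear = new DateTime(timestamp.Year, 1, 1);
        long millis = (timestamp - startOfYear).Ticks / TimeSpan.TicksPerMillisecond;
        return ToBase36(millis).PadLeft(7, '0');
    }

    // Converts a non-negative number to base 36.
    public static string ToBase36(long value)
    {
        if (value < 0) throw new ArgumentOutOfRangeException(nameof(value));
        if (value == 0) return "0";

        var sb = new StringBuilder();
        while (value > 0)
        {
            sb.Insert(0, Alphabet[(int)(value % 36)]);
            value /= 36;
        }
        return sb.ToString();
    }
}
```

Seven characters are enough because even a leap year has 31,622,400,000 milliseconds, which is less than 36^7 = 78,364,164,096. Two events in the same millisecond would receive the same value, so the result is only reasonably unique, as the answer says.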
