Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me). Original question on Stack Overflow: http://stackoverflow.com/questions/529647/

Need a smaller alternative to GUID for DB ID but still unique and random for URL

Tags: c#, asp.net, database, url

Asked by uriDium

I have looked all over the place for this and can't seem to find a complete answer. If the answer already exists on Stack Overflow, I apologize in advance.

I want a unique and random ID so that users of my website can't guess the next number and simply hop to someone else's information. I plan to stick with an incrementing ID for the primary key, but also to store a random and unique ID (sort of a hash) for that row in the DB and put an index on it.

From my searching I realize that I would like to avoid collisions and I have read some mentions of SHA1.

My basic requirements are

  • Something smaller than a GUID. (Looks horrible in URL)
  • Must be unique
  • Avoid collisions
  • Not a long list of strange characters that are unreadable.

An example of what I am looking for would be www.somesite.com/page.aspx?id=AF78FEB

I am not sure whether I should be implementing this in the database (I am using SQL Server 2005) or in the code (I am using C# ASP.Net).

EDIT:

From all the reading I have done, I realize that this is security through obscurity. I do intend to have proper authentication and authorization for access to the pages, and I will use .NET's authentication and authorization framework. But suppose a legitimate user has logged in and is accessing a legitimate (but dynamically created) page filled with links to items that belong to him. For example, a link might be www.site.com/page.aspx?item_id=123. What is stopping him from clicking on that link, then altering the URL to www.site.com/page.aspx?item_id=456, which does NOT belong to him? I know some Java technologies like Struts (I stand to be corrected) store everything in the session and somehow work it out from that, but I have no idea how this is done.

Accepted answer by Greg

[In response to the edit]
You should consider query strings as "evil input". You need to programmatically check that the authenticated user is allowed to view the requested item.

if( !item456.BelongsTo(user123) )
{
  // Either show them one of their own items or show an error message.
}
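
A minimal sketch of such a check in C#, assuming each item records the ID of the user who owns it. The names `Item`, `OwnerId`, and `ItemAccess.CanView` are hypothetical and only illustrate the idea; they are not part of ASP.NET or any other framework.

```csharp
using System;

// Hypothetical item record; in a real application it would be loaded from the database.
public sealed class Item
{
    public int Id { get; set; }
    public int OwnerId { get; set; }
}

public static class ItemAccess
{
    // Returns true only when the item exists and belongs to the authenticated user.
    // The item ID arrives in the query string, so it is treated as untrusted input.
    public static bool CanView(Item item, int authenticatedUserId)
    {
        return item != null && item.OwnerId == authenticatedUserId;
    }
}
```

With a check like this on every request, it does not matter whether the ID in the URL is sequential or random: a guessed ID simply fails the ownership test.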

Answer by David Basarab

If you don't want other users to see other people's information, why don't you secure the page on which you are using the ID?

If you do that then it won't matter if you use an incrementing Id.

Answer by webjunkie

You could randomly generate a number. Check that this number is not already in the DB and use it. If you want it to appear as a random string you could just convert it to hexadecimal, so you get A-F in there just like in your example.
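
One way this might look in C#, as a rough sketch rather than a finished implementation. The `RandomIds` class and its `existingIds` set are hypothetical; the set stands in for a lookup against an indexed database column, and the 32-bit width is an arbitrary choice made for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Security.Cryptography;

public static class RandomIds
{
    // Generates a random 32-bit value, formatted as 8 uppercase hex characters,
    // that is not already present in existingIds. The set is a stand-in for a
    // "does this ID already exist?" query against the database.
    public static string NewHexId(ISet<string> existingIds)
    {
        var buffer = new byte[4];
        using (var rng = RandomNumberGenerator.Create())
        {
            while (true)
            {
                rng.GetBytes(buffer);
                string candidate = BitConverter.ToUInt32(buffer, 0).ToString("X8");
                // ISet<T>.Add returns false when the value is already present,
                // in which case we loop and draw another random value.
                if (existingIds.Add(candidate))
                {
                    return candidate;
                }
            }
        }
    }
}
```

The retry loop matters because a short random value will eventually repeat; a wider value (more bytes in `buffer`) makes repeats rarer at the cost of a longer string.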

Answer by Zhaph - Ben Duguid

Raymond Chen has a good article on why you shouldn't use "half a GUID", and he offers a suitable way to generate your own "not quite a GUID, but good enough" value:

GUIDs are globally unique, but substrings of GUIDs aren't

His strategy (without a specific implementation) was based on:

  • Four bits to encode the computer number,
  • 56 bits for the timestamp, and
  • four bits as a uniquifier.

We can reduce the number of bits to make the computer unique since the number of computers in the cluster is bounded, and we can reduce the number of bits in the timestamp by assuming that the program won't be in service 200 years from now.

You can get away with a four-bit uniquifier by assuming that the clock won't drift more than an hour out of skew (say) and that the clock won't reset more than sixteen times per hour.
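
Since the article gives no implementation, the following is only an illustrative sketch of how the three fields could be packed into one 64-bit value. The `CompactId.Pack` name, the field order, and the unit of the timestamp are all assumptions made for this example.

```csharp
using System;

public static class CompactId
{
    // Packs machine (4 bits), timestamp (56 bits) and uniquifier (4 bits)
    // into a single 64-bit value, most significant field first.
    public static ulong Pack(int machine, ulong timestamp, int uniquifier)
    {
        if (machine < 0 || machine > 0xF)
            throw new ArgumentOutOfRangeException(nameof(machine));
        if (timestamp >= (1UL << 56))
            throw new ArgumentOutOfRangeException(nameof(timestamp));
        if (uniquifier < 0 || uniquifier > 0xF)
            throw new ArgumentOutOfRangeException(nameof(uniquifier));

        return ((ulong)machine << 60) | (timestamp << 4) | (ulong)uniquifier;
    }
}
```

The result fits in 16 hex characters, half the length of a GUID. Note that a value built this way is unique but not random, so on its own it does not make the next ID hard to guess.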

Answer by Kibbee

How long is too long? You could convert the GUID to Base 64, which ends up making it quite a bit shorter.

Answer by Gumbo

A GUID is 128 bits. If you represent these bits not with a character set of just 16 characters (16 = 2^4, and 128/4 = 32 characters) but with a character set of, say, 64 characters (like Base 64), you end up with only 22 characters (64 = 2^6, and 128/6 = 21.333, so 22 characters).

Answer by Jeremy Boyd

Here is what I do when I want exactly what you are asking for.

  1. Create your GUID.
  2. Remove the dashes, and take a substring as long as you want your ID to be.
  3. Check the DB for that ID; if it already exists, go to step 1.
  4. Insert the record.

This is the simplest way to ensure it is obscured and unique.

Answer by CraigTP

UPDATE (4 Feb 2017):
Walter Stabosz discovered a bug in the original code. Further bugs were discovered upon investigation; however, extensive testing and reworking of the code by myself, the original author (CraigTP), has now fixed all of these issues. I've updated the code here with the correct working version, and you can also download a Visual Studio 2015 solution here which contains the "shortcode" generation code and a fairly comprehensive test suite to prove correctness.

One interesting mechanism I've used in the past is to use an incrementing integer/long internally, but to "map" that integer to an alphanumeric "code".

Example

Console.WriteLine($"1371 as a shortcode is: {ShortCodes.LongToShortCode(1371)}");
Console.WriteLine($"12345 as a shortcode is: {ShortCodes.LongToShortCode(12345)}");
Console.WriteLine($"7422822196733609484 as a shortcode is: {ShortCodes.LongToShortCode(7422822196733609484)}");

Console.WriteLine($"abc as a long is: {ShortCodes.ShortCodeToLong("abc")}");
Console.WriteLine($"ir6 as a long is: {ShortCodes.ShortCodeToLong("ir6")}");
Console.WriteLine($"atnhb4evqqcyx as a long is: {ShortCodes.ShortCodeToLong("atnhb4evqqcyx")}");    

// PLh7lX5fsEKqLgMrI9zCIA   
Console.WriteLine(GuidToShortGuid( Guid.Parse("957bb83c-5f7e-42b0-aa2e-032b23dcc220") ) );      

Code

The following code shows a simple class that will change a long to a "code" (and back again!):

public static class ShortCodes
{
    // You may change the "shortcodeKeyspace" field to contain as many or as few characters as you
    // please.  The more characters that are included in "shortcodeKeyspace", the shorter
    // the codes you can produce for a given long.
    private static readonly string shortcodeKeyspace = "abcdefghijklmnopqrstuvwxyz0123456789";

    public static string LongToShortCode(long number)
    {
        // Guard clause.  If passed 0 as input
        // we always return empty string.
        if (number == 0)
        {
            return string.Empty;
        }

        var keyspaceLength = shortcodeKeyspace.Length;
        var shortcodeResult = "";
        var numberToEncode = number;
        var i = 0;
        do
        {
            i++;
            var characterValue = numberToEncode % keyspaceLength == 0 ? keyspaceLength : numberToEncode % keyspaceLength;
            var indexer = (int) characterValue - 1;
            shortcodeResult = shortcodeKeyspace[indexer] + shortcodeResult;
            numberToEncode = ((numberToEncode - characterValue) / keyspaceLength);
        }
        while (numberToEncode != 0);
        return shortcodeResult;
    }

    public static long ShortCodeToLong(string shortcode)
    {
        var keyspaceLength = shortcodeKeyspace.Length;
        long shortcodeResult = 0;
        var shortcodeLength = shortcode.Length;
        var codeToDecode = shortcode;
        foreach (var character in codeToDecode)
        {
            shortcodeLength--;
            var codeChar = character;
            var codeCharIndex = shortcodeKeyspace.IndexOf(codeChar);
            if (codeCharIndex < 0)
            {
                // The character is not part of the keyspace and so entire shortcode is invalid.
                return 0;
            }
            try
            {
                checked
                {
                    shortcodeResult += (codeCharIndex + 1) * (long) (Math.Pow(keyspaceLength, shortcodeLength));
                }
            }
            catch(OverflowException)
            {
                // We've overflowed the maximum size for a long (possibly the shortcode is invalid or too long).
                return 0;
            }
        }
        return shortcodeResult;
    }
}

This is essentially your own base-X numbering system (where X is the number of unique characters in the shortcodeKeyspace string).

To make things unpredictable, start your internal incrementing numbering at something other than 1 or 0 (e.g. start at 184723) and also change the order of the characters in the shortcodeKeyspace string (i.e. use the letters A-Z and the numbers 0-9, but scramble their order within the string). This will help make each code somewhat unpredictable.

If you're using this to "protect" anything, this is still security by obscurity: a user who can observe enough of these generated codes can predict the code for a given long. The "security" (if you can call it that) is that the shortcodeKeyspace string is scrambled and remains secret.

EDIT: If you just want to generate a GUID and transform it into something that is still unique but contains fewer characters, this little function will do the trick:

public static string GuidToShortGuid(Guid gooid)
{
    string encoded = Convert.ToBase64String(gooid.ToByteArray());
    encoded = encoded.Replace("/", "_").Replace("+", "-");
    return encoded.Substring(0, 22);
}
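
Going the other way is not shown in the answer. The following is a sketch of a possible inverse, assuming the short form was produced exactly as above (URL-safe character swap, then the two trailing "=" padding characters dropped). The `ShortGuids.ShortGuidToGuid` name is hypothetical.

```csharp
using System;

public static class ShortGuids
{
    // Reverses the URL-safe Base64 encoding: restores '/' and '+',
    // re-appends the "==" padding removed by Substring(0, 22), then decodes.
    public static Guid ShortGuidToGuid(string shortGuid)
    {
        if (shortGuid == null || shortGuid.Length != 22)
        {
            throw new ArgumentException("Expected a 22-character short GUID.", nameof(shortGuid));
        }
        string base64 = shortGuid.Replace("_", "/").Replace("-", "+") + "==";
        return new Guid(Convert.FromBase64String(base64));
    }
}
```

Because the 22-character form carries all 128 bits, the round trip loses nothing, unlike taking a substring of the GUID's hex text.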

Answer by Emil Sit

Take your auto-increment ID and HMAC-SHA1 it with a secret known only to you. This will generate 160 random-looking bits that hide the real incremental ID. Then take a prefix of a length that makes collisions sufficiently unlikely for your application, say 64 bits, which is 8 bytes (16 characters in hex, or 11 in Base 64). Use this as your string.
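
A minimal sketch of this idea using .NET's `HMACSHA1` class. The `HmacIds.FromId` name, the choice to feed the ID in as decimal text, and the hex output format are assumptions made for this example; the 64-bit prefix follows the suggestion above.

```csharp
using System;
using System.Globalization;
using System.Security.Cryptography;
using System.Text;

public static class HmacIds
{
    // Returns the first 8 bytes (64 bits) of HMAC-SHA1(secret, id)
    // as 16 lowercase hex characters (two characters per byte).
    public static string FromId(long id, byte[] secret)
    {
        using (var hmac = new HMACSHA1(secret))
        {
            // Feed the ID in as decimal text so the result does not depend on byte order.
            byte[] input = Encoding.ASCII.GetBytes(id.ToString(CultureInfo.InvariantCulture));
            byte[] mac = hmac.ComputeHash(input); // 20 bytes for SHA-1
            var sb = new StringBuilder(16);
            for (int i = 0; i < 8; i++)
            {
                sb.Append(mac[i].ToString("x2"));
            }
            return sb.ToString();
        }
    }
}
```

The same ID and secret always give the same string, so the value can be recomputed on demand or stored in an indexed column next to the primary key.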

HMAC guarantees that no one without the secret can map from the bits shown back to the underlying number. Because you are hashing a unique auto-increment ID, you can be pretty sure that the result will be unique too; your risk of collisions comes from the likelihood of a 64-bit partial collision in SHA1. With this method, you can determine in advance whether you will have any collisions by pre-generating all the strings that it would generate (e.g. up to the number of rows you expect) and checking them.

Of course, if you are willing to specify a unique condition on your database column, then simply generating a totally random number will work just as well. You just have to be careful about the source of randomness.

Answer by uriDium

I have just had an idea, and I see Greg also pointed it out. I have the user stored in the session with a user ID. When I create my query, I will join on the Users table with that user ID; if the result set is empty, then we know he was hacking the URL and I can redirect to an error page.