Key/Value pairs in a database table

Notice: this page comes from a Chinese-English parallel translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original Stack Overflow URL: http://stackoverflow.com/questions/514603/

Retrieved: 2020-09-01 01:01:22. Source: igfitidea

Tags: sql, sql-server, tsql, entity-attribute-value

Asked by Bill Karwin

I need to design a key/value table in my database and I'm looking for guidance on the best way to do this. Basically, I need to be able to associate values with a dynamic set of named properties and apply them to an external key.

The operations I need to be able to support are:

  • Apply a key/value pair to a group of items
  • Enumerate all of the currently-active keys
  • Determine all of the items that have a value for a given key
  • Determine all of the items where the value associated with a given key matches some criteria.

It seems that the simplest way to do this is to define a table:

CREATE TABLE KeyValue (
  id    int,
  Key   varchar...,
  Value varchar...
);

It seems that I am likely to be duplicating a lot of data in the Key column, because any given key is likely to be defined for a large number of documents. Replacing the Key varchar with an integer lookup into another table seems to alleviate this problem (and make it significantly more efficient to enumerate all of the active keys), but it sticks me with the problem of maintaining that lookup table (upserting into it whenever I want to define a property, and potentially removing the entry any time a key/value is cleared).
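
To make the maintenance burden concrete, here is a minimal sketch of the lookup-table variant. It uses Python's built-in sqlite3 module as a stand-in for SQL Server, and the names `Keys`, `key_id`, `set_property`, `clear_property` and `active_keys` are assumptions made for illustration, not part of the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Keys (
        key_id INTEGER PRIMARY KEY,
        name   TEXT NOT NULL UNIQUE
    );
    CREATE TABLE KeyValue (
        id     INTEGER NOT NULL,
        key_id INTEGER NOT NULL REFERENCES Keys(key_id),
        value  TEXT,
        PRIMARY KEY (id, key_id)
    );
""")

def set_property(item_id, name, value):
    """Define a property, creating the key row on first use (the upsert)."""
    conn.execute("INSERT OR IGNORE INTO Keys (name) VALUES (?)", (name,))
    key_id = conn.execute(
        "SELECT key_id FROM Keys WHERE name = ?", (name,)).fetchone()[0]
    conn.execute(
        "INSERT OR REPLACE INTO KeyValue (id, key_id, value) VALUES (?, ?, ?)",
        (item_id, key_id, value))

def clear_property(item_id, name):
    """Remove a property, then drop any key that no row references any more."""
    conn.execute(
        "DELETE FROM KeyValue WHERE id = ? AND key_id = "
        "(SELECT key_id FROM Keys WHERE name = ?)", (item_id, name))
    conn.execute(
        "DELETE FROM Keys WHERE key_id NOT IN (SELECT key_id FROM KeyValue)")

def active_keys():
    """Enumerating keys only touches the small lookup table."""
    return [row[0] for row in conn.execute("SELECT name FROM Keys ORDER BY name")]

set_property(101, "color", "green")
set_property(102, "color", "blue")
set_property(101, "size", "large")
clear_property(101, "size")
print(active_keys())  # ['color']
```

Whether the cleanup in `clear_property` is worth doing on every delete, or better left to a periodic job, depends on the workload.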

What's the best way to do this?

Answered by Bill Karwin

You are employing a database model called Entity-Attribute-Value. This is a common way to store key/value pairs in a relational database, but it has a number of weaknesses with respect to database normalization and efficiency.

Yes, the table design you showed is the most common way to do it. In this design, every attribute of every entity gets a distinct row in your KeyValue table.

Apply a key/value pair to a group of items: You need to add one row for each item in the group.

INSERT INTO KeyValue (id, key, value) VALUES (101, 'color', 'green');
INSERT INTO KeyValue (id, key, value) VALUES (102, 'color', 'green');
INSERT INTO KeyValue (id, key, value) VALUES (103, 'color', 'green');

You may also prepare the INSERT statement with parameters and run it in a loop over a list of item ids.
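
A parameterized insert over a group of ids might look like the sketch below. It assumes Python's sqlite3 module rather than SQL Server, and `apply_to_group` is a hypothetical helper name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "key" is quoted because KEY is a reserved word in several SQL dialects.
conn.execute('CREATE TABLE KeyValue (id INTEGER, "key" TEXT, value TEXT)')

def apply_to_group(item_ids, key, value):
    """Run one prepared INSERT once per item id in the group."""
    conn.executemany(
        'INSERT INTO KeyValue (id, "key", value) VALUES (?, ?, ?)',
        [(item_id, key, value) for item_id in item_ids])
    conn.commit()

apply_to_group([101, 102, 103], "color", "green")
rows = conn.execute(
    'SELECT id, "key", value FROM KeyValue ORDER BY id').fetchall()
print(rows)
```

Binding the values as parameters also avoids building SQL strings by hand for each item.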

Enumerate all of the currently-active keys:

SELECT DISTINCT Key FROM KeyValue;

Determine all of the items that have a value for a given key:

SELECT id FROM KeyValue WHERE Key = 'color';

Determine all of the items where the value associated with a given key matches some criteria:

SELECT id FROM KeyValue WHERE Key = 'color' AND Value = 'green';
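
As a hedged illustration of this last query: filtering on the value alone also matches items that store the same value under a different key, so a criteria query normally constrains the key as well. The sketch assumes Python's sqlite3 module, and the sample rows and function names are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE KeyValue (id INTEGER, "key" TEXT, value TEXT)')
conn.executemany('INSERT INTO KeyValue VALUES (?, ?, ?)', [
    (101, "color", "green"),
    (102, "color", "red"),
    (103, "mood", "green"),   # same value, different key
])

def items_with_value(value):
    """Match on the value only, whatever key it belongs to."""
    sql = 'SELECT id FROM KeyValue WHERE value = ? ORDER BY id'
    return [r[0] for r in conn.execute(sql, (value,))]

def items_matching(key, value):
    """Match on the key and the value together."""
    sql = 'SELECT id FROM KeyValue WHERE "key" = ? AND value = ? ORDER BY id'
    return [r[0] for r in conn.execute(sql, (key, value))]

print(items_with_value("green"))         # [101, 103]
print(items_matching("color", "green"))  # [101]
```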

Some of the problems with Entity-Attribute-Value are:

  • No way to make sure keys are spelled the same for all items
  • No way to make some keys mandatory for all items (i.e. NOT NULL in a conventional table design).
  • All keys must use VARCHAR for the value; can't store different data types per key.
  • No way to use referential integrity; can't make a FOREIGN KEY that applies to values of some keys and not others.

Basically, Entity-Attribute-Value is not a normalized database design.
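
Two of the problems listed above are easy to reproduce. The sketch below assumes Python's sqlite3 module and invented sample data: a misspelled key is accepted without complaint, and numeric values stored as text compare as strings unless every query casts them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE KeyValue (id INTEGER, "key" TEXT, value TEXT)')
conn.executemany('INSERT INTO KeyValue VALUES (?, ?, ?)', [
    (101, "color", "green"),
    (102, "colour", "green"),  # misspelled key, nothing rejects it
    (101, "weight", "9"),
    (102, "weight", "10"),
])

# The typo shows up as a second, separate key.
keys = [r[0] for r in conn.execute(
    'SELECT DISTINCT "key" FROM KeyValue ORDER BY "key"')]

# Text comparison: '10' sorts before '5', so item 102 is missed.
heavier_as_text = [r[0] for r in conn.execute(
    'SELECT id FROM KeyValue WHERE "key" = \'weight\' '
    'AND value > \'5\' ORDER BY id')]

# Casting in the query gives the numeric answer.
heavier_as_number = [r[0] for r in conn.execute(
    'SELECT id FROM KeyValue WHERE "key" = \'weight\' '
    'AND CAST(value AS INTEGER) > 5 ORDER BY id')]

print(keys)               # ['color', 'colour', 'weight']
print(heavier_as_text)    # [101]
print(heavier_as_number)  # [101, 102]
```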

Answered by gavinandresen

Don't optimize this unless you have to. What is the average length of a key? Will this table be so big that it won't all fit into your server's memory if you implement it the naive way? I'd suggest implementing it the simplest way, measuring the performance, and re-implementing only if performance is a problem.

If performance is a problem, then using an integer key and a separate table is probably the way to go (JOINs on integer columns are typically faster than JOINs on variable-length string columns). But the first rule of optimizing is MEASURE FIRST: make sure your supposedly optimized code actually does make things run faster.
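
One way to measure first is a small timing harness like the sketch below. It assumes Python's sqlite3 and timeit modules, synthetic data, and hypothetical table names (`KeyValueText`, `KeyValueInt`, `Keys`). The numbers it prints say nothing about SQL Server or about real data, so treat it as a template for a measurement rather than a result.

```python
import sqlite3
import timeit

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE KeyValueText (id INTEGER, "key" TEXT, value TEXT);
    CREATE TABLE Keys (key_id INTEGER PRIMARY KEY, name TEXT UNIQUE);
    CREATE TABLE KeyValueInt (id INTEGER, key_id INTEGER, value TEXT);
""")

# Synthetic data: 400 items, each with the same 50 properties.
names = ["property_%03d" % n for n in range(50)]
conn.executemany("INSERT INTO Keys (key_id, name) VALUES (?, ?)",
                 list(enumerate(names, start=1)))
pairs = [(item, k) for item in range(400) for k in range(50)]
conn.executemany("INSERT INTO KeyValueText VALUES (?, ?, ?)",
                 [(item, names[k], "v") for item, k in pairs])
conn.executemany("INSERT INTO KeyValueInt VALUES (?, ?, ?)",
                 [(item, k + 1, "v") for item, k in pairs])

def count_text():
    """Naive design: filter directly on the string key."""
    return conn.execute(
        'SELECT COUNT(*) FROM KeyValueText WHERE "key" = ?',
        ("property_007",)).fetchone()[0]

def count_int():
    """Lookup-table design: join on the integer key."""
    return conn.execute(
        "SELECT COUNT(*) FROM KeyValueInt kv "
        "JOIN Keys k ON k.key_id = kv.key_id WHERE k.name = ?",
        ("property_007",)).fetchone()[0]

text_seconds = timeit.timeit(count_text, number=20)
int_seconds = timeit.timeit(count_int, number=20)
print("string key: %.4fs, integer key: %.4fs" % (text_seconds, int_seconds))
```

Both functions must return the same count before their timings are worth comparing, and adding indexes will change the picture for either design.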

Answered by Sam Saffron

An option that may be worth exploring is digesting the key using SHA1 or MD5 before inserting it into the table.

That will allow you to get rid of the lookup table, but you will not be able to iterate through the keys, because a hash only goes one way.
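
A minimal sketch of the digest idea, assuming Python's hashlib and sqlite3 modules and a hypothetical `key_hash` column: every key becomes a fixed 20-byte SHA1 value, lookups recompute the digest, and the original key names cannot be recovered from the table.

```python
import hashlib
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE KeyValue (id INTEGER, key_hash BLOB, value TEXT)")

def key_digest(key):
    """SHA1 of the key name: always 20 bytes, whatever the key length."""
    return hashlib.sha1(key.encode("utf-8")).digest()

def set_property(item_id, key, value):
    conn.execute("INSERT INTO KeyValue VALUES (?, ?, ?)",
                 (item_id, key_digest(key), value))

def items_with_key(key):
    """Lookups hash the requested key and compare digests."""
    sql = "SELECT id FROM KeyValue WHERE key_hash = ? ORDER BY id"
    return [r[0] for r in conn.execute(sql, (key_digest(key),))]

set_property(101, "color", "green")
set_property(102, "color", "blue")
set_property(102, "a much longer property name than the others", "yes")
print(items_with_key("color"))  # [101, 102]
```

`SELECT DISTINCT key_hash` would still count the distinct keys, but it returns digests rather than names.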

Answered by Learning

Create updatable views! Also check this for an example.
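
The linked example did not survive on this page, so the following is only a guess at the kind of thing meant: a view that presents one key as an ordinary column, made writable with INSTEAD OF triggers. It assumes Python's sqlite3 module (SQLite views are read-only without such triggers, and SQL Server has its own rules for updatable views). The names `ItemColor`, `ItemColor_insert` and `ItemColor_update` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE KeyValue (
        id INTEGER NOT NULL, "key" TEXT NOT NULL, value TEXT,
        PRIMARY KEY (id, "key")
    );
    -- One conventional-looking row per item, built from key/value rows.
    CREATE VIEW ItemColor AS
        SELECT id, value AS color FROM KeyValue WHERE "key" = 'color';
    -- Writes against the view are redirected to the underlying table.
    CREATE TRIGGER ItemColor_insert INSTEAD OF INSERT ON ItemColor
    BEGIN
        INSERT OR REPLACE INTO KeyValue (id, "key", value)
        VALUES (NEW.id, 'color', NEW.color);
    END;
    CREATE TRIGGER ItemColor_update INSTEAD OF UPDATE ON ItemColor
    BEGIN
        UPDATE KeyValue SET value = NEW.color
        WHERE id = OLD.id AND "key" = 'color';
    END;
""")

conn.execute("INSERT INTO ItemColor (id, color) VALUES (101, 'green')")
conn.execute("UPDATE ItemColor SET color = 'blue' WHERE id = 101")
stored = conn.execute('SELECT id, "key", value FROM KeyValue').fetchall()
print(stored)  # [(101, 'color', 'blue')]
```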

Answered by Nathan Feger

It seems to me like you might have a couple of design choices.

Choice 1: the two-table design you hinted at in your question

Keys (
 id int not null auto_increment
 key string/int
)
values (
 id int not null auto_increment
 key_id int
 value string/varchar/int
)

Choice 2: perhaps, as sambo99 pointed out, you could modify this:

keys (
 id int not null auto_increment
 key string/int
 hash_code int -- this would be computed by the inserting code, so that lookups would effectively have the id, and you can look them up directly
)

values (
 id int not null auto_increment -- this column might be nice since your hash_codes might collide, and this will make deletes/updates easier
 key_id int -- this column becomes optional
 hash_code int
 value string/varchar/int...
)
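
The collision concern in Choice 2 can be sketched as follows, assuming Python's sqlite3 module. The table is called `key_values` because VALUES is a reserved word, and `toy_hash` is a deliberately weak, hypothetical hash chosen so that a collision is easy to show; a real design would use a much stronger one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE keys (
        id INTEGER PRIMARY KEY, name TEXT UNIQUE, hash_code INTEGER);
    CREATE TABLE key_values (
        id INTEGER PRIMARY KEY, key_id INTEGER REFERENCES keys(id),
        hash_code INTEGER, value TEXT);
    CREATE INDEX key_values_hash ON key_values (hash_code);
""")

def toy_hash(name):
    # Deliberately weak so collisions are easy to show; not for real use.
    return sum(name.encode("utf-8")) % 1000

def add_value(name, value):
    """The inserting code computes hash_code, as the design suggests."""
    code = toy_hash(name)
    conn.execute("INSERT OR IGNORE INTO keys (name, hash_code) VALUES (?, ?)",
                 (name, code))
    key_id = conn.execute(
        "SELECT id FROM keys WHERE name = ?", (name,)).fetchone()[0]
    conn.execute(
        "INSERT INTO key_values (key_id, hash_code, value) VALUES (?, ?, ?)",
        (key_id, code, value))

def values_by_hash(name):
    """Direct lookup on hash_code alone, with no join."""
    sql = "SELECT value FROM key_values WHERE hash_code = ? ORDER BY id"
    return [r[0] for r in conn.execute(sql, (toy_hash(name),))]

def values_by_key(name):
    """Hash lookup confirmed against the real key through key_id."""
    sql = ("SELECT v.value FROM key_values v JOIN keys k ON k.id = v.key_id "
           "WHERE v.hash_code = ? AND k.name = ? ORDER BY v.id")
    return [r[0] for r in conn.execute(sql, (toy_hash(name), name))]

add_value("ab", "first")
add_value("ba", "second")   # same bytes, so the same toy hash as "ab"
print(values_by_hash("ab"))  # ['first', 'second']
print(values_by_key("ab"))   # ['first']
```

With any hash that can collide, dropping `key_id` means a hash-only lookup can return rows that belong to a different key.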

Answered by FistOfFury

Key-value pairs are generally not a good use of relational databases. The benefits of a relational database are the constraints, validation, and structure that go with it. By using a generic key-value structure in your table, you lose the validation and constraints that make relational databases good. If you want the flexible design of key-value pairs, you would be best served by a NoSQL database like MongoDB or its ilk.

Key-value storage (e.g. NoSQL databases) works best when the underlying data is unstructured, unpredictable, or changing often. If you don't have structured data, a relational database is going to be more trouble than it's worth, because you will need to make lots of schema changes and/or jump through hoops to conform your data to the ever-changing structure.

KVP / JSON / NoSQL is great because changes to the data structure do not require completely refactoring the data model. Adding a field to your data object is simply a matter of adding it to the data. The other side of the coin is that there are fewer constraints and validation checks in a KVP / NoSQL database than in a relational database, so your data might get messy.

There are performance and space-saving benefits to relational data models. Normalized relational data can make understanding and validating the data easier, because there are table key relationships and constraints to help you out. This will make your application easier to maintain and support in the long term. Another approach is to use a data abstraction layer in your code, like Django or SQLAlchemy for Python, or Entity Framework for .NET. That way, as your code changes, your database will change with it automatically.

One of the worst patterns I've seen is trying to have it both ways. Trying to put key-value pairs into a relational database is often a recipe for disaster. I would recommend, first and foremost, using the technology that suits your data.
