Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/649802/

Date: 2020-08-31 12:55:36  Source: igfitidea

How to pivot a MySQL entity-attribute-value schema

Tags: mysql, database-design, pivot, entity-attribute-value

Question

I need to design tables which store all the metadata of files (i.e., file name, author, title, date created), and custom metadata (which has been added to files by users, e.g. CustUseBy, CustSendBy). The number of custom metadata fields cannot be set beforehand. Indeed, the only way of determining what and how many custom tags have been added to files is to examine what exists in the tables.

To store this, I have created a base table (holding the metadata common to all files), an Attributes table (holding additional, optional attributes that may be set on files) and a FileAttributes table (which assigns a value to an attribute for a file).

CREATE TABLE FileBase (
    id VARCHAR(32) PRIMARY KEY,
    name VARCHAR(255) UNIQUE NOT NULL,
    title VARCHAR(255),
    author VARCHAR(255),
    created DATETIME NOT NULL
) Engine=InnoDB;

CREATE TABLE Attributes (
    id VARCHAR(32) PRIMARY KEY,
    name VARCHAR(255) NOT NULL,
    type VARCHAR(255) NOT NULL
) Engine=InnoDB;

CREATE TABLE FileAttributes (
    sNo INT UNSIGNED PRIMARY KEY AUTO_INCREMENT,
    fileId VARCHAR(32) NOT NULL,
    attributeId VARCHAR(32) NOT NULL,
    attributeValue VARCHAR(255) NOT NULL,
    FOREIGN KEY (fileId) REFERENCES FileBase (id),
    FOREIGN KEY (attributeId) REFERENCES Attributes (id)
 ) Engine=InnoDB;

Sample data:

INSERT INTO FileBase
(id,      title,  author,  name,        created)
  VALUES
('F001', 'Dox',   'vinay', 'story.dox', '2009/01/02 15:04:05'),
('F002', 'Excel', 'Ajay',  'data.xls',  '2009/02/03 01:02:03');

INSERT INTO Attributes
(id,      name,            type)
  VALUES
('A001', 'CustomAttr1',   'Varchar(40)'),
('A002', 'CustomUseDate', 'Datetime');

INSERT INTO FileAttributes 
(fileId, attributeId, attributeValue)
  VALUES
('F001', 'A001',      'Akash'),
('F001', 'A002',      '2009/03/02');

Now the problem is I want to show the data in a manner like this:

FileId, Title, Author, CustomAttri1, CustomAttr2, ...
F001    Dox    vinay   Akash         2009/03/02   ...
F002    Excel  Ajay     

What query will generate this result?

Answered by nawroth

The question mentions MySQL, and in fact this DBMS has a special function for this kind of problem: GROUP_CONCAT(expr). Take a look at the section on GROUP BY (aggregate) functions in the MySQL reference manual. The function was added in MySQL version 4.1. You'll be using GROUP BY FileID in the query.

I'm not really sure about how you want the result to look. If you want every attribute listed for every item (even if not set), it will be harder. However, this is my suggestion for how to do it:

SELECT bt.FileID, Title, Author, 
 GROUP_CONCAT(
  CONCAT_WS(':', at.AttributeName, at.AttributeType, avt.AttributeValue) 
  ORDER BY at.AttributeName SEPARATOR ', ') 
FROM BaseTable bt JOIN AttributeValueTable avt ON avt.FileID=bt.FileID 
 JOIN AttributeTable at ON avt.AttributeId=at.AttributeId 
GROUP BY bt.FileID;

This gives you all attributes in the same order, which could be useful. The output will be like the following:

'F001', 'Dox', 'vinay', 'CustomAttr1:varchar(40):Akash, CustomUseDate:Datetime:2009/03/02'

This way you only need a single DB query, and the output is easy to parse. If you want to store the attributes as real Datetime etc. in the DB, you'd need to use dynamic SQL, but I'd steer clear of that and store the values in varchars.
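As a rough illustration of the "easy to parse" claim, here is a minimal Python sketch that splits one concatenated string of the form shown above into (name, type, value) triples. The function name parse_attributes is hypothetical, and the sketch assumes that attribute names and types never contain ':' and that values never contain ', '; real data may need a less common separator.

```python
def parse_attributes(concatenated):
    """Split 'name:type:value, name:type:value' into a list of triples."""
    if not concatenated:  # NULL (None) or empty string: no attributes set
        return []
    triples = []
    for item in concatenated.split(", "):
        # maxsplit=2 keeps any ':' inside the value (e.g. a time of day) intact
        name, attr_type, value = item.split(":", 2)
        triples.append((name, attr_type, value))
    return triples


row = "CustomAttr1:varchar(40):Akash, CustomUseDate:Datetime:2009/03/02"
for name, attr_type, value in parse_attributes(row):
    print(name, attr_type, value)
```

Keep in mind that MySQL truncates GROUP_CONCAT results at group_concat_max_len (1024 by default), so files with many attributes may need that setting raised.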

Answered by Paul Dixon

The general form of such a query would be

SELECT file.*,
   attr1.value AS 'Attribute 1 Name', 
   attr2.value AS 'Attribute 2 Name', 
   ...
FROM
   file 
   LEFT JOIN attr AS attr1 
      ON(file.FileId=attr1.FileId and attr1.AttributeId=1)
   LEFT JOIN attr AS attr2 
      ON(file.FileId=attr2.FileId and attr2.AttributeId=2)
   ...

So you need to build your query dynamically from the attributes you need. In PHP-ish pseudocode:

$cols="file.*";
$joins="";

// One LEFT JOIN per attribute, each under its own alias
$rows=$db->GetAll("select * from Attributes");
foreach($rows as $idx=>$row)
{
   $alias="attr{$idx}";
   // Backtick-quote the column label, doubling any embedded backticks
   $label="`".str_replace("`", "``", $row['AttributeName'])."`";
   $cols.=", {$alias}.value as {$label}";
   $joins.="LEFT JOIN attr as {$alias} on ".
       "(file.FileId={$alias}.FileId and ".
       "{$alias}.AttributeId={$row['AttributeId']}) ";
}

$pivotsql="select $cols from file $joins";
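
The pseudocode above can also be sketched as a runnable example. The following is a minimal Python version, not the answer author's code: it uses the question's table names (FileBase, Attributes, FileAttributes) with simplified columns, runs against an in-memory SQLite database purely so that it is self-contained, and the helper name build_pivot_sql is hypothetical. With MySQL you would quote the column labels with backticks and use your driver's placeholder style instead of `?`.

```python
import sqlite3


def build_pivot_sql(attributes):
    """Build one LEFT JOIN per attribute; `attributes` is a list of (id, name)."""
    cols = ["f.id", "f.title", "f.author"]
    joins = []
    params = []
    for idx, (attr_id, attr_name) in enumerate(attributes):
        alias = f"attr{idx}"
        # Quote the column label as an identifier, doubling embedded quotes
        label = '"' + attr_name.replace('"', '""') + '"'
        cols.append(f"{alias}.attributeValue AS {label}")
        joins.append(
            f"LEFT JOIN FileAttributes AS {alias} "
            f"ON f.id = {alias}.fileId AND {alias}.attributeId = ?"
        )
        params.append(attr_id)  # the attribute id is bound, not interpolated
    sql = (
        f"SELECT {', '.join(cols)} FROM FileBase AS f "
        f"{' '.join(joins)} ORDER BY f.id"
    )
    return sql, params


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE FileBase (id TEXT PRIMARY KEY, title TEXT, author TEXT);
    CREATE TABLE Attributes (id TEXT PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE FileAttributes (fileId TEXT, attributeId TEXT, attributeValue TEXT);
    INSERT INTO FileBase VALUES ('F001', 'Dox', 'vinay'), ('F002', 'Excel', 'Ajay');
    INSERT INTO Attributes VALUES ('A001', 'CustomAttr1'), ('A002', 'CustomUseDate');
    INSERT INTO FileAttributes VALUES
        ('F001', 'A001', 'Akash'), ('F001', 'A002', '2009/03/02');
""")

attributes = conn.execute("SELECT id, name FROM Attributes ORDER BY id").fetchall()
sql, params = build_pivot_sql(attributes)
rows = conn.execute(sql, params).fetchall()
for row in rows:
    print(row)
```

Because every join is a LEFT JOIN, a file with no custom attributes (F002 here) still appears, with NULL in each attribute column.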

Answered by methai

If you're looking for something more usable (and joinable) than a group-concat result, try the solution below. I've created some tables very similar to your example so that the queries make sense.

This works when:

  • You want a pure SQL solution (no code, no loops)
  • You have a predictable set of attributes (i.e. not dynamic)
  • You are OK updating the query when new attribute types need to be added
  • You would prefer a result that can be JOINed to, UNIONed, or nested as a subselect

Table A (Files)

FileID, Title, Author, CreatedOn

Table B (Attributes)

AttrID, AttrName, AttrType [not sure how you use type...]

Table C (Files_Attributes)

FileID, AttrID, AttrValue

A traditional query would pull many redundant rows:

SELECT * FROM 
Files F 
LEFT JOIN Files_Attributes FA USING (FileID)
LEFT JOIN Attributes A USING (AttrID);
AttrID  FileID  Title           Author  CreatedOn   AttrValue   AttrName    AttrType
50      1       TestFile        Joe     2011-01-01  true        ReadOnly        bool
60      1       TestFile        Joe     2011-01-01  xls         FileFormat      text
70      1       TestFile        Joe     2011-01-01  false       Private         bool
80      1       TestFile        Joe     2011-01-01  2011-10-03  LastModified    date
60      2       LongNovel       Mary    2011-02-01  json        FileFormat      text
80      2       LongNovel       Mary    2011-02-01  2011-10-04  LastModified    date
70      2       LongNovel       Mary    2011-02-01  true        Private         bool
50      2       LongNovel       Mary    2011-02-01  true        ReadOnly        bool
50      3       ShortStory      Susan   2011-03-01  false       ReadOnly        bool
60      3       ShortStory      Susan   2011-03-01  ascii       FileFormat      text
70      3       ShortStory      Susan   2011-03-01  false       Private         bool
80      3       ShortStory      Susan   2011-03-01  2011-10-01  LastModified    date
50      4       ProfitLoss      Bill    2011-04-01  false       ReadOnly        bool
70      4       ProfitLoss      Bill    2011-04-01  true        Private         bool
80      4       ProfitLoss      Bill    2011-04-01  2011-10-02  LastModified    date
60      4       ProfitLoss      Bill    2011-04-01  text        FileFormat      text
50      5       MonthlyBudget   George  2011-05-01  false       ReadOnly        bool
60      5       MonthlyBudget   George  2011-05-01  binary      FileFormat      text
70      5       MonthlyBudget   George  2011-05-01  false       Private         bool
80      5       MonthlyBudget   George  2011-05-01  2011-10-20  LastModified    date

This coalescing query (approach using MAX) can merge the rows:

SELECT
F.*,
MAX( IF(A.AttrName = 'ReadOnly', FA.AttrValue, NULL) ) as 'ReadOnly',
MAX( IF(A.AttrName = 'FileFormat', FA.AttrValue, NULL) ) as 'FileFormat',
MAX( IF(A.AttrName = 'Private', FA.AttrValue, NULL) ) as 'Private',
MAX( IF(A.AttrName = 'LastModified', FA.AttrValue, NULL) ) as 'LastModified'
FROM 
Files F 
LEFT JOIN Files_Attributes FA USING (FileID)
LEFT JOIN Attributes A USING (AttrID)
GROUP BY
F.FileID;
FileID  Title           Author  CreatedOn   ReadOnly    FileFormat  Private LastModified
1       TestFile        Joe     2011-01-01  true        xls         false   2011-10-03
2       LongNovel       Mary    2011-02-01  true        json        true    2011-10-04
3       ShortStory      Susan   2011-03-01  false       ascii       false   2011-10-01
4       ProfitLoss      Bill    2011-04-01  false       text        true    2011-10-02
5       MonthlyBudget   George  2011-05-01  false       binary      false   2011-10-20
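
Why MAX works here: within each FileID group, the IF() yields the attribute's value on the one row that carries that attribute and NULL on all the others, and MAX ignores NULLs, so it simply picks out that single value. This relies on each file having at most one row per attribute. A minimal sketch of the same technique follows, in Python against an in-memory SQLite database (an assumption made only so the example is self-contained; it is a stand-in for MySQL). It uses the portable CASE WHEN spelling in place of MySQL's IF() and a deliberately reduced subset of the rows above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Files (FileID INTEGER PRIMARY KEY, Title TEXT, Author TEXT);
    CREATE TABLE Attributes (AttrID INTEGER PRIMARY KEY, AttrName TEXT);
    CREATE TABLE Files_Attributes (FileID INTEGER, AttrID INTEGER, AttrValue TEXT);
    INSERT INTO Files VALUES (1, 'TestFile', 'Joe'), (2, 'LongNovel', 'Mary');
    INSERT INTO Attributes VALUES (50, 'ReadOnly'), (60, 'FileFormat');
    -- Reduced sample: file 2 deliberately has no ReadOnly row here
    INSERT INTO Files_Attributes VALUES
        (1, 50, 'true'), (1, 60, 'xls'), (2, 60, 'json');
""")

PIVOT_SQL = """
    SELECT F.FileID, F.Title,
           MAX(CASE WHEN A.AttrName = 'ReadOnly'   THEN FA.AttrValue END) AS ReadOnly,
           MAX(CASE WHEN A.AttrName = 'FileFormat' THEN FA.AttrValue END) AS FileFormat
    FROM Files F
    LEFT JOIN Files_Attributes FA ON FA.FileID = F.FileID
    LEFT JOIN Attributes A ON A.AttrID = FA.AttrID
    GROUP BY F.FileID
    ORDER BY F.FileID
"""

pivoted = conn.execute(PIVOT_SQL).fetchall()
for row in pivoted:
    print(row)
```

An attribute that is not set for a file comes back as NULL (None in Python) rather than dropping the file from the result.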

Answered by S.Lott

This is the standard "rows to columns" problem in SQL.

It is most easily done outside SQL.

In your application, do the following:

  1. Define a simple class to contain the file, the system attributes, and a Collection of user attributes. A list is a good choice for this collection of customer attributes. Let's call this class FileDescription.

  2. Execute a simple join between the file and all of the customer attributes for the file.

  3. Write a loop to assemble FileDescriptions from the query result.

    • Fetch the first row, create a FileDescription and set the first customer attribute.

    • While there are more rows to fetch:

      • Fetch a row
      • If this row's file name does not match the FileDescription we're building: finish building a FileDescription; append this to a result Collection of File Descriptions; create a fresh, empty FileDescription with the given name and first customer attribute.
      • If this row's file name matches the FileDescription we're building: append another customer attribute to the current FileDescription
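
The steps above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not a definitive implementation: FileDescription carries only the file name and its custom attributes (the system attributes are omitted), the function name assemble is hypothetical, the rows are assumed to arrive sorted by file name, and a LEFT JOIN is assumed so that a file without attributes shows up as one row with empty attribute fields.

```python
from dataclasses import dataclass, field


@dataclass
class FileDescription:
    name: str
    attributes: list = field(default_factory=list)  # (attr_name, value) pairs


def assemble(rows):
    """Group (file_name, attr_name, attr_value) rows that are sorted by file name."""
    result = []
    current = None
    for file_name, attr_name, attr_value in rows:
        if current is None or current.name != file_name:
            # File name changed: the previous description is complete
            if current is not None:
                result.append(current)
            current = FileDescription(file_name)
        if attr_name is not None:  # None marks a file with no attributes
            current.attributes.append((attr_name, attr_value))
    if current is not None:
        result.append(current)  # do not forget the last description
    return result


rows = [
    ("data.xls", None, None),
    ("story.dox", "CustomAttr1", "Akash"),
    ("story.dox", "CustomUseDate", "2009/03/02"),
]
for description in assemble(rows):
    print(description.name, description.attributes)
```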

Answered by thoroc

I have been experimenting with the different answers, and methai's answer was the most convenient for me. My current project, although it does use Doctrine with MySQL, has quite a few loose tables.

The following is the result of my experience with Methai's solution:

create entity table

DROP TABLE IF EXISTS entity;
CREATE TABLE entity (
    id INT UNSIGNED PRIMARY KEY AUTO_INCREMENT,
    title VARCHAR(255),
    author VARCHAR(255),
    createdOn DATETIME NOT NULL
) Engine = InnoDB;

create attribute table

DROP TABLE IF EXISTS attribute;
CREATE TABLE attribute (
    id INT UNSIGNED PRIMARY KEY AUTO_INCREMENT,
    name VARCHAR(255) NOT NULL,
    type VARCHAR(255) NOT NULL
) Engine = InnoDB;

create attributevalue table

DROP TABLE IF EXISTS attributevalue;
CREATE TABLE attributevalue (
    id INT UNSIGNED PRIMARY KEY AUTO_INCREMENT,
    value VARCHAR(255) NOT NULL,
    attribute_id INT UNSIGNED NOT NULL,
    FOREIGN KEY(attribute_id) REFERENCES attribute(id)
 ) Engine = InnoDB;

create entity_attributevalue join table

DROP TABLE IF EXISTS entity_attributevalue;
CREATE TABLE entity_attributevalue (
    entity_id INT UNSIGNED NOT NULL,
    attributevalue_id INT UNSIGNED NOT NULL,
    FOREIGN KEY(entity_id) REFERENCES entity(id),
    FOREIGN KEY(attributevalue_id) REFERENCES attributevalue(id)
) Engine = InnoDB;

populate entity table

INSERT INTO entity
    (title, author, createdOn)
VALUES
    ('TestFile', 'Joe', '2011-01-01'),
    ('LongNovel', 'Mary', '2011-02-01'),
    ('ShortStory', 'Susan', '2011-03-01'),
    ('ProfitLoss', 'Bill', '2011-04-01'),
    ('MonthlyBudget', 'George', '2011-05-01'),
    ('Paper', 'Jane', '2012-04-01'),
    ('Essay', 'John', '2012-03-01'),
    ('Article', 'Dan', '2012-12-01');

populate attribute table

INSERT INTO attribute
    (name, type)
VALUES
    ('ReadOnly', 'bool'),
    ('FileFormat', 'text'),
    ('Private', 'bool'),
    ('LastModified', 'date');

populate attributevalue table

INSERT INTO attributevalue 
    (value, attribute_id)
VALUES
    ('true', '1'),
    ('xls', '2'),
    ('false', '3'),
    ('2011-10-03', '4'),
    ('true', '1'),
    ('json', '2'),
    ('true', '3'),
    ('2011-10-04', '4'),
    ('false', '1'),
    ('ascii', '2'),
    ('false', '3'),
    ('2011-10-01', '4'),
    ('false', '1'),
    ('text', '2'),
    ('true', '3'),
    ('2011-10-02', '4'),
    ('false', '1'),
    ('binary', '2'),
    ('false', '3'),
    ('2011-10-20', '4'),
    ('doc', '2'),
    ('false', '3'),
    ('2011-10-20', '4'),
    ('rtf', '2'),
    ('2011-10-20', '4');

populate entity_attributevalue table

INSERT INTO entity_attributevalue 
    (entity_id, attributevalue_id)
VALUES
    ('1', '1'),
    ('1', '2'),
    ('1', '3'),
    ('1', '4'),
    ('2', '5'),
    ('2', '6'),
    ('2', '7'),
    ('2', '8'),
    ('3', '9'),
    ('3', '10'),
    ('3', '11'),
    ('3', '12'),
    ('4', '13'),
    ('4', '14'),
    ('4', '15'),
    ('4', '16'),
    ('5', '17'),
    ('5', '18'),
    ('5', '19'),
    ('5', '20'),
    ('6', '21'),
    ('6', '22'),
    ('6', '23'),
    ('7', '24'),
    ('7', '25');

Showing all the records

SELECT * 
FROM `entity` e
LEFT JOIN `entity_attributevalue` ea ON ea.entity_id = e.id
LEFT JOIN `attributevalue` av ON ea.attributevalue_id = av.id
LEFT JOIN `attribute` a ON av.attribute_id = a.id;
id  title           author  createdOn           entity_id   attributevalue_id   id      value       attribute_id    id      name            type
1   TestFile        Joe     2011-01-01 00:00:00 1           1                   1       true        1               1       ReadOnly        bool
1   TestFile        Joe     2011-01-01 00:00:00 1           2                   2       xls         2               2       FileFormat      text
1   TestFile        Joe     2011-01-01 00:00:00 1           3                   3       false       3               3       Private         bool
1   TestFile        Joe     2011-01-01 00:00:00 1           4                   4       2011-10-03  4               4       LastModified    date
2   LongNovel       Mary    2011-02-01 00:00:00 2           5                   5       true        1               1       ReadOnly        bool
2   LongNovel       Mary    2011-02-01 00:00:00 2           6                   6       json        2               2       FileFormat      text
2   LongNovel       Mary    2011-02-01 00:00:00 2           7                   7       true        3               3       Private         bool
2   LongNovel       Mary    2011-02-01 00:00:00 2           8                   8       2011-10-04  4               4       LastModified    date
3   ShortStory      Susan   2011-03-01 00:00:00 3           9                   9       false       1               1       ReadOnly        bool
3   ShortStory      Susan   2011-03-01 00:00:00 3           10                  10      ascii       2               2       FileFormat      text
3   ShortStory      Susan   2011-03-01 00:00:00 3           11                  11      false       3               3       Private         bool
3   ShortStory      Susan   2011-03-01 00:00:00 3           12                  12      2011-10-01  4               4       LastModified    date
4   ProfitLoss      Bill    2011-04-01 00:00:00 4           13                  13      false       1               1       ReadOnly        bool
4   ProfitLoss      Bill    2011-04-01 00:00:00 4           14                  14      text        2               2       FileFormat      text
4   ProfitLoss      Bill    2011-04-01 00:00:00 4           15                  15      true        3               3       Private         bool
4   ProfitLoss      Bill    2011-04-01 00:00:00 4           16                  16      2011-10-02  4               4       LastModified    date
5   MonthlyBudget   George  2011-05-01 00:00:00 5           17                  17      false       1               1       ReadOnly        bool
5   MonthlyBudget   George  2011-05-01 00:00:00 5           18                  18      binary      2               2       FileFormat      text
5   MonthlyBudget   George  2011-05-01 00:00:00 5           19                  19      false       3               3       Private         bool
5   MonthlyBudget   George  2011-05-01 00:00:00 5           20                  20      2011-10-20  4               4       LastModified    date
6   Paper           Jane    2012-04-01 00:00:00 6           21                  21      doc         2               2       FileFormat      text
6   Paper           Jane    2012-04-01 00:00:00 6           22                  22      false       3               3       Private         bool
6   Paper           Jane    2012-04-01 00:00:00 6           23                  23      2011-10-20  4               4       LastModified    date
7   Essay           John    2012-03-01 00:00:00 7           24                  24      rtf         2               2       FileFormat      text
7   Essay           John    2012-03-01 00:00:00 7           25                  25      2011-10-20  4               4       LastModified    date
8   Article         Dan     2012-12-01 00:00:00 NULL        NULL                NULL    NULL        NULL            NULL    NULL            NULL

pivot table

SELECT e.*,
    MAX( IF(a.name = 'ReadOnly', av.value, NULL) ) as 'ReadOnly',
    MAX( IF(a.name = 'FileFormat', av.value, NULL) ) as 'FileFormat',
    MAX( IF(a.name = 'Private', av.value, NULL) ) as 'Private',
    MAX( IF(a.name = 'LastModified', av.value, NULL) ) as 'LastModified'
FROM `entity` e
LEFT JOIN `entity_attributevalue` ea ON ea.entity_id = e.id
LEFT JOIN `attributevalue` av ON ea.attributevalue_id = av.id
LEFT JOIN `attribute` a ON av.attribute_id = a.id
GROUP BY e.id;
id  title           author  createdOn           ReadOnly    FileFormat  Private LastModified
1   TestFile        Joe     2011-01-01 00:00:00 true        xls         false   2011-10-03
2   LongNovel       Mary    2011-02-01 00:00:00 true        json        true    2011-10-04
3   ShortStory      Susan   2011-03-01 00:00:00 false       ascii       false   2011-10-01
4   ProfitLoss      Bill    2011-04-01 00:00:00 false       text        true    2011-10-02
5   MonthlyBudget   George  2011-05-01 00:00:00 false       binary      false   2011-10-20
6   Paper           Jane    2012-04-01 00:00:00 NULL        doc         false   2011-10-20
7   Essay           John    2012-03-01 00:00:00 NULL        rtf         NULL    2011-10-20
8   Article         Dan     2012-12-01 00:00:00 NULL        NULL        NULL    NULL
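
The pivot query above lists its four attributes by hand, so it has to be edited whenever an attribute is added. As a hedged sketch of one way around that, the MAX( IF(...) ) column list could be generated from the attribute names in application code. The helper name pivot_columns is hypothetical, the names are hard-coded here where a real program would read them from the attribute table, and the hand-rolled quoting assumes MySQL's default SQL mode (backslash escapes enabled).

```python
def pivot_columns(attribute_names, name_expr="a.name", value_expr="av.value"):
    """Return one MAX( IF(...) ) select expression per attribute name."""
    columns = []
    for name in attribute_names:
        # String literal: escape backslashes first, then double single quotes
        literal = "'" + name.replace("\\", "\\\\").replace("'", "''") + "'"
        # Column alias: backtick-quoted identifier with embedded backticks doubled
        alias = "`" + name.replace("`", "``") + "`"
        columns.append(
            f"MAX( IF({name_expr} = {literal}, {value_expr}, NULL) ) AS {alias}"
        )
    return columns


# In practice: names = [row[0] for row in "SELECT name FROM attribute"]
names = ["ReadOnly", "FileFormat", "Private", "LastModified"]
query = "\n".join([
    "SELECT e.*,",
    "    " + ",\n    ".join(pivot_columns(names)),
    "FROM `entity` e",
    "LEFT JOIN `entity_attributevalue` ea ON ea.entity_id = e.id",
    "LEFT JOIN `attributevalue` av ON ea.attributevalue_id = av.id",
    "LEFT JOIN `attribute` a ON av.attribute_id = a.id",
    "GROUP BY e.id;",
])
print(query)
```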

Answered by MarmouCorp

However, there are solutions that turn rows into columns, i.e. transpose the data. This involves query tricks to do it in pure SQL, or you will have to rely on features only available in certain databases, such as pivot tables (or cross tables).

For example, you can see how to do this in Oracle (11g) here.

The programmatic version will be simpler to write and maintain and, moreover, will work with any database.

Answered by Sascha

This is a partial answer, since I do not know MySQL well. In MSSQL I would look at pivot tables or would create a temporary table in a stored procedure. It may well be hard going ...
