MySQL: Best practice for storing tags in a database?

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/3508207/

Time: 2020-08-31 16:53:54  Source: igfitidea

Best practice for storing tags in a database?

Tags: mysql, performance, optimization, tags, structure

Asked by John Dewans

I developed a site that uses tags (key words) in order to categorize photographs. Right now, what I have in my MySQL database is a table with the following structure:

image_id (int)
tag      (varchar(32))

Every time someone tags an image (if the tag is valid and has enough votes), it's added to the database. I don't think this is the optimal way of doing things: now that I have 5000+ images with tags, the tags table has over 40,000 entries. I fear that this will begin to affect performance (if it's not already affecting it).

I considered this other structure, thinking that it would be faster to fetch the tags associated with a particular image, but it looks horrible when I want to get all the tags, or the most popular one, for instance:

image_id (int)
tags     (text) //comma delimited list of tags for the image

Is there a correct way of doing this or are both ways more or less the same? Thoughts?
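
To make the trade-off between the two layouts concrete, here is a minimal sketch of the "most popular tag" lookup against each of them. It uses Python's standard-library sqlite3 module as a stand-in for MySQL so that it runs on its own; the table names image_tags and image_taglist and the sample rows are made up for illustration.

```python
import sqlite3
from collections import Counter

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL here

# Layout 1: one row per (image, tag) pair, as in the first structure
conn.execute("CREATE TABLE image_tags (image_id INTEGER, tag VARCHAR(32))")
conn.executemany(
    "INSERT INTO image_tags VALUES (?, ?)",
    [(1, "sunset"), (1, "beach"), (2, "sunset"), (3, "city")],
)

# Layout 2: one row per image holding a comma-delimited tag list
conn.execute("CREATE TABLE image_taglist (image_id INTEGER, tags TEXT)")
conn.executemany(
    "INSERT INTO image_taglist VALUES (?, ?)",
    [(1, "sunset,beach"), (2, "sunset"), (3, "city")],
)

# Layout 1: the database does the counting in a single GROUP BY
most_popular_rows = conn.execute(
    "SELECT tag, COUNT(*) AS uses FROM image_tags "
    "GROUP BY tag ORDER BY uses DESC, tag LIMIT 1"
).fetchone()

# Layout 2: every row has to be fetched and split in application code
counts = Counter()
for (tags,) in conn.execute("SELECT tags FROM image_taglist"):
    counts.update(tags.split(","))
most_popular_list = counts.most_common(1)[0]

print(most_popular_rows, most_popular_list)
```

Both lookups give the same answer on this data, but only the first layout lets the database do the aggregation.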

Answered by OMG Ponies

Use a many-to-many table to link a TAG record to an IMAGE record:

IMAGE

DROP TABLE IF EXISTS `example`.`image`;
CREATE TABLE  `example`.`image` (
  `image_id` int(10) unsigned NOT NULL auto_increment,
  PRIMARY KEY  (`image_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

TAG

DROP TABLE IF EXISTS `example`.`tag`;
CREATE TABLE  `example`.`tag` (
 `tag_id` int(10) unsigned NOT NULL auto_increment,
 `description` varchar(45) NOT NULL default '',
 PRIMARY KEY  (`tag_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

IMAGE_TAG_MAP

DROP TABLE IF EXISTS `example`.`image_tag_map`;
CREATE TABLE  `example`.`image_tag_map` (
 `image_id` int(10) unsigned NOT NULL default '0',
 `tag_id` int(10) unsigned NOT NULL default '0',
 PRIMARY KEY  (`image_id`,`tag_id`),
 KEY `tag_fk` (`tag_id`),
 CONSTRAINT `image_fk` FOREIGN KEY (`image_id`) REFERENCES `image` (`image_id`),
 CONSTRAINT `tag_fk` FOREIGN KEY (`tag_id`) REFERENCES `tag` (`tag_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
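
A minimal sketch of how these three tables might be queried. It is written in Python with the standard-library sqlite3 module as a stand-in for MySQL (an assumption made so the example is self-contained), so the DDL is simplified to what SQLite accepts. Table and column names follow the definitions above; the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL here
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
conn.executescript("""
CREATE TABLE image (image_id INTEGER PRIMARY KEY);
CREATE TABLE tag (
    tag_id INTEGER PRIMARY KEY,
    description VARCHAR(45) NOT NULL DEFAULT ''
);
CREATE TABLE image_tag_map (
    image_id INTEGER NOT NULL REFERENCES image (image_id),
    tag_id INTEGER NOT NULL REFERENCES tag (tag_id),
    PRIMARY KEY (image_id, tag_id)
);
CREATE INDEX tag_fk ON image_tag_map (tag_id);
INSERT INTO image (image_id) VALUES (1), (2), (3);
INSERT INTO tag (tag_id, description) VALUES (1, 'sunset'), (2, 'beach');
INSERT INTO image_tag_map (image_id, tag_id) VALUES (1, 1), (1, 2), (2, 1);
""")

# All tags attached to one image: join the map table to the tag table
tags_for_image = [
    row[0]
    for row in conn.execute(
        "SELECT t.description FROM tag AS t "
        "JOIN image_tag_map AS m ON m.tag_id = t.tag_id "
        "WHERE m.image_id = ? ORDER BY t.description",
        (1,),
    )
]

# Most popular tag: count map rows per tag
most_popular = conn.execute(
    "SELECT t.description, COUNT(*) AS uses FROM image_tag_map AS m "
    "JOIN tag AS t ON t.tag_id = m.tag_id "
    "GROUP BY t.tag_id, t.description "
    "ORDER BY uses DESC, t.description LIMIT 1"
).fetchone()

print(tags_for_image, most_popular)
```

Both of the lookups the question worries about (tags for one image, and the most popular tag) stay plain joins on integer keys with this layout.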

Answered by Dalius Šidlauskas

In a multi-tag search query, an image has to match every requested tag. Hence the image's tag set I has to be a superset of the requested tag set U.

I >= U

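Implementing this comparison in SQL is a bit of a challenge, as each image has to be qualified individually. Given that the tags form a unique set per image, an image qualifies when the number of its tags found in the requested list equals {n}, the number of requested tags: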

SELECT i.* FROM images AS i WHERE {n} = (
  SELECT COUNT(*) 
  FROM image_tags AS t 
  WHERE t.image_id = i.image_id
    AND t.tag IN ({tag1}, {tag2}, ... {tagn})
)

Schema:

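CREATE TABLE images (
  image_id varchar(32) NOT NULL,  -- MySQL requires a length; 32 is an assumption
  PRIMARY KEY (image_id)
);

CREATE TABLE image_tags (
  image_id varchar(32) NOT NULL,
  tag varchar(32) NOT NULL,       -- varchar(32) matches the tag column in the question
  PRIMARY KEY (image_id, tag)
);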
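
As a rough, runnable illustration of the count-based superset check, here is a sketch in Python using the standard-library sqlite3 module in place of MySQL. The helper name images_with_all_tags, the column lengths, and the sample data are assumptions made for the example; the query itself follows the shape shown above, with the placeholders built from the requested tag list.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL here
conn.executescript("""
CREATE TABLE images (image_id VARCHAR(32) NOT NULL, PRIMARY KEY (image_id));
CREATE TABLE image_tags (
    image_id VARCHAR(32) NOT NULL,
    tag VARCHAR(32) NOT NULL,
    PRIMARY KEY (image_id, tag)
);
INSERT INTO images VALUES ('a'), ('b'), ('c');
INSERT INTO image_tags VALUES
    ('a', 'sunset'), ('a', 'beach'), ('a', 'sea'),
    ('b', 'sunset'),
    ('c', 'beach'), ('c', 'sunset');
""")


def images_with_all_tags(conn, tags):
    """Return ids of images whose tag set contains every requested tag."""
    requested = sorted(set(tags))  # duplicates would break the count comparison
    if not requested:
        return []
    placeholders = ", ".join("?" for _ in requested)
    sql = (
        "SELECT i.image_id FROM images AS i WHERE ? = ("
        "  SELECT COUNT(*) FROM image_tags AS t"
        "  WHERE t.image_id = i.image_id"
        f"    AND t.tag IN ({placeholders})"
        ") ORDER BY i.image_id"
    )
    return [row[0] for row in conn.execute(sql, [len(requested), *requested])]


both = images_with_all_tags(conn, ["sunset", "beach"])
only_sunset = images_with_all_tags(conn, ["sunset"])
no_match = images_with_all_tags(conn, ["sea", "city"])
print(both, only_sunset, no_match)
```

The comparison only works because (image_id, tag) is the primary key, so an image can never contribute the same tag twice to the count.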

Answered by Matt Williamson

You can make a tags table, which is just an id and a tag with a unique constraint on tag, and then a photo_tags table, which has tag_id and photo_id. Insert a tag into the tags table only if it doesn't already exist.

Then queries such as "how many photos are tagged with a certain tag" compare on a primary key instead of doing varchar text comparison.
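
One possible sketch of this approach, again in Python with the standard-library sqlite3 module standing in for MySQL. The helper names tag_photo and count_photos, the column types, and the composite primary key on photo_tags are assumptions, not part of the answer. INSERT OR IGNORE is SQLite's spelling; MySQL's counterpart is INSERT IGNORE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL here
conn.executescript("""
CREATE TABLE tags (
    id INTEGER PRIMARY KEY,
    tag VARCHAR(32) NOT NULL UNIQUE
);
CREATE TABLE photo_tags (
    tag_id INTEGER NOT NULL REFERENCES tags (id),
    photo_id INTEGER NOT NULL,
    PRIMARY KEY (tag_id, photo_id)  -- assumed: one row per photo/tag pair
);
""")


def tag_photo(conn, photo_id, tag):
    """Attach a tag to a photo, creating the tag row only if it is new."""
    # The UNIQUE constraint makes the second insert of a tag a no-op
    conn.execute("INSERT OR IGNORE INTO tags (tag) VALUES (?)", (tag,))
    (tag_id,) = conn.execute(
        "SELECT id FROM tags WHERE tag = ?", (tag,)
    ).fetchone()
    conn.execute(
        "INSERT OR IGNORE INTO photo_tags (tag_id, photo_id) VALUES (?, ?)",
        (tag_id, photo_id),
    )
    return tag_id


def count_photos(conn, tag_id):
    """Count photos carrying a tag, comparing on the integer key only."""
    (n,) = conn.execute(
        "SELECT COUNT(*) FROM photo_tags WHERE tag_id = ?", (tag_id,)
    ).fetchone()
    return n


sunset_id = tag_photo(conn, 1, "sunset")
tag_photo(conn, 2, "sunset")  # reuses the existing 'sunset' row
beach_id = tag_photo(conn, 2, "beach")
tag_count = conn.execute("SELECT COUNT(*) FROM tags").fetchone()[0]
print(tag_count, count_photos(conn, sunset_id), count_photos(conn, beach_id))
```

The tag text is stored once, and the counting query never touches the varchar column.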
