MySQL: How to design a movie database?
Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it under the same license, but you must attribute it to the original authors (not me) and cite the original address: Stack Overflow
Original question: http://stackoverflow.com/questions/490464/
How to design a movie database?
Asked by Keith Donegan
I'm trying to get my head round this mind boggling stuff they call Database Design without much success, so I'll try to illustrate my problem with an example.
I am using MySQL and here is my question:
Say I want to create a database to hold my DVD collection. I have the following information that I want to include:
- Film Title
- Actors
- Running Time
- Genre
- Description
- Year
- Director
I would like to create relationships between these to make it more efficient but don't know how.
Here is what I'm thinking for the database design:
Films Table => filmid, filmtitle, runningtime, description
Year Table => year
Genre Table => genre
Director Table => director
Actors Table => actor_name
But, how would I go about creating relationships between these tables?
Also, I have created a unique ID for the Films Table with a primary key that automatically increments, do I need to create a unique ID for each table?
And finally if I were to update a new film into the database through a PHP form, how would I insert all of this data in (with the relationships and all?)
thanks for any help you can give, Keith
Answered by gregmac
You have to make a distinction between attributes and entities. An entity is a thing - usually a noun. An attribute is more like a piece of describing information. In database jargon, entity = table, attribute = field/column.
Having a separate table for certain things (let's use director as an example) is called normalizing. While it can be good in some circumstances, it can be unnecessary in others, since it generally makes queries more complicated (you have to join everything) and can make them slower.
In this case, having a year table is unnecessary, since there are no other attributes about a year, besides the year itself, that you would store. It is better to denormalize this and store the year in the film table itself.
Director, on the other hand, is different. Perhaps you'll want to store the director's first name, last name, date of birth, date of death (if applicable), etc. You obviously don't want to enter the director's birth date every time you enter a film that this person directs, so it makes sense to have a separate entity for a director.
Even if you didn't want to store all this information about the director (you just want their name), having a separate table for it (and using a surrogate key - I'll get to that in a second) is useful because it prevents typographic errors and duplicates - if you have someone's name spelled wrong or entered differently (first,last vs last,first), then if you try to find other movies they've directed, you'll fail.
Using a surrogate key (primary key) for tables is generally a good idea. Matching an integer is much faster than matching a string. It also allows you to freely change the name, without worrying about the foreign keys stored in other tables (the ID stays the same, so you don't have to do anything).
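To make the surrogate key point concrete, here is a minimal sketch. It uses Python's built-in sqlite3 module rather than MySQL so it runs without a server; the table and column names (directors, films, directorid) and the sample rows are invented for illustration. In MySQL the key column would typically be declared with AUTO_INCREMENT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked
conn.executescript("""
    CREATE TABLE directors (
        directorid INTEGER PRIMARY KEY,   -- surrogate key
        name       TEXT NOT NULL
    );
    CREATE TABLE films (
        filmid     INTEGER PRIMARY KEY,
        title      TEXT NOT NULL,
        directorid INTEGER NOT NULL REFERENCES directors(directorid)
    );
""")

# Invented sample data: the director's name is misspelled on purpose.
conn.execute("INSERT INTO directors (directorid, name) VALUES (1, 'Jane Do')")
conn.executemany(
    "INSERT INTO films (title, directorid) VALUES (?, ?)",
    [("Film A", 1), ("Film B", 1)],
)

# Fix the spelling in one place; the films rows only store the id,
# so nothing in the films table has to change.
conn.execute("UPDATE directors SET name = 'Jane Doe' WHERE directorid = 1")

rows = conn.execute("""
    SELECT f.title, d.name
    FROM films f
    JOIN directors d ON d.directorid = f.directorid
    ORDER BY f.title
""").fetchall()
print(rows)
```

Both films pick up the corrected name through the join, which is the benefit described above.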
You can really take this design quite far, and it's all a matter of figuring out what you want to be able to store in it.
For example, rather than have a single director per film, some films have multiple directors.. so there would be a many-to-many relationship between films and directors, so you'd need a table with eg:
films_directors => **filmid, directorid**
Taking it a step further, sometimes directors are also actors, and vice-versa. So rather than even have director and actor tables, you could have a single person table, and join that table in using a role table. The role table would hold various positions - eg, director, producer, star, extra, grip, editor.. and it would look more like:
films => **filmid**, title, otherstuff...
people => **personid**, name, ....
roles => **roleid**, role name, ....
film_people => **filmid, personid, roleid**
genre => **genreid**, name, ...
film_genre => **genreid, filmid**
You might also have a role_details field in the film_people table, which could contain extra information depending on the role (eg, the name of the part the actor is playing).
I'm also showing genre as a many<>many relationship, because a film may belong to multiple genres. If you didn't want this, then instead of the film_genre table, films would just contain a genreid.
Once this is set up, it is easy to query and find everything a given person has done, or everything a person has done as a director, or everyone who has ever directed a movie, or all the people involved with one specific movie.. It can go on and on.
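The queries described above could look roughly like this. This is a sketch only: it uses Python's built-in sqlite3 module instead of MySQL, leaves out the genre tables for brevity, and the column name rolename, the helper films_by_role, and all sample rows are assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE films  (filmid   INTEGER PRIMARY KEY, title    TEXT NOT NULL);
    CREATE TABLE people (personid INTEGER PRIMARY KEY, name     TEXT NOT NULL);
    CREATE TABLE roles  (roleid   INTEGER PRIMARY KEY, rolename TEXT NOT NULL UNIQUE);
    CREATE TABLE film_people (
        filmid       INTEGER NOT NULL REFERENCES films(filmid),
        personid     INTEGER NOT NULL REFERENCES people(personid),
        roleid       INTEGER NOT NULL REFERENCES roles(roleid),
        role_details TEXT,                    -- e.g. the name of the part played
        PRIMARY KEY (filmid, personid, roleid)
    );
    -- Invented sample data.
    INSERT INTO films  VALUES (1, 'Film A'), (2, 'Film B');
    INSERT INTO people VALUES (1, 'Person X'), (2, 'Person Y');
    INSERT INTO roles  VALUES (1, 'director'), (2, 'star');
    INSERT INTO film_people VALUES
        (1, 1, 1, NULL),        -- Person X directed Film A
        (1, 1, 2, 'Lead'),      -- ...and also starred in it
        (2, 2, 1, NULL),        -- Person Y directed Film B
        (2, 1, 2, 'Villain');   -- Person X starred in Film B
""")

def films_by_role(conn, person, role):
    """Return titles of films where `person` worked in the given `role`."""
    sql = """
        SELECT f.title
        FROM film_people fp
        JOIN films  f ON f.filmid   = fp.filmid
        JOIN people p ON p.personid = fp.personid
        JOIN roles  r ON r.roleid   = fp.roleid
        WHERE p.name = ? AND r.rolename = ?
        ORDER BY f.title
    """
    return [title for (title,) in conn.execute(sql, (person, role))]

directed = films_by_role(conn, "Person X", "director")
starred = films_by_role(conn, "Person X", "star")

# Everyone who has ever directed a movie.
all_directors = [name for (name,) in conn.execute("""
    SELECT DISTINCT p.name
    FROM film_people fp
    JOIN people p ON p.personid = fp.personid
    JOIN roles  r ON r.roleid   = fp.roleid
    WHERE r.rolename = 'director'
    ORDER BY p.name
""")]
print(directed, starred, all_directors)
```

Because a person's role is just another row in film_people, the same person can appear as both director and star of one film without any schema change.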
Answered by Matt Howell
What follows is not actual MySQL code. It seems like what you need is more of a conceptual start here. So here's a model of what your database should look like.
Actor table
- id (primary key)
- first name
- last name
- etc. (any additional columns you want to store on an actor)
Director table
- id
- first name
- last name
- etc.
Genre table
- id
- name
- etc.
Film table
- id
- title
- description
- running time
- release date
- director id -- this is a foreign key that refers to the id (the primary key) of the director who directed the film
- genre id -- like the director id, this refers to the id of the genre the film belongs to
Actor-film index table
- film id -- this is a foreign key that refers to the id of the film
- actor id -- this is a foreign key that refers to the id of one actor in the film.
For each actor in the film, you would add a row to the Actor-Film Index. So, if actors 5 and 13 (the primary keys for those actors) starred in film 4 (again, the primary key for that film), you'd have two rows reflecting that fact in your index: One with film id = 4, and actor id = 5, and another with film id = 4, and actor id = 13.
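The film 4 / actors 5 and 13 example can be written out as follows. This is a sketch using Python's built-in sqlite3 module rather than MySQL; the table name actor_film, the column names, and the film and actor names are placeholders, not part of the model above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE actor (
        id         INTEGER PRIMARY KEY,
        first_name TEXT,
        last_name  TEXT
    );
    CREATE TABLE film (
        id    INTEGER PRIMARY KEY,
        title TEXT NOT NULL
    );
    -- The actor-film index: one row per (film, actor) pair.
    CREATE TABLE actor_film (
        film_id  INTEGER NOT NULL REFERENCES film(id),
        actor_id INTEGER NOT NULL REFERENCES actor(id),
        PRIMARY KEY (film_id, actor_id)
    );
    -- Placeholder rows using the ids from the example.
    INSERT INTO film  (id, title) VALUES (4, 'Example Film');
    INSERT INTO actor (id, first_name, last_name)
        VALUES (5, 'Ann', 'Example'), (13, 'Bob', 'Sample');
    INSERT INTO actor_film (film_id, actor_id) VALUES (4, 5), (4, 13);
""")

# List the cast of film 4 by joining through the index table.
cast = conn.execute("""
    SELECT a.id, a.first_name || ' ' || a.last_name
    FROM actor_film af
    JOIN actor a ON a.id = af.actor_id
    WHERE af.film_id = ?
    ORDER BY a.id
""", (4,)).fetchall()
print(cast)
```

The composite primary key on (film_id, actor_id) is a design choice in this sketch: it stops the same actor from being linked to the same film twice.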
Hope that helps.
Also, this assumes that each film has exactly one director. If any film in your library has two directors (such as Slumdog Millionaire), you'd want to separate out the director id from the film table, and create a Director-Film index like the Actor-Film Index as above.
Answered by nickf
These are the tables I'd use:
films (_id_, title, runningtime, description)
genres (_id_, name)
people (_id_, name, birthdate, etc...)
roles (_roleid_, rolename)
filmgenres (_filmid_, _genreid_)
castandcrew (_filmid_, _roleid_, _personid_)
Instead of having a directors and actors table, just have one table of people. This can also include crew members (in case you want to track who the 2nd Junior Assistant Dolly Grip was). Each movie can be any number of genres (comedy and horror, for example). Plus, the people can take any number of roles on each film - there are quite a number of actor/directors out there.
The Roles table doesn't necessarily mean the character the actor is playing, but it could. It could be "Director", "Producer", "Actor"... or even "Luke Skywalker" if you wanted to get that finely-grained... I believe IMDB does that.
Hopefully the names of the fields above hint at the foreign keys, and I've put _underscores_ around the primary keys I'd use.
Answered by thursdaysgeek
Your Films table also needs links to the genre, director, and actors tables. Since the actors, at least, will be many-to-many (one film will list more than one actor, and one actor will be in more than one film), you'll need a table to link them.
Films Table => filmid, filmtitle, runningtime, description, genreid, directorid
Genre Table => genreid, genre
Director Table => directorid, director
Actors Table => actorid,actor_name
FilmActor link table => actorid, filmid (with a record linking each actor to each film)
Any table that might be many to many needs a linking table.
Answered by Mark Brackett
I have created a unique ID for the Films Table with a primary key that automatically increments, do I need to create a unique ID for each table?
Yes, each table must have a unique id. But that's not necessarily the auto-incrementing primary key - it's whatever makes that particular instance unique. For instance, for movies, I think it's common to use title + year of release - though you'd want to check with a movie buff (a domain expert) to be sure of that. The auto increment is a fallback - basically, for when you really don't have anything else that makes a row unique.
You may use an auto increment key for ease of use in joins and such, but you should have a unique constraint on the uniqueness fields anyway.
As for the actual design, I'd suggest something like:
Films => Primary Key(filmid), Unique Constraint(filmtitle, year),
runningtime, description,
Foreign Key(Genre), Foreign Key(DirectorId)
Genre Table => Primary Key(Genre)
Director Table => Primary Key(DirectorId), DirectorName
Actors Table => Primary Key(ActorId), ActorName
Films_Actors => Primary Key(Foreign Key(ActorId), Foreign Key(FilmId))
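As a rough illustration of combining an auto-increment key with a unique constraint on (filmtitle, year), here is a sketch using Python's built-in sqlite3 module in place of MySQL. Only the films table is shown, the sample titles are invented, and SQLite's AUTOINCREMENT stands in for MySQL's AUTO_INCREMENT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE films (
        filmid    INTEGER PRIMARY KEY AUTOINCREMENT,  -- convenient for joins
        filmtitle TEXT    NOT NULL,
        year      INTEGER NOT NULL,
        UNIQUE (filmtitle, year)                      -- what really identifies a film
    )
""")

insert = "INSERT INTO films (filmtitle, year) VALUES (?, ?)"
conn.execute(insert, ("Example Film", 1999))
conn.execute(insert, ("Example Film", 2010))  # same title, different year: allowed

# Entering the same title and year again violates the unique constraint.
try:
    conn.execute(insert, ("Example Film", 1999))
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

film_count = conn.execute("SELECT COUNT(*) FROM films").fetchone()[0]
print(duplicate_rejected, film_count)
```

Without the UNIQUE clause the third insert would succeed and simply receive a new filmid, which is exactly the kind of silent duplicate the constraint is meant to prevent.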
For the insert, well - frankly, it's a PITA. You need to insert in reverse order (and this is where auto increment keys can be an even bigger PITA - if you can add date of birth or something into the Actors and Directors table, then a unique constraint can make it easier).
So, you'd insert Actor(s), Director, Film, and then Films_Actors. Ideally, all in a single transaction. Also, I assume Genre is already filled in, and is a select list - so it doesn't need to be inserted.
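The insert order above (actors, director, film, then Films_Actors, all in one transaction) might be sketched like this. It uses Python's built-in sqlite3 module rather than PHP and MySQL; the helper add_film, the lowercase table names, and the sample values are assumptions for illustration. It also always creates new actor and director rows, where a real form handler would look up existing ones first. In MySQL, LAST_INSERT_ID() plays the role of lastrowid here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE genre    (genre      TEXT PRIMARY KEY);
    CREATE TABLE director (directorid INTEGER PRIMARY KEY, directorname TEXT NOT NULL);
    CREATE TABLE actors   (actorid    INTEGER PRIMARY KEY, actorname    TEXT NOT NULL);
    CREATE TABLE films (
        filmid     INTEGER PRIMARY KEY,
        filmtitle  TEXT    NOT NULL,
        year       INTEGER NOT NULL,
        genre      TEXT    NOT NULL REFERENCES genre(genre),
        directorid INTEGER NOT NULL REFERENCES director(directorid),
        UNIQUE (filmtitle, year)
    );
    CREATE TABLE films_actors (
        actorid INTEGER NOT NULL REFERENCES actors(actorid),
        filmid  INTEGER NOT NULL REFERENCES films(filmid),
        PRIMARY KEY (actorid, filmid)
    );
    INSERT INTO genre VALUES ('Drama');   -- genres are assumed to be pre-filled
""")

def add_film(conn, title, year, genre, director, actors):
    """Insert a film and everything it references as one transaction."""
    with conn:  # commits on success, rolls back if any statement raises
        actor_ids = [
            conn.execute("INSERT INTO actors (actorname) VALUES (?)", (name,)).lastrowid
            for name in actors
        ]
        director_id = conn.execute(
            "INSERT INTO director (directorname) VALUES (?)", (director,)
        ).lastrowid
        film_id = conn.execute(
            "INSERT INTO films (filmtitle, year, genre, directorid) VALUES (?, ?, ?, ?)",
            (title, year, genre, director_id),
        ).lastrowid
        conn.executemany(
            "INSERT INTO films_actors (actorid, filmid) VALUES (?, ?)",
            [(actor_id, film_id) for actor_id in actor_ids],
        )
    return film_id

film_id = add_film(conn, "Example Film", 1999, "Drama", "Jane Doe",
                   ["Ann Example", "Bob Sample"])
link_count = conn.execute(
    "SELECT COUNT(*) FROM films_actors WHERE filmid = ?", (film_id,)
).fetchone()[0]

# A genre that is not in the genre table breaks the foreign key, so the
# actor and director rows inserted earlier in that call are rolled back too.
try:
    add_film(conn, "Other Film", 2001, "Unknown", "John Roe", ["Cy Test"])
except sqlite3.IntegrityError:
    pass
actor_count = conn.execute("SELECT COUNT(*) FROM actors").fetchone()[0]
print(film_id, link_count, actor_count)
```

Wrapping the four steps in one transaction means a failure halfway through never leaves orphaned actors or directors behind.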
Answered by mmcdole
I realize your question has already been answered, however I wanted to point you to:
http://www.imdb.com/interfaces
IMDB provides flat-text files of their database (minus primary keys). You might find this useful to populate your database once you get going, or you could use it in your program/website to allow you to simply search for a movie's title to add to your "DVD Collection", and have the rest of the information pulled from these.
Answered by leancz
Sometimes actors are directors and vice versa, maybe you want a "people" table?
Answered by Dean Rather
You don't really need a YearTable; all you need are genre_id, director_id, and actor_id columns in your films table.
Also, your genre, director, and actor tables need their own unique IDs.
Edit: This is, of course, assuming that you're only going to have 1 genre, director, and actor for each movie. Which probably isn't the case.
To have many actors belonging to many movies, you will need a separate relations table. You could call it "moviesActors" (or actorsMovies) and each row will have an actor_id and a movie_id to say this actor was in this movie.
Answered by Cade Roux
Every table should have a primary key which is unique.
You should read up on database normalization.
A year table is probably unnecessary.
If it's year of release, say, then the year can be stored in the film.
If there are multiple directors on a film, then you would have a separate table which would hold the primary key of the film table and the director table. Similarly for any of the foreign key constraints which are many-to-one or many-to-many. In particular, I believe this would apply to the Actor.