SQL 用于练习的示例数据库
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/719259/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
Sample database for exercise
提问by n00ki3
I would like to play with a larger database to test my knowledge on SQL.
我想使用更大的数据库来测试我对 SQL 的了解。
Is there a huge .sql out there that I can use to play with SQL queries?
是否有一个巨大的 .sql 可以用来玩 SQL 查询?
采纳答案by karim79
You could try the classic MySQL world database.
你可以试试经典的MySQL 世界数据库。
The world.sql file is available for download here:
world.sql 文件可在此处下载:
http://dev.mysql.com/doc/index-other.html
http://dev.mysql.com/doc/index-other.html
Just scroll down to Example Databasesand you will find it.
只需向下滚动到示例数据库,您就会找到它。
回答by iberck
This is an online database but you can try with the stackoverflow database: http://data.stackexchange.com/stackoverflow/query/new
这是一个在线数据库,但您可以尝试使用 stackoverflow 数据库:http: //data.stackexchange.com/stackoverflow/query/new
You also can download its dumps here:
你也可以在这里下载它的转储:
回答by bendewey
Check out CodePlex for Microsoft SQL Server Community Projects & Samples
查看 CodePlex for Microsoft SQL Server 社区项目和示例
3rd party edit
第三方编辑
On top of the link above you might look at
在上面的链接的顶部,您可能会看到
- microsoft sql server samples on github
- the msft db product samples on codeplex
- the new Wide World Importers sample databaseinludes OLTP and an OLAP for sql server 2016 and later
- github 上的 microsoft sql server 示例
- Codeplex 上的msft db 产品示例
- 新的Wide World Importers 示例数据库包含 OLTP 和用于 sql server 2016 及更高版本的 OLAP
回答by bendewey
Why not download the English Wikipedia? There are compressed SQL files of various sizes, and it should certainly be large enough for you
为什么不下载英文维基百科?有各种大小的压缩SQL文件,当然对你来说应该足够大
The main articles are XML, so inserting them into the db is a bit more of a problem, but you might find there are other files there that suit you. For example, the inter-page links SQL file is 2.3GB compressed. Have a look at https://en.wikipedia.org/wiki/Wikipedia:Database_downloadfor more info.
主要文章是 XML,因此将它们插入 db 是一个问题,但您可能会发现那里还有其他适合您的文件。例如,页面间链接 SQL 文件被压缩为 2.3GB。查看https://en.wikipedia.org/wiki/Wikipedia:Database_download了解更多信息。
Oskar
奥斯卡
回答by Eng.Fouad
This is what I am using for learning sql: employees-db
这就是我用于学习 sql 的内容:employees-db
this is a sample database with an integrated test suite, used to test your applications and database servers
这是一个带有集成测试套件的示例数据库,用于测试您的应用程序和数据库服务器
3rd party edit
第三方编辑
According to launchpad.net the database has moved to github.
根据 launchpad.net数据库已移至 github。
The database contains about 300,000 employee records with 2.8 million salary entries. The export data is 167 MB, which is not huge, but heavy enough to be non-trivial for testing.
The data was generated, and as such there are inconsistencies and subtle problems. Rather than removing them, we decided to leave the contents untouched, and use these issues as data cleaning exercises.
该数据库包含大约 300,000 条员工记录和 280 万条工资条目。导出数据为 167 MB,虽然不是很大,但足以进行测试。
数据是生成的,因此存在不一致和微妙的问题。我们没有删除它们,而是决定保留内容不变,并将这些问题用作数据清理练习。
回答by Dan Lew
If you want a big database of real data to play with, you could sign up for the Netflix Prize contestand get access to their data, which is pretty large (a few gigs of entries).
如果您想要使用大型真实数据数据库,您可以报名参加Netflix Prize 竞赛并访问他们的数据,这些数据非常大(几场演出)。
3rd party edit
第三方编辑
The URL above does not contain the dataset anylonger (october 2016). The wikipedia page about the Netflix Prizereports that a law suit was settled regarding privacy concerns.
上面的 URL 不再包含数据集(2016 年 10 月)。关于 Netflix 奖的维基百科页面报告称,一项关于隐私问题的诉讼已经解决。
回答by tpdi
You want huge?
你要大吗?
Here's a small table: create table foo (id int not null primary key auto_increment, crap char(2000));
这是一个小表:create table foo (id int not null primary key auto_increment, crap char(2000));
insert into foo(crap) values ('');
插入 foo(crap) 值 ('');
-- each time you run the next line, the number of rows in foo doubles. insert into foo( crap ) select * from foo;
-- 每次运行下一行时, foo 中的行数加倍。插入 foo(废话) select * from foo;
run it twenty more times, you have over a million rows to play with.
再运行 20 次,你就有超过一百万行可以玩了。
Yes, if he's looking for looks of relations to navigate, this is not the answer. But if by huge he means to test performance and his ability to optimize, this will do it. I did exactly this (and then updated with random values) to test an potential answer I had for another question. (And didn't answer it, because I couldn't come up with better performance than what that asker had.)
是的,如果他正在寻找关系的外观来导航,这不是答案。但是,如果他想通过巨大的手段来测试性能和优化能力,那就可以了。我正是这样做的(然后用随机值更新)来测试我对另一个问题的潜在答案。(并且没有回答,因为我想不出比提问者更好的表现。)
Had he asked for "complex", I'd have gien a differnt answer. To me,"huge" implies "lots of rows".
如果他要求“复杂”,我会有不同的答案。对我来说,“巨大”意味着“很多行”。
Because you don't need huge to play with tables and relations. Consider a table, by itself, with no nullable columns. How many different kindsof rows can there be? Only one, as all columns must have some value as none can be null.
因为你不需要很大的玩表和关系。考虑一个表,它本身没有可以为空的列。可以有多少种不同的行?只有一个,因为所有列都必须有一些值,因为没有一个可以为空。
Every nullable column multiples by two the number of different kindsof rows possible: a row where that column is null, an row where it isn't null.
每个可为空的列乘以可能的不同类型行数的两倍:该列为空的行,不为空的行。
Now consider the table, not in isolation. Consider a table that is a child table: for every child that has an FK to the parent, that, is a many-to-one, there can be 0, 1 or many children. So we multiply by three times the count we got in the previous step (no rows for zero, one for exactly one, two rows for many). For any grandparent to which the parent is a many, another three.
现在考虑表格,而不是孤立的。考虑一个作为子表的表:对于每个对父级具有 FK 的子级,即多对一,可以有 0、1 或多个子级。所以我们乘以我们在上一步中得到的计数的三倍(零没有行,正好一行,多行两行)。对于父母是多的任何祖父母,另外三个。
For many-to-many relations, we can have have no relation, a one-to-one, a one-to-many, many-to-one, or a many-to-many. So for each many-to-many we can reach in a graph from the table, we multiply the rows by nine -- or just like two one-to manys. If the many-to-many also has data, we multiply by the nullability number.
对于多对多关系,我们可以没有关系、一对一、一对多、多对一或多对多。因此,对于我们可以从表中的图表中得出的每个多对多,我们将行乘以九——或者就像两个一对多。如果多对多也有数据,我们乘以可空性数。
Tables that we can't reach in our graph -- those that we have no direct or indirect FK to, don't multiply the rows in our table.
我们在图表中无法到达的表——那些我们没有直接或间接 FK 的表,不要将表中的行相乘。
By recursively multiplying the each table we can reach, we can come up with the number of rows needed to provide one of each "kind", and we need no more than those to test every possible relation in our schema. And we're nowhere near huge.
通过递归地乘以我们可以到达的每个表,我们可以得出提供每个“种类”所需的行数,并且我们只需要那些来测试我们模式中的每一个可能的关系。而且我们离巨大还差得很远。