用虚拟数据填充 SQL 表的最快方法
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/2899654/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
Quickest way to fill SQL Table with Dummy Data
提问by Alex
What is the quickest way to fill a SQL table with dummy data?
用虚拟数据填充 SQL 表的最快方法是什么?
I have a wide table with about 40 fields of different kinds (int, bit, varchar, etc.) and need to do some performance testing. I'm using SQL Server 2008.
我有一个包含大约 40 个不同类型(int、bit、varchar 等)字段的宽表,需要进行一些性能测试。我正在使用 SQL Server 2008。
Thank you!
谢谢!
采纳答案by SQLMenace
SQL Data Generatorby RedGate
RedGate 的SQL 数据生成器
Data generation in one click
一键生成数据
Realistic data based on column and table name
基于列和表名的真实数据
Data can be customized if desired
如果需要,可以自定义数据
Eliminates hours of tedious work
消除繁琐的工作时间
Full support for SQL Server 2008
完全支持 SQL Server 2008
回答by wittrup
Recommend the free, GNU-licensed, random custom data generator http://www.generatedata.com/
推荐免费的、GNU 许可的、随机的自定义数据生成器http://www.generatedata.com/
回答by TodorS
Late answer but can be useful to other readers of this thread. Beside other solutions, I can recommend importing data from a .csv file using SSMS or custom SQL import scripts, programs. There is a step-by-step tutorial on how to do this, so you might want to check it out: http://solutioncenter.apexsql.com/how-to-generate-randomized-test-data-from-a-csv-file/
迟到的答案,但可能对本主题的其他读者有用。除了其他解决方案,我还建议使用 SSMS 或自定义 SQL 导入脚本、程序从 .csv 文件导入数据。有一个关于如何执行此操作的分步教程,因此您可能需要查看:http: //solutioncenter.apexsql.com/how-to-generate-randomized-test-data-from-a- csv文件/
Be aware that importing a .csv file using SSMS or custom SQL import scripts, is easier than creating SQL inserts manually, but there are some limitations, as explained in the tutorial:
请注意,使用 SSMS 或自定义 SQL 导入脚本导入 .csv 文件比手动创建 SQL 插入更容易,但存在一些限制,如教程中所述:
If there is a need for thousands of rows to be populated, and the .csv file contains few hundred rows of data it is just not enough. The workaround is reimporting the same .csv file over and over until needed. The drawback to this method is that it will insert large blocks of rows with the same data, without randomizing them.
如果需要填充数千行,而 .csv 文件包含几百行数据,这还不够。解决方法是一遍又一遍地重新导入相同的 .csv 文件,直到需要为止。这种方法的缺点是它会插入具有相同数据的大块行,而不会随机化它们。
The tutorial also explains how to use a 3rd party SQL data generator called ApexSQL Generate. The tool has an integrated function to generate large amounts of randomized data from the imported .csv formatted file. Application features a fully functional free trial so you can download and try it to see if it works for you.
本教程还解释了如何使用名为ApexSQL Generate的 3rd 方 SQL 数据生成器。该工具具有从导入的 .csv 格式文件生成大量随机数据的集成功能。应用程序具有功能齐全的免费试用版,因此您可以下载并试用它,看看它是否适合您。
回答by Mohammad Musavi
You Only need Go 1000
after your INSERT, to fill it 1000 times, just like this:
你只需要Go 1000
在你的 INSERT 之后填充 1000 次,就像这样:
INSERT INTO dbo.Cusomers(Id, FirstName, LastName) VALUES(1, 'Mohamed', 'Mousavi')
GO 1000
It will make a table with 1000 same rows in it.
它将创建一个包含 1000 行相同行的表。
Another solution is that you can populate the beginning rows of your table with some data, then you fill the next rows of table by repeating the beginning rows over and over, it means you fill your table by itself:
另一种解决方案是您可以用一些数据填充表格的开始行,然后通过一遍又一遍地重复开始行来填充表格的下一行,这意味着您自己填充表格:
INSERT INTO dbo.Customers
SELECT * FROM dbo.Customers
GO 10
In the case one or more column are identity (meaning they accept unique values, if it's auto incremental), you just don't place it in your query, for instance if Id in dbo.Customer is identity, the query goes like this:
如果一列或多列是身份(意味着它们接受唯一值,如果它是自动增量的),您只是不要将它放在您的查询中,例如,如果 dbo.Customer 中的 Id 是身份,则查询如下所示:
INSERT INTO dbo.Customers
SELECT FirstName, Last Name FROM dbo.Customers
GO 10
Instead Of:
代替:
INSERT INTO dbo.Customers
SELECT Id, FirstName, Last Name FROM dbo.Customers
GO 10
Else you'll encounter this Error:
否则你会遇到这个错误:
An explicit value for the identity column in table 'dbo.Customers' can only be specified when a column list is used and IDENTITY_INSERT is ON.
An explicit value for the identity column in table 'dbo.Customers' can only be specified when a column list is used and IDENTITY_INSERT is ON.
Note:
This is sort of an arithmetic progression, so it gonna last a little, don't use a big number in front of GO
.
注意:这是一个等差数列,所以它会持续一点,不要在GO
.
If you want to have a table which is filled a little bit more elaborated then you can achieve that the same way this time by executing a simple query and following these steps:
如果您想要一个填充得更详细的表,那么这次您可以通过执行一个简单的查询并按照以下步骤以同样的方式实现:
Choose one of your tables which has a remarkable number of rows, say dbo.Customers
Right click on it and select
Script Table as > Create To > New Query Editor Window
Name your new table to something else like dbo.CustomersTest, Now you can execute the query to have a new table with similar structure with the dbo.Customers.
选择一个行数非常多的表,比如 dbo.Customers
右键单击它并选择
Script Table as > Create To > New Query Editor Window
将新表命名为其他名称,例如 dbo.CustomersTest,现在您可以执行查询以拥有一个与 dbo.Customers 具有相似结构的新表。
Note:Keep in mind that if it has a Identity filed, change it's Identity Specification to No
Since you are supposed to fill the new table by the data of the original one repeatedly.
注意:请记住,如果它有一个 IdentityIdentity Specification to No
字段,请更改它,因为您应该重复使用原始数据填充新表。
- Run the following query, it's going to be run 1000 times, you can change it to more or less but be aware that it might last minuets based on your computer hardware:
- 运行以下查询,它将运行 1000 次,您可以将其更改为更多或更少,但请注意,根据您的计算机硬件,它可能会持续:
INSERT INTO [dbo].[CustomersTest]
SELECT * FROM [dbo].[Customers]
GO 1000
INSERT INTO [dbo].[CustomersTest]
SELECT * FROM [dbo].[Customers]
GO 1000
- After a while you have a table with dummy rows in it!
- 过了一会儿,你就有了一个包含虚拟行的表!
As @SQLMenace mentioned, RedGate Data Generator is a so good tool to fulfill it, it costs $369, you have a 14 days trial chance Although.
正如@SQLMenace 所提到的,RedGate Data Generator 是一个很好的工具来实现它,它的成本为 369 美元,但您有 14 天的试用机会。
The good point is that RedGate identifies foreign keys so you can apply JOIN in your queries.
好处是 RedGate 识别外键,因此您可以在查询中应用 JOIN。
You have a bunch of options which allow you to decide how every column is supposed to be populated, every column is anticipated semantically so that related data are suggested, for instance if you have a column named 'Department' it isn't filled by weird characters, it's filled by expressions like "Technical", "Web", "Customer", etc. Even you can use regular expression to restrict selected characters.
您有很多选项可以让您决定如何填充每一列,从语义上预测每一列,以便建议相关数据,例如,如果您有一个名为“部门”的列,它不会被奇怪的填充字符,它由“技术”、“网络”、“客户”等表达式填充。您甚至可以使用正则表达式来限制所选字符。
I populated my tables with over 10,000,000 records which was an awesome simulation.
我用超过 10,000,000 条记录填充了我的表格,这是一个很棒的模拟。