How to increase the performance of a database?
Original question: http://stackoverflow.com/questions/2007575/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it under the same license, but you must cite the original URL and author information and attribute it to the original authors (not me): Stack Overflow
Asked by Vaibhav Jain
I have designed databases several times at my company. To improve database performance, I look only at normalisation and indexing.
If you were asked to improve the performance of a database with approximately 250 tables, some of them holding millions of records, what different things would you look for?
Thanks in advance.
Accepted answer by ewernli
Optimize the logical design
The logical level is about the structure of the queries and the tables themselves. Try to optimize this first. The goal is to access as little data as possible at the logical level.
- Have the most efficient SQL queries
- Design a logical schema that supports the application's needs (e.g. type of the columns, etc.)
- Design trade-offs that support some use cases better than others
- Relational constraints
- Normalization
Optimize the physical design
The physical level deals with non-logical considerations, such as the types of indexes, the parameters of the tables, etc. The goal is to optimize the I/O, which is almost always the bottleneck. Tune each table to fit its needs: a small table can be kept permanently in the DBMS cache, a table with a low write rate can use different settings than a table with a high update rate so that it takes less disk space, and so on. Depending on the queries, different indexes can be used. You can also denormalize data transparently with materialized views.
- Table parameters (allocation size, etc.)
- Indexes (combined, types, etc.)
- System-wide parameters (cache size, etc.)
- Partitioning
- Denormalization
Try to improve the logical design first, then the physical design. (The boundary between the two is vague, however, so my categorization is open to debate.)
Optimize the maintenance
A database must be operated correctly to stay as efficient as possible. This includes a few maintenance tasks that can have an impact on performance, e.g.
- Keep statistics up to date
- Re-sequence critical tables periodically
- Disk maintenance
- All the system stuff to have a server that rocks
Answer by Remus Rusanu
Compression. For the vast majority of loads I've tried, using compression was a tremendous free ride. Reduced data size means reduced I/O, which means better throughput. In SQL Server 2005 the compression options are limited (vardecimal). But I would seriously consider upgrading to 2008 for page compression alone, or to 2008 R2 for Unicode compression if you use nvarchar frequently.
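As a rough sketch of what enabling page compression might look like on SQL Server 2008 or later: the table name dbo.Orders and the index name IX_Orders_CustomerId are hypothetical, and in the 2008 releases data compression was limited to the higher editions (Enterprise/Developer). One would typically estimate the savings first and then rebuild:

```sql
-- Hypothetical objects: dbo.Orders, IX_Orders_CustomerId

-- Estimate how much space page compression would save before committing to it.
EXEC sp_estimate_data_compression_savings
    @schema_name      = 'dbo',
    @object_name      = 'Orders',
    @index_id         = NULL,    -- NULL = all indexes
    @partition_number = NULL,    -- NULL = all partitions
    @data_compression = 'PAGE';

-- Rebuild the table itself (heap or clustered index) with page compression.
ALTER TABLE dbo.Orders REBUILD WITH (DATA_COMPRESSION = PAGE);

-- Nonclustered indexes are compressed separately, one rebuild per index.
ALTER INDEX IX_Orders_CustomerId ON dbo.Orders
    REBUILD WITH (DATA_COMPRESSION = PAGE);
```

Compression trades CPU for I/O, so it is worth measuring on a representative workload before rolling it out widely.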
Data Retention. Establish retention policies and delete old data aggressively. Less data means less I/O, which means better throughput. Often this is seen as an operational issue rather than a design one, but I like to think of it as an application design issue.
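A minimal sketch of an aggressive purge, assuming a hypothetical table dbo.AuditLog with a CreatedAt datetime column and an example policy of keeping three months of data. Deleting in small batches keeps each transaction short, so the purge itself does not become a locking or log-growth problem:

```sql
-- Hypothetical table and column: dbo.AuditLog(CreatedAt datetime)
DECLARE @cutoff datetime;
SET @cutoff = DATEADD(month, -3, GETDATE());   -- example policy: keep 3 months

DECLARE @rows int;
SET @rows = 1;

WHILE @rows > 0
BEGIN
    -- Each batch is its own short transaction.
    DELETE TOP (5000) FROM dbo.AuditLog
    WHERE CreatedAt < @cutoff;

    -- Stop once a batch deletes nothing.
    SET @rows = @@ROWCOUNT;
END;
```

The batch size of 5000 is an arbitrary starting point, not a recommendation.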
Of course, I assume you already monitor each and every query to ensure none does stupid end-to-end table scans.
Many more performance boosters are mostly operational or deployment, not design: maintenance (defragmentation, index rebuild etc), I/O and storage design etc.
And last but not least, understand the hidden cost of various turn-key solutions such as Replication or Database Mirroring.
Answer by Aaronaught
That's a very vague question.
You say you look for indexing, but you can't look at indexing in isolation. You have to look at the queries that are being run, the execution plans, the indexes that are being used and how they are being used. The Profiler tool can help a great deal in determining which queries are inefficient.
Aside from that - make sure a maintenance plan is set up. You should be updating statistics and defragmenting/rebuilding indexes at least once a week in a heavy transactional database.
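The weekly maintenance described above could be sketched roughly as follows on SQL Server 2005 or later. The table name dbo.Orders is hypothetical, and whether to reorganize or rebuild depends on how fragmented each index actually is:

```sql
-- Hypothetical table name: dbo.Orders

-- Check fragmentation for every index in the current database.
SELECT OBJECT_NAME(ps.object_id) AS table_name,
       i.name                    AS index_name,
       ps.avg_fragmentation_in_percent,
       ps.page_count
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') AS ps
JOIN sys.indexes AS i
  ON i.object_id = ps.object_id
 AND i.index_id  = ps.index_id
ORDER BY ps.avg_fragmentation_in_percent DESC;

-- Light fragmentation: reorganize (always an online operation).
ALTER INDEX ALL ON dbo.Orders REORGANIZE;

-- Heavy fragmentation: rebuild instead (this also refreshes the
-- statistics that belong to the rebuilt indexes).
ALTER INDEX ALL ON dbo.Orders REBUILD;

-- Refresh the table's statistics, including column statistics
-- that an index rebuild does not touch.
UPDATE STATISTICS dbo.Orders;
```

In practice only one of REORGANIZE or REBUILD would be run per index; both are shown here for comparison.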
If you have the infrastructure, look at your file and filegroup settings. You should try to put tables and/or indexes that are large and frequently used on different physical drives, if possible. If you have any very large tables, you might think of partitioning them.
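For the partitioning suggestion, a minimal sketch of the SQL Server syntax might look like this. All object names (pfOrderYear, psOrderYear, dbo.OrdersPartitioned) and the yearly boundaries are assumptions for illustration, and table partitioning was an Enterprise-edition feature in the 2005/2008 releases:

```sql
-- Hypothetical names throughout.
-- One partition per year, with boundary values belonging to the partition on their right.
CREATE PARTITION FUNCTION pfOrderYear (datetime)
AS RANGE RIGHT FOR VALUES ('20080101', '20090101', '20100101');

-- For simplicity every partition maps to PRIMARY; to spread I/O across
-- physical drives, list separate filegroups here instead.
CREATE PARTITION SCHEME psOrderYear
AS PARTITION pfOrderYear ALL TO ([PRIMARY]);

-- The partitioning column must be part of the clustered primary key.
CREATE TABLE dbo.OrdersPartitioned
(
    OrderId   int      NOT NULL,
    OrderDate datetime NOT NULL,
    Amount    money    NOT NULL,
    CONSTRAINT PK_OrdersPartitioned
        PRIMARY KEY CLUSTERED (OrderDate, OrderId)
) ON psOrderYear (OrderDate);
```

Queries that filter on OrderDate can then be limited to the relevant partitions, which is where most of the benefit comes from.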
If you're still having performance problems, denormalization can sometimes help, but it all depends on the situation.
I'm going to stop there - don't want this answer to become the world's most random list of SQL performance tips. I recommend you be more specific about where you think the performance issues are, and tell us a bit more about the database (size, current indexing strategy, transaction frequency, any large reports you need to generate, etc.)
Answer by Eyal Z.
My role at MySpace was "Performance Enhancement DBA/Developer". I would say that normalization and indexes are a requirement in high-performance databases, but you must really analyze your table structures and indexes to truly unlock the power of database design.
Here are a few suggestions I would have for you:
- Get to know the DB engine. A thorough knowledge of the underlying I/O structure goes a very long way in designing a proper index or table. Using PerfMon and Profiler, alongside your knowledge of what read/write I/Os are, you can put some very specific numbers behind your theory of what a well-formed table/index solution is.
- Understand the difference between clustered and non-clustered indexes and when to use which.
- Use the sys.dm_os_waiting_tasks and sys.dm_os_wait_stats DMVs. They will tell you where you should put your effort into reducing wait time.
- Use SET STATISTICS IO ON and SET STATISTICS TIME ON (these are SET options, not DBCC commands), and evaluate your execution plans to see whether a change to a query reduces or increases the number of page reads or the duration.
- DBCC SHOWCONTIG will tell you if your tables are heavily fragmented. This is often neglected by developers and junior DBAs from a performance point of view. However, it can have a very big effect on the number of page reads you have. If a table has 20% page density, that means you're reading about 5 times the data you would read if the table and its indexes were defragmented.
- Evaluate dirty reads (NOLOCK, READ UNCOMMITTED). If you can do without millisecond precision on reads, save the locks!
- Consider taking out unnecessary foreign keys. They're useful in dev environments, not on high-performance transactional systems.
- Partitions in large tables make a big difference, but only if properly designed.
- Application changes: if you can schedule batch updates for asynchronous transactions, put them into an index-free heap and process it on a schedule so that you don't constantly update the tables that you query heavily.
- Always, always, always use a variable of the same data type as the target column when querying. For example, the following statement uses a bigint variable for a smallint column:
declare @i bigint
set @i = 0
select * from MyTable where Col01SmallInt >= @i
In the process of evaluating index / table pages, the query engine may opt to convert your smallint column data to the bigint data type. Consider instead changing your variable type, or at least converting it to smallint in your search condition.
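The two fixes suggested here could be sketched as follows, reusing the MyTable and Col01SmallInt names from the statement in question. Turning on SET STATISTICS IO makes it possible to compare logical reads between the variants; whether the reads actually differ depends on the indexes and on the optimizer, so treat this as something to measure rather than a guaranteed win:

```sql
SET STATISTICS IO ON;   -- report logical reads for each statement

-- Option 1: declare the variable with the column's own type.
DECLARE @s smallint;
SET @s = 0;
SELECT * FROM MyTable WHERE Col01SmallInt >= @s;

-- Option 2: keep the bigint variable but convert it in the predicate,
-- so the column itself is compared as smallint.
-- (The CAST raises an overflow error if @i is outside the smallint range.)
DECLARE @i bigint;
SET @i = 0;
SELECT * FROM MyTable WHERE Col01SmallInt >= CAST(@i AS smallint);

SET STATISTICS IO OFF;
```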
- SQL Server 2005/2008 gives you "Reports" in the management application (Management Studio); take a look at the reports on how your indexes are performing. Are they being scanned or seeked? When was your last table scan? If it was recent, your indexes are not covering all necessary queries. If you have an index that is hardly ever used (seeked or scanned) but is constantly being updated, consider dropping it. That may save you a lot of unnecessary row locks and key locks.
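The same information the index reports present can be queried directly. A sketch using the sys.dm_db_index_usage_stats DMV (SQL Server 2005 and later); indexes with many user_updates but few seeks, scans, or lookups are the candidates for dropping:

```sql
-- Usage counters for every index on user tables in the current database.
-- Indexes that have not been touched since startup show NULL counters.
SELECT OBJECT_NAME(i.object_id) AS table_name,
       i.name                   AS index_name,
       us.user_seeks,
       us.user_scans,
       us.user_lookups,
       us.user_updates,
       us.last_user_scan
FROM sys.indexes AS i
LEFT JOIN sys.dm_db_index_usage_stats AS us
       ON us.object_id   = i.object_id
      AND us.index_id    = i.index_id
      AND us.database_id = DB_ID()
WHERE OBJECTPROPERTY(i.object_id, 'IsUserTable') = 1
ORDER BY us.user_updates DESC;
```

These counters are reset when the SQL Server instance restarts, so they are only meaningful after the server has been running a representative workload for a while.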
That's all I can think of off the top of my head. If you run into a more specific problem, I would have a more specific answer for you.
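As a starting point for the wait-statistics suggestion in this answer, a minimal query against sys.dm_os_wait_stats might look like this. The TOP (10) cutoff is arbitrary, and the list will usually include idle or background waits that have to be discounted by hand:

```sql
-- The wait types that have accumulated the most wait time since the
-- counters were last reset (normally at instance startup).
SELECT TOP (10)
       wait_type,
       waiting_tasks_count,
       wait_time_ms,
       signal_wait_time_ms   -- portion spent waiting for CPU after being signaled
FROM sys.dm_os_wait_stats
WHERE wait_time_ms > 0
ORDER BY wait_time_ms DESC;
```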
Answer by David M
To your toolkit of normalisation and indexing, with extremely large tables you may also want to consider the pros and cons of partitioning the tables. But you've got the key ones there already.
Answer by Mike Mytkowski
There are many things you could do, a lot of them already suggested above. Some that I would look at (in this order):
- Errors/logs: many DB engines have reporting tools that point out problem areas in a database. Start here to see if there's anything you can focus on right away.
- Data retention: check the business specification for how long data should be kept, and make sure any older data is moved off to a data warehouse to keep table size small. (Why keep 5 years of data if you only need the last 3 months?)
- Look for table scans, and index the data if it will help (you have to gauge this one against table writes). Your server logs can probably help you with finding table scans.
- Atomic units of work: are some writes keeping locks on different tables before a commit point is reached? Can those units of work be simplified, or commit points moved, to speed up performance? This is where you will need a developer to look at the code.
- Look for long-running SQL statements. Can they be made more efficient? Sometimes poorly structured queries can really bog an application down. You may need to suggest a coding change to improve performance.
- DBA realm: look at how tables are allocated: page size, multiple segments, etc. This is where diagnostics tools from the vendor come in handy, as they can often suggest how you can structure a table based on usage history. An experienced DBA would be useful here.
- Look for hardware/network bottlenecks. This is where you would need a hardware guy. :)
These are really high level. I would also take a look at what the vendor of your DB engine suggests as performance improvements.
Also, I would gauge a list like this against what my boss is willing to pay for and how much time I have. ;)
Hope this helps.
Answer by BlueRaja - Danny Pflughoeft
If a query is extremely mission-critical, you may want to consider denormalizing to reduce the number of table lookups per query. Aside from that, if you need more performance than indexing and denormalizing can provide, you might want to look at the program side: caching, optimizing queries/stored procedures, etc.
Answer by Giorgi
In order to increase performance you will need to monitor your database first. You can capture a trace and then load it in SQL Server Profiler to find out which queries are the slowest. After that you can concentrate on them.
You can also use dynamic management views and functions to find out which indexes are missing. You will also be able to retrieve statistics about existing indexes, such as index usage.
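A sketch of the missing-index lookup using the DMVs available since SQL Server 2005. The ordering expression is just one plausible way to rank the suggestions, and the output should be read as hints to review, not as indexes to create blindly:

```sql
-- Index suggestions the optimizer has recorded since the last restart,
-- ranked by how often they would have been used and the estimated benefit.
SELECT TOP (20)
       d.statement          AS table_name,
       d.equality_columns,
       d.inequality_columns,
       d.included_columns,
       s.user_seeks,
       s.avg_user_impact    -- estimated % cost reduction for affected queries
FROM sys.dm_db_missing_index_details AS d
JOIN sys.dm_db_missing_index_groups AS g
  ON g.index_handle = d.index_handle
JOIN sys.dm_db_missing_index_group_stats AS s
  ON s.group_handle = g.index_group_handle
ORDER BY s.user_seeks * s.avg_user_impact DESC;
```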
Answer by Andrey Adamovich
Optimizing the queries that are used to access that database is most important. Just by adding indexes you don't guarantee that queries will use them.
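One common way a query ends up ignoring an index is by wrapping the indexed column in a function. A small illustration in SQL Server syntax, assuming a hypothetical dbo.Orders table with an index on its OrderDate column:

```sql
-- Hypothetical table: dbo.Orders, with an index on OrderDate.

-- Applying a function to the indexed column usually prevents an index
-- seek, because every row's OrderDate has to be evaluated.
SELECT OrderId
FROM dbo.Orders
WHERE YEAR(OrderDate) = 2009;

-- An equivalent range predicate on the bare column lets the optimizer
-- seek directly to the matching range of the index.
SELECT OrderId
FROM dbo.Orders
WHERE OrderDate >= '20090101'
  AND OrderDate <  '20100101';
```

Comparing the execution plans of the two statements shows whether the index is actually being used.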
Answer by Chip Uni
We haven't written about one performance bit:
Hardware.
Databases are intensely I/O driven. Moving to a faster hard drive should increase the speed of database queries. Splitting the database among many fast hard drives might improve it even more.