Original source: http://stackoverflow.com/questions/1276/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
How big can a MySQL database get before performance starts to degrade
Asked by Grant
At what point does a MySQL database start to lose performance?
- Does physical database size matter?
- Does the number of records matter?
- Is any performance degradation linear or exponential?
I have what I believe to be a large database, with roughly 15M records which take up almost 2GB. Based on these numbers, is there any incentive for me to clean the data out, or am I safe to allow it to continue scaling for a few more years?
Accepted answer by Nick Berardi
The physical database size doesn't matter. The number of records doesn't matter.
In my experience the biggest problem that you are going to run into is not size, but the number of queries you can handle at a time. Most likely you are going to have to move to a master/slave configuration, so that the read queries can run against the slaves and the write queries run against the master. However, if you are not ready for this yet, you can always tweak your indexes for the queries you are running to speed up the response times. There is also a lot of tweaking you can do to the network stack and kernel in Linux that will help.
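The read/write split described above can be sketched in a few lines. This is only an illustration of the routing idea, not a production setup: the `ReadWriteRouter` class and the server names are hypothetical, and a real deployment would also have to deal with replication lag and with reads that must see a write that was just made.

```python
import itertools


class ReadWriteRouter:
    """Send SELECT statements to replicas and everything else to the primary.

    Hypothetical helper for illustration; server handles are plain strings here.
    """

    def __init__(self, primary, replicas):
        self.primary = primary
        # Round-robin over the replicas; fall back to the primary if there are none.
        self._replicas = itertools.cycle(replicas or [primary])

    def pick(self, sql):
        words = sql.split(None, 1)
        first_word = words[0].upper() if words else ""
        if first_word == "SELECT":
            return next(self._replicas)
        return self.primary


router = ReadWriteRouter("primary-db", ["replica-1", "replica-2"])
print(router.pick("SELECT * FROM orders WHERE id = 7"))
print(router.pick("UPDATE orders SET paid = 1 WHERE id = 7"))
```

Routing on the first keyword is deliberately naive. It keeps the sketch short, but it would, for example, send a locking read to a replica, so treat it as a starting point only.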
I have had mine get up to 10GB with only a moderate number of connections, and it handled the requests just fine.
I would focus first on your indexes, then have a server admin look at your OS, and if all that doesn't help it might be time to implement a master/slave configuration.
Answer by dlinsin
In general this is a very subtle issue and not at all trivial. I encourage you to read mysqlperformanceblog.com and High Performance MySQL. I really think there is no general answer for this.
I'm working on a project which has a MySQL database with almost 1TB of data. The most important scalability factor is RAM. If the indexes of your tables fit into memory and your queries are highly optimized, you can serve a reasonable number of requests with an average machine.
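To get a feel for whether indexes might fit in memory, a back-of-envelope estimate can help. The sketch below is an assumption-heavy approximation, not how MySQL accounts for memory: the function names, the 8-byte pointer, the 0.7 fill factor and the 25% RAM reserve are all illustrative guesses. Only the 15 million row count comes from the question. Measuring the real index size on your own server is always more reliable.

```python
def estimated_index_bytes(rows, key_bytes, pointer_bytes=8, fill_factor=0.7):
    """Rough index size: one (key, row pointer) entry per row, in partly full pages."""
    return int(rows * (key_bytes + pointer_bytes) / fill_factor)


def fits_in_ram(index_sizes, ram_bytes, reserve_fraction=0.25):
    """True if all indexes fit after reserving part of RAM for the OS and other buffers."""
    return sum(index_sizes) <= ram_bytes * (1 - reserve_fraction)


# 15 million rows (from the question) with a hypothetical 8-byte integer key.
size = estimated_index_bytes(15_000_000, key_bytes=8)
print(size, fits_in_ram([size], ram_bytes=4 * 1024**3))
```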
The number of records does matter, depending on what your tables look like. It makes a difference whether you have a lot of varchar fields or only a couple of ints or longs.
The physical size of the database matters as well: think of backups, for instance. Depending on your engine, your physical db files only grow and never shrink, for instance with InnoDB. So deleting a lot of rows doesn't help to shrink your physical files.
There's a lot to this issue and, as in a lot of cases, the devil is in the details.
Answer by 0x4a6f4672
The database size does matter. If you have more than one table with more than a million records, then performance does indeed start to degrade. The number of records does of course affect performance: MySQL can be slow with large tables. If you hit one million records, you will get performance problems if the indices are not set right (for example, no indices on the fields used in WHERE clauses or in the ON conditions of joins). If you hit 10 million records, you will start to get performance problems even if you have all your indices right. Hardware upgrades - adding more memory and more processor power, especially memory - often help to reduce the most severe problems by increasing performance again, at least to a certain degree. For example, 37signals went from 32 GB of RAM to 128 GB of RAM for the Basecamp database server.
Answer by BlaM
I would focus first on your indexes, then have a server admin look at your OS, and if all that doesn't help it might be time for a master/slave configuration.
That's true. Another thing that usually works is to just reduce the quantity of data that's repeatedly worked with. If you have "old data" and "new data" and 99% of your queries work with new data, just move all the old data to another table - and don't look at it ;)
-> Have a look at partitioning.
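Moving old rows out of the hot table can be sketched as follows. This is a minimal illustration that uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs anywhere. The `events` and `events_archive` tables, the `created_at` column and the `archive_old_rows` helper are all hypothetical names, and on a large MySQL table you would probably want to move rows in batches rather than in one transaction.

```python
import sqlite3


def archive_old_rows(conn, cutoff):
    """Move rows older than `cutoff` from events to events_archive in one transaction."""
    with conn:  # commits on success, rolls back if either statement fails
        conn.execute(
            "INSERT INTO events_archive SELECT * FROM events WHERE created_at < ?",
            (cutoff,),
        )
        moved = conn.execute(
            "DELETE FROM events WHERE created_at < ?", (cutoff,)
        ).rowcount
    return moved


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, created_at TEXT, payload TEXT)")
conn.execute("CREATE TABLE events_archive (id INTEGER PRIMARY KEY, created_at TEXT, payload TEXT)")
conn.executemany(
    "INSERT INTO events (created_at, payload) VALUES (?, ?)",
    [("2008-01-15", "old"), ("2008-02-20", "old"), ("2009-06-01", "new")],
)
conn.commit()

print(archive_old_rows(conn, "2009-01-01"))  # number of rows moved to the archive
```

Doing the copy and the delete in a single transaction means a failure halfway through leaves both tables unchanged. Partitioning, as suggested above, achieves a similar separation without application-level moves.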
Answer by ian
2GB and about 15M records is a very small database - I've run much bigger ones on a Pentium III(!) and everything has still run pretty fast. If yours is slow, it is a database/application design problem, not a MySQL one.
Answer by deadprogrammer
It's kind of pointless to talk about "database performance"; "query performance" is a better term here. And the answer is: it depends on the query, the data that it operates on, the indexes, the hardware, etc. You can get an idea of how many rows are going to be scanned and which indexes are going to be used by running the query with EXPLAIN.
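Here is a small runnable illustration of checking a query plan before and after adding an index. It uses Python's built-in sqlite3 module and SQLite's `EXPLAIN QUERY PLAN` as a stand-in, so the output format differs from what MySQL's `EXPLAIN` prints, although the workflow is the same. The `users` table, the `idx_users_email` index and the `query_plan` helper are hypothetical names.

```python
import sqlite3


def query_plan(conn, sql):
    """Return the human-readable plan details SQLite reports for `sql`."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return [row[-1] for row in rows]  # the last column holds the detail text


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
sql = "SELECT name FROM users WHERE email = 'a@example.com'"

before = query_plan(conn, sql)  # no index on email yet: expect a table scan
conn.execute("CREATE INDEX idx_users_email ON users (email)")
after = query_plan(conn, sql)   # now the plan should mention idx_users_email

print(before)
print(after)
```

Comparing the two plans makes it obvious whether a WHERE clause is served by an index or by a scan of the whole table, which matters far more than the raw size of the database.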
2GB does not really count as a "large" database - it's more of a medium size.
Answer by Rich Remer
I'm currently managing a MySQL database on Amazon's cloud infrastructure that has grown to 160 GB. Query performance is fine. What has become a nightmare is backups, restores, adding slaves, or anything else that deals with the whole dataset, or even DDL on large tables. Getting a clean import of a dump file has become problematic. In order to make the process stable enough to automate, various choices needed to be made to prioritize stability over performance. If we ever had to recover from a disaster using a SQL backup, we'd be down for days.
Horizontally scaling SQL is also pretty painful, and in most cases leads to using it in ways you probably did not intend when you chose to put your data in SQL in the first place. Shards, read slaves, multi-master, and so on are all really shitty solutions that add complexity to everything you ever do with the DB, and not one of them solves the problem; they only mitigate it in some ways. I would strongly suggest looking at moving some of your data out of MySQL (or really any SQL) when you start approaching a dataset of a size where these types of things become an issue.
Answer by saint_groceon
Also watch out for complex joins. Transaction complexity can be a big factor in addition to transaction volume.
Refactoring heavy queries sometimes offers a big performance boost.
Answer by jj33
I was once called upon to look at a MySQL instance that had "stopped working". I discovered that the DB files were residing on a Network Appliance filer mounted with NFS2 and with a maximum file size of 2GB. And sure enough, the table that had stopped accepting transactions was exactly 2GB on disk. But with regard to the performance curve, I'm told that it was working like a champ right up until it didn't work at all! This experience always serves as a nice reminder for me that there are always dimensions above and below the one you naturally suspect.
Answer by alditis
Another point to consider is the purpose of the system and how its data is used day to day.
For example, in a system that monitors cars by GPS, queries against the cars' positions from previous months are not relevant.
Therefore that data can be moved to separate historical tables, where it stays available for occasional lookups, and this reduces the execution time of the day-to-day queries.