MySQL Cluster (NDB) vs MySQL Replication (InnoDB) for Rails 3 apps: pros/cons?

Note: this page reproduces a popular Stack Overflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/5300490/

Time: 2020-08-31 19:06:29. Source: igfitidea.

mysql ruby-on-rails

Asked by konung

We are reviewing our current systems, trying to figure out whether we can improve performance and reliability.

Currently we run a bunch of internal Rails apps and our Rails-based website. Some are already on Rails 3, and some are being converted to Rails 3. They all connect to the following MySQL setup:

mysql01 (master server) => mysql02 (slave) => (daily DB backups to a drive that is itself backed up on a daily, weekly, monthly and semi-annual basis).

All writes happen on mysql01, and most short reads go to it as well. Some more resource-consuming reads (such as backups, or monthly/weekly reports that take 3-10 minutes to run and dump data into CSV) go to the mysql02 server. We get about 3-5K visits per day to our site and have about 20-30 internal users who use various apps daily for inventory, order processing, etc. So these servers are not under particularly heavy load other than those reports, which run off the slave anyway.
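
The read/write split described above can be sketched as a small routing rule. This is only an illustration: the host names come from the question, but the `Query` type, the `heavy` flag, and the "anything that is not a SELECT is a write" test are assumptions made for the example, not how the asker's apps actually decide.

```python
from dataclasses import dataclass

# Host names are taken from the question; everything else is hypothetical.
MASTER = "mysql01"
SLAVE = "mysql02"


@dataclass(frozen=True)
class Query:
    sql: str
    heavy: bool = False  # assumed flag for long-running reports and dumps


def is_write(sql: str) -> bool:
    """Treat anything that is not a SELECT as a write (a deliberate simplification)."""
    return not sql.lstrip().upper().startswith("SELECT")


def choose_host(query: Query) -> str:
    """Send writes and short reads to the master, heavy reads to the slave."""
    if is_write(query.sql):
        return MASTER
    return SLAVE if query.heavy else MASTER


if __name__ == "__main__":
    print(choose_host(Query("INSERT INTO orders (sku) VALUES ('A1')")))
    print(choose_host(Query("SELECT * FROM orders WHERE id = 1")))
    print(choose_host(Query("SELECT * FROM orders", heavy=True)))
```

With this rule, only queries explicitly marked as heavy leave the master, which matches the setup where reports and backups are the only work sent to mysql02.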

All servers run in a virtualized Xen pool on Debian Lenny VMs.

So we are doing a review of the systems, and somebody suggested switching to a MySQL Cluster (NDB) setup. I know of it in theory, but have never actually run it. Does anyone who has experience with it know of any pros and cons versus our current setup, and of any particular caveats when it involves Ruby / Rails applications?

Answered by Mat Keep

There is a good comparison of InnoDB and MySQL Cluster (ndb) recently posted to the docs...worth taking a look: http://dev.mysql.com/doc/refman/5.1/en/mysql-cluster-compared.html

The Cluster architecture consists of a pool of MySQL Servers that are accessed by the application(s). These MySQL Servers don't actually store the Cluster data; instead, the data is partitioned over the pool of data nodes below them. Every MySQL Server has access to the data in all of the data nodes. If one MySQL Server changes a piece of data, the change is instantly visible to all of the other MySQL Servers.

Obviously, this architecture makes it extremely easy to scale out the database. Unlike sharding, the application doesn't need to know where the data is held - it can just load balance across all available MySQL Servers. Unlike scaling out with MySQL replication, Cluster allows you to scale writes just as well as reads. New data nodes or MySQL Servers can be added to an existing Cluster with no loss of service to the application.
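
As a rough illustration of load balancing across MySQL Servers, here is a minimal round-robin sketch. The `SqlNodePool` class and the host names `sql1` to `sql3` are hypothetical; a real deployment would more likely rely on a connector or proxy that offers load balancing rather than hand-written code.

```python
class SqlNodePool:
    """Round-robin over MySQL Servers (SQL nodes); a toy model for illustration."""

    def __init__(self, hosts):
        self._hosts = list(hosts)
        if not self._hosts:
            raise ValueError("at least one SQL node is required")
        self._next = 0

    def add_host(self, host):
        # New SQL nodes can join the rotation while the pool is in use.
        self._hosts.append(host)

    def next_host(self):
        # Any SQL node can serve any query, so no shard lookup is needed.
        host = self._hosts[self._next % len(self._hosts)]
        self._next += 1
        return host


if __name__ == "__main__":
    pool = SqlNodePool(["sql1", "sql2"])  # hypothetical host names
    print([pool.next_host() for _ in range(3)])
    pool.add_host("sql3")
    print([pool.next_host() for _ in range(3)])
```

The point of the sketch is that the caller never maps a key to a server: every host in the pool is an equally valid target for both reads and writes.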

MySQL Cluster's shared-nothing architecture means that it can deliver extremely high availability (99.999%+). Every time you change data, it is synchronously replicated to a second data node; if one data node fails, the application's read and write requests are automatically handled by the backup data node.
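
The idea of synchronous replication with automatic failover can be shown with a toy model. This is not the NDB protocol: the `DataNode` and `NodeGroup` classes are invented for the example, and node recovery (resynchronising a restarted node) is deliberately left out.

```python
class DataNode:
    """One data node holding a copy of a partition (toy model)."""

    def __init__(self, name):
        self.name = name
        self.alive = True
        self.rows = {}


class NodeGroup:
    """Two replicas of the same partition; hypothetical, for illustration only."""

    def __init__(self, primary, backup):
        self.nodes = [primary, backup]

    def _live(self):
        live = [node for node in self.nodes if node.alive]
        if not live:
            raise RuntimeError("no live data node in this group")
        return live

    def write(self, key, value):
        # Synchronous: the call returns only after every live replica has the change.
        for node in self._live():
            node.rows[key] = value

    def read(self, key):
        # Reads are served by the first live replica, so a failure is transparent.
        return self._live()[0].rows[key]


if __name__ == "__main__":
    a, b = DataNode("ndb1"), DataNode("ndb2")  # hypothetical node names
    group = NodeGroup(a, b)
    group.write("order:1", "pending")
    a.alive = False  # simulate a data node failure
    print(group.read("order:1"))
```

Because the write reached both replicas before returning, the read after the simulated failure still finds the row on the surviving node.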

Due to the distributed nature of MySQL Cluster, some operations can be slower (for example, JOINs that have thousands of interim results - though there is a prototype solution available which addresses this), but others can be very fast and can scale extremely well (e.g. primary key reads and writes). You have the option of storing tables (or even columns) in memory or on disk, and by choosing the memory option (with changes checkpointed to disk in the background) transactions can be very quick.

MySQL Cluster can be more complex to set up than a single MySQL server but it can prevent you having to implement sharding or read/write splitting in your application. Swings and roundabouts.

To get the best performance and scalability out of MySQL Cluster you may need to tweak your application (see the Cluster performance tuning white paper: http://www.mysql.com/why-mysql/white-papers/mysql_wp_cluster_perfomance.php). If you own the application this isn't normally a big deal, but if you're using someone else's application that you can't modify then it could be a problem.

A final note is that it doesn't need to be all or nothing - you can choose to store some of your tables in Cluster and some using other storage engines; this is a per-table option. You can also replicate between Cluster and other storage engines (for example, use Cluster for your run-time database and then replicate to InnoDB to generate complex reports).
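
The per-table choice comes down to the `ENGINE` table option in the DDL. The sketch below only builds the statements as strings; the table names (`orders`, `report_archive`), the columns, and the `create_table_sql` helper are hypothetical, while `ENGINE=NDBCLUSTER` and `ENGINE=InnoDB` are the standard MySQL table options.

```python
# Map a friendly label to the MySQL storage engine name used in DDL.
ENGINES = {"cluster": "NDBCLUSTER", "innodb": "InnoDB"}


def create_table_sql(table, columns, engine):
    """Build a CREATE TABLE statement for the given engine label.

    `columns` is a list of (name, sql_type) pairs; all names here are illustrative.
    """
    cols = ", ".join(f"{name} {sql_type}" for name, sql_type in columns)
    return f"CREATE TABLE {table} ({cols}) ENGINE={ENGINES[engine]}"


if __name__ == "__main__":
    # Run-time table kept in Cluster (assumed example).
    print(create_table_sql(
        "orders", [("id", "INT PRIMARY KEY"), ("status", "VARCHAR(20)")], "cluster"))
    # Reporting table kept in InnoDB (assumed example).
    print(create_table_sql(
        "report_archive", [("id", "INT PRIMARY KEY"), ("body", "TEXT")], "innodb"))
```

Both statements can be issued through the same MySQL Server, which is what makes mixing engines a per-table decision rather than a per-server one.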
