MySQL: How does Amazon RDS backup/snapshot actually work?

Note: this page reproduces a popular Stack Overflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/5249842/

Time: 2020-08-31 19:02:43  Source: igfitidea

How does Amazon RDS backup/snapshot actually work?

Tags: mysql, amazon-web-services, latency, amazon-rds

Asked by esilver

I am an Amazon RDS customer and am experiencing daily Amazon RDS write latency spikes that correspond roughly to the backup window. I also see spikes at the end of a snapshot (case in point: a snapshot takes approximately 1 hour to run, and write latency spikes in the final 5 minutes). I am running a multi-AZ m1.large deployment.
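
One way to check the "corresponding roughly to the backup window" observation is to measure what share of latency spikes fall inside the window. The following is a minimal sketch, assuming you have exported (timestamp, write latency) samples from your monitoring; the function names, the threshold, and the sample format are all hypothetical and not part of any RDS API.

```python
from datetime import datetime, time


def in_window(ts, start, end):
    """Return True if ts's time of day is in [start, end).

    Handles windows that wrap past midnight (for example 23:30 to 00:30).
    """
    t = ts.time()
    if start <= end:
        return start <= t < end
    return t >= start or t < end


def spike_share_in_window(samples, start, end, threshold):
    """Fraction of spikes (latency >= threshold) that fall inside the window.

    samples: iterable of (datetime, latency) pairs; latency and threshold
    must use the same unit. Returns 0.0 when there are no spikes at all.
    """
    spikes = [ts for ts, latency in samples if latency >= threshold]
    if not spikes:
        return 0.0
    inside = sum(1 for ts in spikes if in_window(ts, start, end))
    return inside / len(spikes)
```

A share close to 1.0 would support the suspicion that the spikes are tied to the backup window; a low share would suggest looking for another cause.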

Is there anyone on Stack who can explain how Amazon RDS backup is actually working? I've read the Amazon RDS docs, and as far as I can tell, Amazon RDS is not behaving according to spec. Specifically, I thought these backup/snapshot operations should be hitting my replica and therefore should not cause any downtime or performance hit.

I can distill my problem into six questions:

  • What is technically happening during a snapshot and a backup, and how are they different? (If you answer this question, please tell me whether you can empirically confirm your answer or are simply quoting documentation to me.)
  • Is a spike in write latency to be expected during the backup window on a multi-AZ deployment?
  • Is a spike in write latency to be expected at the end of a snapshot on a multi-AZ deployment?
  • Would my write latency spike be even higher if I were not multi-AZ?
  • Architecturally, would I be able to avoid these write latency spikes if I rolled my own database running on two m1.large EC2 instances?
  • Are there any configurations I can use that would avoid these write latency spikes while still hosting my DB with RDS, or am I effectively at the mercy of Amazon?

Bonus question: where and how do you host your MySQL database?

I can say that I have been generally happy with RDS except for these daily write latency issues. I love the built-in database monitoring, and it was fairly simple to set up and get going.

Thanks!

[Image: Amazon RDS write latency]

Accepted answer by Joshua

We also run several RDS instances, in addition to MySQL on some machines that we manage ourselves. I can't comment on the specifics, as I'm not an Amazon engineer, but here are several things I've learned that might explain what you're seeing:

  • Although Amazon does not share the backend details 100%, we strongly suspect that they are using their EBS system to back RDS databases.

  • This article helps explain EBS limitations and snapshot functionality: http://blog.rightscale.com/2008/08/20/amazon-ebs-explained/ Again, while it's not explicit, it would make sense for Amazon to be using this infrastructure to provide RDS services.

  • Typically, a MySQL backup, in contrast to a snapshot, involves using a tool like mysqldump to create a file of SQL statements that will then reproduce the database. The database does not need to be frozen to do this. With an EBS backend, the best practice is to freeze the database (pause all transactions) while you are snapshotting to avoid data corruption.

  • Regarding the spikes you're seeing at the end of the backup window: if Amazon pauses replication while it snapshots your replica, the replica would then need to "catch up" on transactions once the snapshot was complete. This would cause a latency spike.

  • Replication across a multi-AZ deployment is inherently slower than within a single-AZ deployment. That is the price you pay for better redundancy.
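
The "freeze while snapshotting" practice mentioned above can be sketched for a self-managed MySQL server on an EBS volume. This is a minimal illustration, not a description of what RDS does internally: `execute_sql` and `take_snapshot` are hypothetical callables that you would supply (for example, a wrapper around a single open MySQL connection and a wrapper around your volume snapshot call). The two SQL statements are standard MySQL.

```python
def snapshot_with_freeze(execute_sql, take_snapshot):
    """Hold a global read lock for the duration of a volume snapshot.

    execute_sql: hypothetical callable that runs one SQL statement. It must
        use the same open session for both calls, because the lock is
        released as soon as the session that took it disconnects.
    take_snapshot: hypothetical callable that starts the volume snapshot
        and returns an identifier for it.
    """
    # Flush tables and block writes so the on-disk state is consistent.
    execute_sql("FLUSH TABLES WITH READ LOCK")
    try:
        return take_snapshot()
    finally:
        # Always release the lock, even if the snapshot call fails.
        execute_sql("UNLOCK TABLES")
```

Writes are blocked while the lock is held, which is one plausible reason a snapshot can show up as a write latency spike on the server being snapshotted.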

Answer by Anurag Kale

Amazon has revealed the basic architecture that it uses in Multi-AZ deployments. This may help people make decisions:

https://aws.amazon.com/blogs/database/amazon-rds-under-the-hood-multi-az/
