Original source: http://stackoverflow.com/questions/4437620/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
What exactly is the zookeeper quorum setting in hbase-site.xml?
Asked by raj
What exactly is the zookeeper quorum setting in hbase-site.xml?
Accepted answer by MrGomez
As described in hbase-default.xml, here's the setting (hbase.zookeeper.quorum):
Comma separated list of servers in the ZooKeeper Quorum. For example, "host1.mydomain.com,host2.mydomain.com,host3.mydomain.com". By default this is set to localhost for local and pseudo-distributed modes of operation. For a fully-distributed setup, this should be set to a full list of ZooKeeper quorum servers. If HBASE_MANAGES_ZK is set in hbase-env.sh this is the list of servers which we will start/stop ZooKeeper on.
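To make the shape of the setting concrete, here is a minimal Java sketch that embeds an illustrative hbase-site.xml fragment and reads the comma-separated host list back out with the JDK's XML parser. The class and method names (`QuorumSettingSketch`, `quorumHosts`) are hypothetical, the host names are just the placeholders from the description above, and this is not how HBase itself loads its configuration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class QuorumSettingSketch {
    // Illustrative hbase-site.xml content; host names are placeholders.
    static final String SAMPLE_SITE_XML =
          "<configuration>\n"
        + "  <property>\n"
        + "    <name>hbase.zookeeper.quorum</name>\n"
        + "    <value>host1.mydomain.com,host2.mydomain.com,host3.mydomain.com</value>\n"
        + "  </property>\n"
        + "</configuration>\n";

    // Returns the hosts listed under hbase.zookeeper.quorum,
    // or an empty list if the property is absent.
    static List<String> quorumHosts(String siteXml) {
        List<String> hosts = new ArrayList<>();
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(siteXml.getBytes(StandardCharsets.UTF_8)));
            NodeList props = doc.getElementsByTagName("property");
            for (int i = 0; i < props.getLength(); i++) {
                Element prop = (Element) props.item(i);
                String name = prop.getElementsByTagName("name").item(0).getTextContent().trim();
                if (!name.equals("hbase.zookeeper.quorum")) {
                    continue;
                }
                String value = prop.getElementsByTagName("value").item(0).getTextContent();
                // The value is a comma-separated list; tolerate stray whitespace.
                for (String host : value.split(",")) {
                    String trimmed = host.trim();
                    if (!trimmed.isEmpty()) {
                        hosts.add(trimmed);
                    }
                }
            }
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse site XML", e);
        }
        return hosts;
    }

    public static void main(String[] args) {
        System.out.println(quorumHosts(SAMPLE_SITE_XML));
    }
}
```

In a real deployment the `<property>` element shown in the string would simply live in hbase-site.xml on the CLASSPATH of every client and server.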
What this actually does has been answered by Edward J. Yoon here. With editing on my part, for clarity:
Apache ZooKeeper is a coordination service for distributed applications, like Google's Chubby. Many projects use ZooKeeper, and we (Apache Hama) also use it for barrier synchronization in our Bulk Synchronous Parallel computing framework.
Today, I surveyed more about the Paxos and dynamic quorum features of the ZooKeeper project, to better name the class org.apache.hama.zookeeper.QuorumPeer. Because the documentation is not enough (http://hadoop.apache.org/zookeeper/docs/r3.0.0/api/index.html), I didn't understand the meaning of "quorum", as this term was somewhat odd to me. But "org.apache.hama.zookeeper.QuorumPeer" is the proper name!! xD
So, what is the Quorum and why do we need a Quorum?
According to Wikipedia, Quorum is the minimum number of members of a deliberative body necessary to conduct the business of that group. Ordinarily, this is a majority of the people expected to be there, although many bodies may have a lower or higher quorum.
As you know, a fault-tolerance mechanism is one of the important functions of a distributed system. The Quorum algorithm is used to prevent a split-brain condition. When a split-brain condition occurs, according to the Quorum algorithm, ZooKeeper determines the "Primary Partition" and "Secondary Partition". Then, the servers in the primary group receive and process user requests, and the servers in the secondary group become read-only.
When does this system recover from a split-brain condition? When the partitions are merged into one again. Internally, ZooKeeper uses an atomic broadcast protocol (Zab) instead of Paxos.
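The "primary partition" idea can be illustrated with a little arithmetic. The following is a rough Java sketch, assuming simple majority quorums (ZooKeeper's default); the class and method names are hypothetical and nothing here is taken from ZooKeeper's source. A partition may keep acting as primary only if it can still reach a strict majority of the configured ensemble, so at most one side of a split can qualify.

```java
class QuorumMajoritySketch {
    // Smallest number of servers that forms a strict majority of the ensemble.
    static int majority(int ensembleSize) {
        if (ensembleSize < 1) {
            throw new IllegalArgumentException("ensemble must have at least one server");
        }
        return ensembleSize / 2 + 1;
    }

    // True if a partition that can reach `reachable` servers (itself included)
    // still holds a majority of the whole ensemble.
    static boolean hasQuorum(int ensembleSize, int reachable) {
        return reachable >= majority(ensembleSize);
    }

    // Number of server failures the ensemble can absorb while keeping a majority.
    static int tolerableFailures(int ensembleSize) {
        return ensembleSize - majority(ensembleSize);
    }

    public static void main(String[] args) {
        // A 5-server ensemble split 3/2: only the 3-server side keeps quorum.
        System.out.println("3 of 5 has quorum: " + hasQuorum(5, 3));
        System.out.println("2 of 5 has quorum: " + hasQuorum(5, 2));
        // A 4-server ensemble split 2/2: neither side keeps quorum.
        System.out.println("2 of 4 has quorum: " + hasQuorum(4, 2));
    }
}
```

Under this majority rule a 4-server ensemble tolerates one failure, the same as a 3-server one, which is why odd ensemble sizes are the usual recommendation.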
You should also read the original version, in case I mistranslated the concepts he was trying to present.
My understanding of the quorum mechanism in Apache ZooKeeper is that it explicitly defines a replication quorum across several pre-defined hosts. If this quorum is not met, the partitions that disagree are split off to a secondary partition until ZooKeeper can reintegrate them with the primary partition.
This adds more granularity to Hadoop's eventual consistency model. HBase, meanwhile, is currently in the process of further integrating ZooKeeper with its code.
Answer by Jean-Daniel Cryans
From the hbase-default.xml file:
Comma separated list of servers in the ZooKeeper Quorum. For example, "host1.mydomain.com,host2.mydomain.com,host3.mydomain.com". By default this is set to localhost for local and pseudo-distributed modes of operation. For a fully-distributed setup, this should be set to a full list of ZooKeeper quorum servers. If HBASE_MANAGES_ZK is set in hbase-env.sh this is the list of servers which we will start/stop ZooKeeper on.
And from the Requirements section of Getting Started:
HBase depends on ZooKeeper as of release 0.20.0. HBase keeps the location of its root table, who the current master is, and what regions are currently participating in the cluster in ZooKeeper. Clients and Servers now must know their ZooKeeper Quorum locations before they can do anything else (Usually they pick up this information from configuration supplied on their CLASSPATH). By default, HBase will manage a single ZooKeeper instance for you. In standalone and pseudo-distributed modes this is usually enough, but for fully-distributed mode you should configure a ZooKeeper quorum (more info below).
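As a rough illustration of what "knowing the quorum locations" amounts to for a client: ZooKeeper clients connect using a comma-separated list of host:port pairs, and, as far as I understand it, HBase pairs the hosts from hbase.zookeeper.quorum with the port from hbase.zookeeper.property.clientPort (2181 by default). The Java sketch below only builds such a string; `ZkConnectStringSketch` and `connectString` are hypothetical names, not HBase or ZooKeeper APIs.

```java
import java.util.ArrayList;
import java.util.List;

class ZkConnectStringSketch {
    // Assumed default, matching ZooKeeper's conventional client port.
    static final int DEFAULT_CLIENT_PORT = 2181;

    // Turns "host1, host2" plus a port into "host1:2181,host2:2181".
    static String connectString(String quorum, int clientPort) {
        List<String> parts = new ArrayList<>();
        for (String host : quorum.split(",")) {
            String trimmed = host.trim();
            if (trimmed.isEmpty()) {
                continue; // skip empty entries such as a trailing comma
            }
            parts.add(trimmed + ":" + clientPort);
        }
        return String.join(",", parts);
    }

    public static void main(String[] args) {
        String quorum = "host1.mydomain.com,host2.mydomain.com,host3.mydomain.com";
        System.out.println(connectString(quorum, DEFAULT_CLIENT_PORT));
    }
}
```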
Hope that helps.