Java ZooKeeper error: Cannot open channel to X at election address

Disclaimer: this page is a translation of a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use/share it, but you must do so under the same license and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/30940981/

Zookeeper error: Cannot open channel to X at election address

Tags: java, amazon-web-services, apache-zookeeper

Asked by Rahul

I have installed ZooKeeper on 3 different AWS servers. The following is the configuration on all of the servers:

tickTime=2000
initLimit=10
syncLimit=5
dataDir=/var/zookeeper
clientPort=2181
server.1=x.x.x.x:2888:3888
server.2=x.x.x.x:2888:3888
server.3=x.x.x.x:2888:3888

All three instances have a myid file at /var/zookeeper with the appropriate id in it. All three servers have all ports open in the AWS console. But when I run the ZooKeeper server, I get the following error on all the instances.

2015-06-19 12:09:22,989 [myid:1] - WARN  [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:QuorumCnxManager@382] 
  - Cannot open channel to 2 at election address /x.x.x.x:3888
java.net.ConnectException: Connection refused
  at java.net.PlainSocketImpl.socketConnect(Native Method)
  at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
  at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
  at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
  at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
  at java.net.Socket.connect(Socket.java:579)
  at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:368)
  at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectAll(QuorumCnxManager.java:402)
  at org.apache.zookeeper.server.quorum.FastLeaderElection.lookForLeader(FastLeaderElection.java:840)
  at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:762)
2015-06-19 12:09:23,170 [myid:1] - WARN  [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:QuorumCnxManager@382]
   - Cannot open channel to 3 at election address /x.x.x.x:3888
java.net.ConnectException: Connection refused
  at java.net.PlainSocketImpl.socketConnect(Native Method)
  at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
  at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
  at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
  at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
  at java.net.Socket.connect(Socket.java:579)
  at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:368)
  at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectAll(QuorumCnxManager.java:402)
  at org.apache.zookeeper.server.quorum.FastLeaderElection.lookForLeader(FastLeaderElection.java:840)
  at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:762)
2015-06-19 12:09:23,170 [myid:1] - INFO  [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@849] - Notification time out: 25600

Accepted answer by espeirasbora

How have you defined the IP of the local server on each node? If you have given the public IP, then the listener will fail to bind to the port (on AWS the public IP is not attached to the instance's network interface). You must specify 0.0.0.0 for the current node:

server.1=0.0.0.0:2888:3888
server.2=192.168.10.10:2888:3888
server.3=192.168.2.1:2888:3888

This change must be performed on the other nodes too.

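As a quick sanity check (a minimal sketch, assuming the standard 2888/3888 ports from the snippet above and that the ss and nc utilities are installed), you can confirm after restarting that the local node is actually listening on its quorum ports and that the peers' election ports are reachable:

# On the current node: is ZooKeeper listening on the quorum/election ports?
ss -tlnp | grep -E ':2888|:3888'

# From the current node: can the peers' election ports be reached?
nc -vz 192.168.10.10 3888
nc -vz 192.168.2.1 3888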

Answer by user5688074

This is what worked for me

Step 1:
Node 1:
zoo.cfg
server.1=0.0.0.0:<port>:<port2>
server.2=<IP>:<port>:<port2>
.
.
.
server.n=<IP>:<port>:<port2>

Node 2:
server.1=<IP>:<port>:<port2>
server.2=0.0.0.0:<port>:<port2>
.
.
.
server.n=<IP>:<port>:<port2>


Step 2:
Now, in the location defined by dataDir in your zoo.cfg, create the myid file:

Node 1:
echo 1 > <datadir>/myid

Node 2:
echo 2 > <datadir>/myid

.
.
.


Node n:
echo n > <datadir>/myid

This helped me start ZooKeeper successfully, but I will know more once I start playing with it. Hope this helps.

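As an illustration only — a hypothetical bash helper, assuming a fixed list of peer IPs and the usual 2888/3888 ports — the per-node substitution of 0.0.0.0 can be scripted instead of editing each zoo.cfg by hand:

#!/usr/bin/env bash
# Hypothetical sketch: print the server.N lines for the node whose id is $1.
# Usage: ./render_servers.sh 2 >> /path/to/zoo.cfg   (path is an assumption)
MYID="$1"
IPS=("10.0.1.10" "10.0.1.11" "10.0.1.12")   # example addresses, replace with yours

for i in "${!IPS[@]}"; do
  n=$((i + 1))
  if [ "$n" -eq "$MYID" ]; then
    echo "server.$n=0.0.0.0:2888:3888"      # this node's own entry listens on all interfaces
  else
    echo "server.$n=${IPS[$i]}:2888:3888"   # peers are addressed by their IPs
  fi
done

# And remember the id file in dataDir (the file must be named myid):
# echo "$MYID" > <datadir>/myid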

Answer by Abdurrahman Adebiyi

I had similar issues on a 3-node ZooKeeper ensemble. The solution was as advised by espeirasbora, followed by a restart.

So this is what I did, on the three nodes:

zookeeper1, zookeeper2 and zookeeper3

A. Issue: the ZooKeeper nodes in my ensemble could not start

B. System setup: 3 ZooKeeper nodes on 3 machines

C. Error:

In my ZooKeeper log file I could see the following errors:

2016-06-26 14:10:17,484 [myid:1] - WARN  [SyncThread:1:FileTxnLog@334] - fsync-ing the write ahead log in SyncThread:1 took 1340ms which will adversely effect operation latency. See the ZooKeeper troubleshooting guide
2016-06-26 14:10:17,847 [myid:1] - WARN  [RecvWorker:2:QuorumCnxManager$RecvWorker@810] - Connection broken for id 2, my id = 1, error = 
java.io.EOFException
    at java.io.DataInputStream.readInt(DataInputStream.java:392)
    at org.apache.zookeeper.server.quorum.QuorumCnxManager$RecvWorker.run(QuorumCnxManager.java:795)
2016-06-26 14:10:17,848 [myid:1] - WARN  [RecvWorker:2:QuorumCnxManager$RecvWorker@813] - Interrupting SendWorker
2016-06-26 14:10:17,849 [myid:1] - WARN  [SendWorker:2:QuorumCnxManager$SendWorker@727] - Interrupted while waiting for message on queue
java.lang.InterruptedException
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2014)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2088)
    at java.util.concurrent.ArrayBlockingQueue.poll(ArrayBlockingQueue.java:418)
    at org.apache.zookeeper.server.quorum.QuorumCnxManager.pollSendQueue(QuorumCnxManager.java:879)
    at org.apache.zookeeper.server.quorum.QuorumCnxManager.access0(QuorumCnxManager.java:65)
    at org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker.run(QuorumCnxManager.java:715)
2016-06-26 14:10:17,851 [myid:1] - WARN  [SendWorker:2:QuorumCnxManager$SendWorker@736] - Send worker leaving thread
2016-06-26 14:10:17,852 [myid:1] - WARN  [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Follower@89] - Exception when following the leader
java.io.EOFException
    at java.io.DataInputStream.readInt(DataInputStream.java:392)
    at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
    at org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:83)
    at org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:99)
    at org.apache.zookeeper.server.quorum.Learner.readPacket(Learner.java:153)
    at org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:85)
    at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:846)
2016-06-26 14:10:17,854 [myid:1] - INFO  [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Follower@166] - shutdown called
java.lang.Exception: shutdown Follower

D. Actions & Resolution:

On each node:
a. I modified the configuration file $ZOOKEEPER_HOME/conf/zoo.cfg to set the machine's own IP to "0.0.0.0" while keeping the IP addresses of the other 2 nodes.
b. Restarted the node.
c. Checked the status.
d. Voilà, I was OK.

See below

-------------------------------------------------

On zookeeper1:

#Before modification 
[zookeeper1]$ tail -3   $ZOOKEEPER_HOME/conf/zoo.cfg 
server.1=zookeeper1:2888:3888
server.2=zookeeper2:2888:3888
server.3=zookeeper3:2888:3888

#After  modification 
[zookeeper1]$ tail -3  $ZOOKEEPER_HOME/conf/zoo.cfg 
server.1=0.0.0.0:2888:3888
server.2=zookeeper2:2888:3888
server.3=zookeeper3:2888:3888

#Start ZooKeeper (stop and start, or restart)
[zookeeper1]$ $ZOOKEEPER_HOME/bin/zkServer.sh  start
ZooKeeper JMX enabled by default
ZooKeeper remote JMX Port set to 52128
ZooKeeper remote JMX authenticate set to false
ZooKeeper remote JMX ssl set to false
ZooKeeper remote JMX log4j set to true
Using config: /opt/zookeeper-3.4.8/bin/../conf/zoo.cfg
Mode: follower

[zookeeper1]$ $ZOOKEEPER_HOME/bin/zkServer.sh  status
ZooKeeper JMX enabled by default
ZooKeeper remote JMX Port set to 52128
ZooKeeper remote JMX authenticate set to false
ZooKeeper remote JMX ssl set to false
ZooKeeper remote JMX log4j set to true
Using config: /opt/zookeeper-3.4.8/bin/../conf/zoo.cfg
Mode: follower

---------------------------------------------------------

On zookeeper2:

#Before modification 
[zookeeper2]$ tail -3   $ZOOKEEPER_HOME/conf/zoo.cfg 
server.1=zookeeper1:2888:3888
server.2=zookeeper2:2888:3888
server.3=zookeeper3:2888:3888

#After  modification 
[zookeeper2]$ tail -3  $ZOOKEEPER_HOME/conf/zoo.cfg 
server.1=zookeeper1:2888:3888
server.2=0.0.0.0:2888:3888
server.3=zookeeper3:2888:3888

#Start ZooKeeper (stop and start, or restart)
[zookeeper2]$ $ZOOKEEPER_HOME/bin/zkServer.sh  start
ZooKeeper JMX enabled by default
ZooKeeper remote JMX Port set to 52128
ZooKeeper remote JMX authenticate set to false
ZooKeeper remote JMX ssl set to false
ZooKeeper remote JMX log4j set to true
Using config: /opt/zookeeper-3.4.8/bin/../conf/zoo.cfg
Mode: follower

[zookeeper2]$ $ZOOKEEPER_HOME/bin/zkServer.sh  status
ZooKeeper JMX enabled by default
ZooKeeper remote JMX Port set to 52128
ZooKeeper remote JMX authenticate set to false
ZooKeeper remote JMX ssl set to false
ZooKeeper remote JMX log4j set to true
Using config: /opt/zookeeper-3.4.8/bin/../conf/zoo.cfg
Mode: follower

---------------------------------------------------------

On zookeeper3:

#Before modification 
[zookeeper3]$ tail -3   $ZOOKEEPER_HOME/conf/zoo.cfg 
server.1=zookeeper1:2888:3888
server.2=zookeeper2:2888:3888
server.3=zookeeper3:2888:3888

#After  modification 
[zookeeper3]$ tail -3  $ZOOKEEPER_HOME/conf/zoo.cfg 
server.1=zookeeper1:2888:3888
server.2=zookeeper2:2888:3888
server.3=0.0.0.0:2888:3888

#Start ZooKeeper (stop and start, or restart)
[zookeeper3]$ $ZOOKEEPER_HOME/bin/zkServer.sh  start
ZooKeeper JMX enabled by default
ZooKeeper remote JMX Port set to 52128
ZooKeeper remote JMX authenticate set to false
ZooKeeper remote JMX ssl set to false
ZooKeeper remote JMX log4j set to true
Using config: /opt/zookeeper-3.4.8/bin/../conf/zoo.cfg
Mode: follower

[zookeeper3]$ $ZOOKEEPER_HOME/bin/zkServer.sh  status
ZooKeeper JMX enabled by default
ZooKeeper remote JMX Port set to 52128
ZooKeeper remote JMX authenticate set to false
ZooKeeper remote JMX ssl set to false
ZooKeeper remote JMX log4j set to true
Using config: /opt/zookeeper-3.4.8/bin/../conf/zoo.cfg
Mode: follower
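
Once all three nodes are up, exactly one of them should report Mode: leader. A quick way to check this from any machine (a sketch using ZooKeeper's four-letter-word commands, assuming nc is available and the client port is 2181) is:

# Ask each node for its server state; one of the three should answer "Mode: leader"
for h in zookeeper1 zookeeper2 zookeeper3; do
  echo -n "$h: "
  echo srvr | nc "$h" 2181 | grep Mode
done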

Answer by slackey

Here is an Ansible Jinja2 template snippet for automating the build of a cluster with the 0.0.0.0 hostname in zoo.cfg:

{% for url in zookeeper_hosts_list %}
  {%- set url_host = url.split(':')[0] -%}
  {%- if url_host == ansible_fqdn or url_host in ansible_all_ipv4_addresses -%}
server.{{loop.index0}}=0.0.0.0:2888:3888
{% else %}
server.{{loop.index0}}={{url_host}}:2888:3888
{% endif %}
{% endfor %}
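
For example, on a host whose own address is 10.0.1.11, with a hypothetical zookeeper_hosts_list of ['10.0.1.10:2181', '10.0.1.11:2181', '10.0.1.12:2181'], the template would render roughly as below (note that loop.index0 numbers the servers from 0, so the myid files would need to match that numbering):

server.0=10.0.1.10:2888:3888
server.1=0.0.0.0:2888:3888
server.2=10.0.1.12:2888:3888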

Answer by Carles Figuerola

If your own hostname resolves to 127.0.0.1 (in my case, the hostname was in /etc/hosts), ZooKeeper won't start up without 0.0.0.0 in the zoo.cfg file. But if your hostname resolves to the actual machine's IP, you can put its own hostname in the config file.

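To see which case you are in, check what your hostname actually resolves to (a small sketch, assuming getent is available and a standard /etc/hosts):

hostname                       # the name you would put in zoo.cfg, e.g. zookeeper1
getent hosts "$(hostname)"     # if this prints 127.0.0.1, use 0.0.0.0 for the local entry
grep "$(hostname)" /etc/hosts  # shows the /etc/hosts line responsible for that mapping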

Answer by aluenkinglee

I met the same problem and solved it.

Make sure the myid file is consistent with your configuration in zoo.cfg.

Please check your zoo.cfg file in your conf directory, which contains content like this:

server.1=zookeeper1:2888:3888  
server.2=zookeeper2:2888:3888  
server.3=zookeeper3:2888:3888  

And check the myid file in your server's dataDir directory. For example:

Let's say the dataDir defined in zoo.cfg is '/home/admin/data'.

Then on zookeeper1, you must have a file named myid containing the value 1; on zookeeper2, you must have a file named myid containing the value 2; on zookeeper3, you must have a file named myid containing the value 3.

If not configured like this, the server will listen on the wrong ip:port.

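A quick way to verify the mapping on all three machines (a sketch, assuming SSH access to each host and the '/home/admin/data' dataDir from the example above) is:

# Each host should print its own name next to the id that matches its server.N entry
for h in zookeeper1 zookeeper2 zookeeper3; do
  ssh "$h" 'echo "$(hostname): myid=$(cat /home/admin/data/myid)"'
done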

Answer by Abdul Mohsin

In my case, the issue was that I had to start all three ZooKeeper servers; only then was I able to connect to the ZooKeeper server using ./zkCli.sh.

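In other words, a majority of the ensemble has to be running before a client can connect. As a rough sketch (the host name and install path are examples, assuming the same layout on every node):

# On each of the three nodes, start the server first ...
$ZOOKEEPER_HOME/bin/zkServer.sh start

# ... then connect from any machine
$ZOOKEEPER_HOME/bin/zkCli.sh -server zookeeper1:2181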

Answer by SureshKumar

We faced the same issue. In our case, the root cause of the problem was too many client connections. The default ulimit on an AWS EC2 instance is 1024, and this caused the ZooKeeper nodes to be unable to communicate with each other.

The fix is to change the ulimit to a higher number (e.g. ulimit -n 20000), then stop and start ZooKeeper.

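To inspect and raise the limit (a sketch; the 20000 value is just an example, and the permanent change via /etc/security/limits.conf assumes ZooKeeper runs as a dedicated zookeeper user):

# Check the open-file limit of the shell that starts ZooKeeper
ulimit -n

# Raise it for the current shell, then restart ZooKeeper from that shell
ulimit -n 20000
$ZOOKEEPER_HOME/bin/zkServer.sh restart

# To make it permanent, add lines like these to /etc/security/limits.conf:
#   zookeeper  soft  nofile  20000
#   zookeeper  hard  nofile  20000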

Answer by dyang

I had a similar issue. The status on 2 of my 3 ZooKeeper nodes was listed as "standalone", even though the zoo.cfg file indicated that they should be clustered. My third node couldn't start, with the error you described. I think what fixed it for me was running zkServer.sh start in quick succession across my three nodes, so that ZooKeeper was running on all of them before the zoo.cfg initLimit was reached. Hope this works for someone out there.

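One way to do that (a sketch, assuming passwordless SSH and the /opt/zookeeper-3.4.8 path from the transcripts above) is to fire off the start command on all nodes back to back, so they can find each other within initLimit * tickTime:

for h in zookeeper1 zookeeper2 zookeeper3; do
  ssh "$h" '/opt/zookeeper-3.4.8/bin/zkServer.sh start' &
done
wait   # with initLimit=10 and tickTime=2000 ms, the peers have about 20 s to sync up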