Cassandra NoHostAvailableException: All host(s) tried for query failed in Production
Original question: http://stackoverflow.com/questions/31705443/
This page reproduces a Stack Overflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same license and attribute it to the original authors on Stack Overflow.
Asked by abi_pat
We have 10 Cassandra nodes in production running Cassandra 2.1.8. We recently upgraded to 2.1.8; previously we were using only 3 nodes running Cassandra 2.1.2. First we upgraded the initial 3 nodes from 2.1.2 to 2.1.8 (following the procedure described in Upgrading Cassandra). Then we added 7 more nodes running Cassandra 2.1.8 to the cluster and started our client programs. For the first few hours everything worked fine, but after a few hours we saw errors like the following in the client program logs:
Thread-0 [29/07/15 17:41:23.356] ERROR com.cleartrail.entityprofiling.engine.InterpretationWriter - Error:com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: [/172.50.33.161:9041, /172.50.33.162:9041, /172.50.33.95:9041, /172.50.33.96:9041, /172.50.33.165:9041, /172.50.33.166:9041, /172.50.33.163:9041, /172.50.33.164:9041, /172.50.33.42:9041, /172.50.33.167:9041] - use getErrors() for details)
at com.datastax.driver.core.exceptions.NoHostAvailableException.copy(NoHostAvailableException.java:65)
at com.datastax.driver.core.DefaultResultSetFuture.extractCauseFromExecutionException(DefaultResultSetFuture.java:259)
at com.datastax.driver.core.DefaultResultSetFuture.getUninterruptibly(DefaultResultSetFuture.java:175)
at com.datastax.driver.core.AbstractSession.execute(AbstractSession.java:52)
at com.cleartrail.entityprofiling.engine.InterpretationWriter.WriteInterpretation(InterpretationWriter.java:430)
at com.cleartrail.entityprofiling.engine.Profiler.buildProfile(Profiler.java:1042)
at com.cleartrail.messageconsumer.consumer.KafkaConsumer.run(KafkaConsumer.java:336)
Caused by: com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: [/172.50.33.161:9041, /172.50.33.162:9041, /172.50.33.95:9041, /172.50.33.96:9041, /172.50.33.165:9041, /172.50.33.166:9041, /172.50.33.163:9041, /172.50.33.164:9041, /172.50.33.42:9041, /172.50.33.167:9041] - use getErrors() for details)
at com.datastax.driver.core.RequestHandler.sendRequest(RequestHandler.java:102)
at com.datastax.driver.core.RequestHandler.run(RequestHandler.java:176)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
I double-checked the firewall (as suggested in a few posts), the ports, and the timeouts on the client as well as on the nodes, and they are all correct.
I am also not closing the connection anywhere in between. I am using batch queries with a batch size of 1000. The queries are updates that modify counters in a table with three columns:
entity, twfwv, cvalue
where entity and twfwv are text columns that form the primary key, and cvalue is a counter column.
I even restarted all my nodes (this trick helped in my dev environment when I faced the same exception), but it's not helping. Please suggest what the probable problem could be.
Answered by bitsprint
My issue was resolved by checking the errors collection of NoHostAvailableException, as advised by Olivier Michallat in the comments. For me it was the protocol version in the cluster configuration. Mine was null; setting it to 3 fixed the problem.
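The exception message in the question only says "use getErrors() for details", so the per-host causes never reach the log. Below is a minimal sketch of a helper that turns that per-host map into readable lines. It assumes a driver version (2.1 or 3.x) in which getErrors() returns a Map<InetSocketAddress, Throwable>; check the signature in the version you run. The names HostErrorReport and summarize are hypothetical and not part of the driver.

```java
import java.net.InetSocketAddress;
import java.util.Map;
import java.util.StringJoiner;

/** Hypothetical helper (not part of the driver): renders per-host failures as text. */
final class HostErrorReport {

    private HostErrorReport() {}

    /**
     * Builds one line per host in the form "address -> exception class: message".
     * Assumption: the map has the shape returned by NoHostAvailableException.getErrors()
     * in driver 2.1/3.x.
     */
    static String summarize(Map<InetSocketAddress, Throwable> errors) {
        StringJoiner lines = new StringJoiner(System.lineSeparator());
        for (Map.Entry<InetSocketAddress, Throwable> entry : errors.entrySet()) {
            Throwable cause = entry.getValue();
            lines.add(entry.getKey() + " -> "
                    + cause.getClass().getName() + ": " + cause.getMessage());
        }
        return lines.toString();
    }
}
```

In a catch block for NoHostAvailableException, the result of HostErrorReport.summarize(e.getErrors()) could be logged next to the stack trace, so that a mismatch such as an unsupported protocol version shows up per host instead of being hidden behind the generic message.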
Answered by Matt
My issue was resolved by putting the custom load-balancing TokenAwarePolicy my connection was using behind a property, so that it can be set or unset, and relying on the driver's default policy when it is unset.
Specifically, I was trying to get a local Spring Boot app talking to a single Dockerized Cassandra instance.
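// Build the cluster from application properties
// (cassandraProperties and codecRegistry are defined elsewhere in the app).
Cluster.Builder builder = Cluster.builder()
        .addContactPoints(cassandraProperties.getHosts())
        .withPort(cassandraProperties.getPort())
        .withProtocolVersion(ProtocolVersion.V4)
        .withRetryPolicy(new LoggingRetryPolicy(DefaultRetryPolicy.INSTANCE))
        .withCredentials(cassandraProperties.getUsername(), cassandraProperties.getPassword())
        .withCodecRegistry(codecRegistry);

// Install the custom token-aware, DC-aware policy only when the property asks for it;
// otherwise the driver's default load balancing policy stays in effect.
if (loadBalanced) {
    builder.withLoadBalancingPolicy(
            new TokenAwarePolicy(DCAwareRoundRobinPolicy.builder().withLocalDc(localDc).build()));
}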