Java Cassandra NoHostAvailableException: All host(s) tried for query failed in production

Question

Asked by abi_pat

We have 10 Cassandra nodes in production running Cassandra 2.1.8. We recently upgraded to 2.1.8; previously we used only 3 nodes running Cassandra 2.1.2. First we upgraded the original 3 nodes from 2.1.2 to 2.1.8 (following the procedure described in Upgrading Cassandra). Then we added 7 more nodes running Cassandra 2.1.8 to the cluster and started our client programs. For the first few hours everything worked fine, but after that we began seeing errors like this in the client logs:

Thread-0 [29/07/15 17:41:23.356] ERROR  com.cleartrail.entityprofiling.engine.InterpretationWriter - Error:com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: [/172.50.33.161:9041, /172.50.33.162:9041, /172.50.33.95:9041, /172.50.33.96:9041, /172.50.33.165:9041, /172.50.33.166:9041, /172.50.33.163:9041, /172.50.33.164:9041, /172.50.33.42:9041, /172.50.33.167:9041] - use getErrors() for details)
       at com.datastax.driver.core.exceptions.NoHostAvailableException.copy(NoHostAvailableException.java:65)
       at com.datastax.driver.core.DefaultResultSetFuture.extractCauseFromExecutionException(DefaultResultSetFuture.java:259)
       at com.datastax.driver.core.DefaultResultSetFuture.getUninterruptibly(DefaultResultSetFuture.java:175)
       at com.datastax.driver.core.AbstractSession.execute(AbstractSession.java:52)
       at com.cleartrail.entityprofiling.engine.InterpretationWriter.WriteInterpretation(InterpretationWriter.java:430)
       at com.cleartrail.entityprofiling.engine.Profiler.buildProfile(Profiler.java:1042)
       at com.cleartrail.messageconsumer.consumer.KafkaConsumer.run(KafkaConsumer.java:336)
Caused by: com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: [/172.50.33.161:9041, /172.50.33.162:9041, /172.50.33.95:9041, /172.50.33.96:9041, /172.50.33.165:9041, /172.50.33.166:9041, /172.50.33.163:9041, /172.50.33.164:9041, /172.50.33.42:9041, /172.50.33.167:9041] - use getErrors() for details)
       at com.datastax.driver.core.RequestHandler.sendRequest(RequestHandler.java:102)
       at com.datastax.driver.core.RequestHandler.run(RequestHandler.java:176)
       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
       at java.lang.Thread.run(Thread.java:745)

I have double-checked the firewall (as suggested in a few posts), the ports, and the timeouts on both the client and the nodes, and they are all correct.

I am also not closing the connection anywhere in between. I am using batch queries with a batch size of 1000; the queries are updates that increment counters in a table with three columns:

entity , twfwv , cvalue

where entity and twfwv are text columns that together form the primary key, and cvalue is a counter column.

I even restarted all my nodes (this trick helped in my dev environment when I hit the same exception), but it's not helping. Please suggest what the probable cause could be.

Answer 1

Answered by bitsprint

My issue was resolved by checking the errors collection of NoHostAvailableException, as advised by Olivier Michallat in the comments. For me it was the protocol version in the cluster configuration: mine was null, and setting it to 3 fixed the problem.
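
To make the "check the errors collection" step concrete, here is a minimal sketch of turning the per-host errors into readable log lines. It assumes a driver version in which NoHostAvailableException.getErrors() returns a Map<InetSocketAddress, Throwable>; older versions may use a different map type. HostErrorReport is a hypothetical helper name, not part of the driver, and the map in main is stand-in data so the example runs without a cluster.

```java
import java.net.InetSocketAddress;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper (not part of the driver): renders per-host failures as log lines. */
final class HostErrorReport {

    private HostErrorReport() {}

    /**
     * Builds one line per host from a map shaped like the one assumed to be
     * returned by NoHostAvailableException.getErrors().
     */
    static String format(Map<InetSocketAddress, Throwable> errors) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<InetSocketAddress, Throwable> entry : errors.entrySet()) {
            Throwable cause = entry.getValue();
            sb.append(entry.getKey().getHostString())
              .append(':')
              .append(entry.getKey().getPort())
              .append(" -> ")
              .append(cause.getClass().getSimpleName())
              .append(": ")
              .append(cause.getMessage())
              .append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Stand-in data; in real code the map would come from e.getErrors().
        Map<InetSocketAddress, Throwable> errors = new LinkedHashMap<>();
        errors.put(InetSocketAddress.createUnresolved("172.50.33.161", 9041),
                   new IllegalStateException("example failure for illustration"));
        System.out.print(format(errors));
    }
}
```

In application code this would be called from the catch block around session.execute(...), for example by logging HostErrorReport.format(e.getErrors()). The per-host cause is usually what shows whether the failure is a timeout, a connection problem, or (as in my case) a protocol mismatch.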

Answer 2

Answered by Matt

My issue was resolved by making the custom load-balancing policy (TokenAwarePolicy) that my connection used optional, controlled by a property, and relying on the driver default when the property is unset.

Specifically, I was trying to get a local Spring Boot app talking to a single dockerized Cassandra instance.

        // Base configuration shared by every environment.
        Cluster.Builder builder = Cluster.builder()
            .addContactPoints(cassandraProperties.getHosts())
            .withPort(cassandraProperties.getPort())
            .withProtocolVersion(ProtocolVersion.V4)
            .withRetryPolicy(new LoggingRetryPolicy(DefaultRetryPolicy.INSTANCE))
            .withCredentials(cassandraProperties.getUsername(), cassandraProperties.getPassword())
            .withCodecRegistry(codecRegistry);

        // Install the custom token-aware, DC-aware policy only when the
        // loadBalanced property is set; otherwise the driver default applies.
        if (loadBalanced) {
            builder.withLoadBalancingPolicy(
                new TokenAwarePolicy(DCAwareRoundRobinPolicy.builder().withLocalDc(localDc).build()));
        }
