Write timeout thrown by cassandra datastax driver
Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license and attribute it to the original authors (not me): StackOverflow.
Original URL: http://stackoverflow.com/questions/21819035/
Asked by Jacob
While doing a bulk load of data, incrementing counters based on log data, I am encountering a timeout exception. I'm using the Datastax 2.0-rc2 Java driver.
Is this an issue with the server not being able to keep up (i.e. a server-side config issue), or is this an issue with the client getting bored waiting for the server to respond? Either way, is there an easy config change I can make that would fix this?
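For context, here is a minimal sketch of the kind of synchronous counter-increment loop that can produce this error; the contact point, keyspace, table and key names are illustrative assumptions, not taken from the original code:

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.PreparedStatement;
import com.datastax.driver.core.Session;

public class CounterBulkLoad {
    public static void main(String[] args) {
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect("stats"); // hypothetical keyspace
        // Hypothetical schema: CREATE TABLE counters (key text PRIMARY KEY, hits counter);
        PreparedStatement inc = session.prepare("UPDATE counters SET hits = hits + ? WHERE key = ?");
        for (int i = 0; i < 1000000; i++) {
            // Each synchronous execute() throws WriteTimeoutException when the coordinator
            // does not receive the replica's acknowledgement within write_request_timeout_in_ms.
            session.execute(inc.bind(1L, "key-" + (i % 1000)));
        }
        cluster.shutdown(); // 2.0-rc2 API; newer driver versions use close()
    }
}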
Exception in thread "main" com.datastax.driver.core.exceptions.WriteTimeoutException: Cassandra timeout during write query at consistency ONE (1 replica were required but only 0 acknowledged the write)
at com.datastax.driver.core.exceptions.WriteTimeoutException.copy(WriteTimeoutException.java:54)
at com.datastax.driver.core.ResultSetFuture.extractCauseFromExecutionException(ResultSetFuture.java:271)
at com.datastax.driver.core.ResultSetFuture.getUninterruptibly(ResultSetFuture.java:187)
at com.datastax.driver.core.Session.execute(Session.java:126)
at jason.Stats.analyseLogMessages(Stats.java:91)
at jason.Stats.main(Stats.java:48)
Caused by: com.datastax.driver.core.exceptions.WriteTimeoutException: Cassandra timeout during write query at consistency ONE (1 replica were required but only 0 acknowledged the write)
at com.datastax.driver.core.exceptions.WriteTimeoutException.copy(WriteTimeoutException.java:54)
at com.datastax.driver.core.Responses$Error.asException(Responses.java:92)
at com.datastax.driver.core.ResultSetFuture$ResponseCallback.onSet(ResultSetFuture.java:122)
at com.datastax.driver.core.RequestHandler.setFinalResult(RequestHandler.java:224)
at com.datastax.driver.core.RequestHandler.onSet(RequestHandler.java:373)
at com.datastax.driver.core.Connection$Dispatcher.messageReceived(Connection.java:510)
at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:296)
at org.jboss.netty.handler.codec.oneone.OneToOneDecoder.handleUpstream(OneToOneDecoder.java:70)
at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:296)
at org.jboss.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:462)
at org.jboss.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:443)
at org.jboss.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:303)
at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:109)
at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:312)
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:90)
at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
at org.jboss.netty.util.internal.DeadLockProofWorker.run(DeadLockProofWorker.java:42)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
Caused by: com.datastax.driver.core.exceptions.WriteTimeoutException: Cassandra timeout during write query at consistency ONE (1 replica were required but only 0 acknowledged the write)
at com.datastax.driver.core.Responses$Error.decode(Responses.java:53)
at com.datastax.driver.core.Responses$Error.decode(Responses.java:33)
at com.datastax.driver.core.Message$ProtocolDecoder.decode(Message.java:165)
at org.jboss.netty.handler.codec.oneone.OneToOneDecoder.handleUpstream(OneToOneDecoder.java:66)
... 21 more
One of the nodes reports this at roughly the time it occurred:
ERROR [Native-Transport-Requests:12539] 2014-02-16 23:37:22,191 ErrorMessage.java (line 222) Unexpected exception during request
java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
at sun.nio.ch.SocketDispatcher.read(Unknown Source)
at sun.nio.ch.IOUtil.readIntoNativeBuffer(Unknown Source)
at sun.nio.ch.IOUtil.read(Unknown Source)
at sun.nio.ch.SocketChannelImpl.read(Unknown Source)
at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:64)
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:109)
at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:312)
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:90)
at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
Accepted answer by Jacob
While I don't understand the root cause of this issue, I was able to solve the problem by increasing the timeout value in the conf/cassandra.yaml file.
write_request_timeout_in_ms: 20000
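If the client still gives up before the coordinator does, the driver-side socket read timeout may also need raising to match; the sketch below uses the Java driver's SocketOptions, and the contact point and the 25000 ms value are assumptions chosen only to exceed the 20000 ms server-side setting above:

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.SocketOptions;

// Raise the per-request read timeout on the client so it waits at least as long
// as the coordinator's write_request_timeout_in_ms before abandoning the request.
Cluster cluster = Cluster.builder()
        .addContactPoint("127.0.0.1")
        .withSocketOptions(new SocketOptions().setReadTimeoutMillis(25000))
        .build();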
Answered by Christopher Batey
It is the coordinator (so the server) timing out while waiting for acknowledgements of the write.
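Because the timeout is reported by the coordinator rather than raised locally by the client, the exception carries how many replica acknowledgements were required versus received. A small hedged sketch of logging that, assuming statement is the bound write being executed:

import com.datastax.driver.core.exceptions.WriteTimeoutException;

try {
    session.execute(statement);
} catch (WriteTimeoutException e) {
    // Server-side timeout: the coordinator required some number of replica acks
    // but saw fewer within write_request_timeout_in_ms.
    System.err.println("Coordinator write timeout (" + e.getWriteType() + "): "
            + e.getReceivedAcknowledgements() + "/" + e.getRequiredAcknowledgements() + " acks");
}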
Answered by dvtoever
We experienced similar problems on a single node in an ESX cluster with SAN storage attached (which is not recommended by DataStax, but we have no other options at the moment).
Note: the settings below can be a big blow to the maximum performance Cassandra can achieve, but we chose a stable system over high performance.
While running iostat -xmt 1 we found high w_await times at the same time the WriteTimeoutExceptions occurred. It turned out the memtable could not be written to disk within the default write_request_timeout_in_ms: 2000 setting.
We significantly reduced the memtable size from 512Mb (defaults to 25% of heap space, which was 2Gb in our case) to 32Mb:
# Total permitted memory to use for memtables. Cassandra will stop
# accepting writes when the limit is exceeded until a flush completes,
# and will trigger a flush based on memtable_cleanup_threshold
# If omitted, Cassandra will set both to 1/4 the size of the heap.
# memtable_heap_space_in_mb: 2048
memtable_offheap_space_in_mb: 32
We also slightly increased the write timeout to 3 seconds:
write_request_timeout_in_ms: 3000
Also make sure you write regularly to disk if you have high IO wait times:
#commitlog_sync: batch
#commitlog_sync_batch_window_in_ms: 2
#
# the other option is "periodic" where writes may be acked immediately
# and the CommitLog is simply synced every commitlog_sync_period_in_ms
# milliseconds.
commitlog_sync: periodic
commitlog_sync_period_in_ms: 10000
These settings allowed the memtable to remain small and be written often. The exceptions were solved and we survived the stress tests that were run on the system.
Answered by Mumrah
It's worth double-checking your GC settings for Cassandra.
In my case I was using a semaphore to throttle async writes and still (sometimes) getting timeouts.
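For reference, here is a rough sketch of that semaphore-based throttling pattern with the 2.x Java driver and Guava futures; the permit count of 128 is an arbitrary illustration, not a recommendation:

import java.util.concurrent.Semaphore;

import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.ResultSetFuture;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.Statement;
import com.google.common.util.concurrent.FutureCallback;
import com.google.common.util.concurrent.Futures;

public class ThrottledWriter {
    private final Semaphore inFlight = new Semaphore(128); // max concurrent async writes
    private final Session session;

    public ThrottledWriter(Session session) {
        this.session = session;
    }

    public void write(Statement statement) throws InterruptedException {
        inFlight.acquire(); // block until one of the in-flight writes completes
        ResultSetFuture future = session.executeAsync(statement);
        Futures.addCallback(future, new FutureCallback<ResultSet>() {
            public void onSuccess(ResultSet rs) { inFlight.release(); }
            public void onFailure(Throwable t) { inFlight.release(); } // WriteTimeoutException arrives here
        });
    }
}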
It transpired that I was using unsuitable GC settings: I'd been using cassandra-unit for convenience, which had the unintended consequence of running with the default VM settings. Consequently we would eventually hit a stop-the-world GC, resulting in a write timeout. I applied the same GC settings as my running Cassandra Docker image and all is fine.
This might be an uncommon cause but it would have helped me so it seems worth recording here.