Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use/share it, but you must attribute it to the original authors (not me): StackOverflow.
Original URL: http://stackoverflow.com/questions/29394382/
Operation Time Out Error in cqlsh console of cassandra
Asked by Kaushal
I have a three-node Cassandra cluster, and I have created one table which has more than 2,000,000 rows.
When I execute the query select count(*) from userdetails in cqlsh, I get this error:
OperationTimedOut: errors={}, last_host=192.168.1.2
When I run the count for fewer rows, or with a limit of 50,000, it works fine.
Accepted answer by Chris Lohfink
count(*) actually pages through all of the data, so a select count(*) from userdetails without a limit would be expected to time out with that many rows. Some details here: http://planetcassandra.org/blog/counting-key-in-cassandra/
You may want to consider maintaining the count yourself, using Spark, or, if you just want a ballpark number, you can grab it from JMX.
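One way to maintain the count yourself (a minimal sketch, not from the original answer, using a hypothetical row_counts table) is a Cassandra counter that the application bumps on every insert:

    CREATE TABLE row_counts (
        table_name text PRIMARY KEY,    -- one row per counted table
        row_count counter               -- counter tables may hold only counter columns
    );

    -- bump alongside every application-level insert into userdetails
    UPDATE row_counts SET row_count = row_count + 1 WHERE table_name = 'userdetails';

    -- reading the count back is then a cheap single-partition read
    SELECT row_count FROM row_counts WHERE table_name = 'userdetails';

Keep in mind that counter updates are not idempotent, so a retried write can drift the count slightly; that is usually acceptable for a ballpark figure.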
Grabbing it from JMX can be a little tricky depending on your data model. To get the number of partitions, grab the org.apache.cassandra.metrics:type=ColumnFamily,keyspace={{Keyspace}},scope={{Table}},name=EstimatedColumnCountHistogram mbean and sum up all 90 values (this is what nodetool cfstats outputs). It will only give you the number that exists in sstables, so to make it more accurate you can do a flush, or try to estimate the number in memtables from the MemtableColumnsCount mbean.
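Since nodetool cfstats already does this summation, a quick way to read the same estimate from the shell (a sketch; the exact output label varies by Cassandra version) is:

    # flush memtables first so the sstable-based estimate is fresher
    nodetool flush mykeyspace userdetails
    # cfstats (tablestats in newer versions) prints the per-table estimate;
    # look for the "Number of keys (estimate)" line in its output
    nodetool cfstats mykeyspace.userdetails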
For a very basic ballpark number you can grab the estimated partition counts from system.size_estimates across all the ranges listed (note that this is the number on one node only). Multiply that out by the number of nodes, then divide by the replication factor (RF).
Answered by Sergey Shcherbakov
You can also increase the timeout on the cqlsh command line, e.g.:
cqlsh --request-timeout 120 myhost
Answered by Sashank Bhogu
To change the client timeout limit in Apache Cassandra, there are two techniques:
Technique 1: Modify the cqlshrc file (see the sketch below).
Technique 2: Open the program cqlsh and modify the time specified using the client_timeout variable.
For details, please refer to this link: https://playwithcassandra.wordpress.com/2015/11/05/cqlsh-increase-timeout-limit/
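A sketch of Technique 1, assuming the file lives at ~/.cassandra/cqlshrc; the option name is version-dependent (older cqlsh read client_timeout, newer ones read request_timeout), so check your cqlsh version:

    # ~/.cassandra/cqlshrc
    [connection]
    # newer cqlsh versions (value is in seconds)
    request_timeout = 3600
    # older cqlsh versions used this name instead
    ; client_timeout = 3600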
Answered by Kyle Burke
I'm using Cassandra 3.4 and cqlsh to get record counts. It appears that there has been a code change in 3.4: cqlsh just calls cqlsh.py, and inside cqlsh.py there is a DEFAULT_REQUEST_TIMEOUT_SECONDS variable that defaults to 10 (seconds). I changed it to 3600 (1 hour) and now my SELECT count(*) queries work.
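The edit is a one-line change (a sketch against the stock 3.4 cqlsh.py; the surrounding code may differ in other versions):

    # in cqlsh.py, which the cqlsh wrapper script invokes
    DEFAULT_REQUEST_TIMEOUT_SECONDS = 3600  # was 10; raised to 1 hour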
Answered by Oleg Belous
If you use cqlsh: open the script in an editor and find all occurrences of the word "timeout". Change the default value from 10 to 60 and save the script.
Answered by Jasonw
I was having the same problem as above when doing a count for a whole day, but as a workaround I split the count into two requests (12 hours + 12 hours), as below.
cqlsh:jw_schema1> select count(*) from flight_statistics where insert_time >= '2015-08-20 00:00:00' and insert_time <= '2015-08-20 11:59:59' ALLOW FILTERING;
count
-------
42528
(1 rows)
cqlsh:jw_schema1> select count(*) from flight_statistics where insert_time >= '2015-08-20 12:00:00' and insert_time <= '2015-08-20 23:59:59' ALLOW FILTERING;
count
-------
86580
(1 rows)
cqlsh:jw_schema1>
Answered by Mohammad Rahmati
I'm using Cassandra 3.11 and cqlsh to get record counts. My table has about 40,000,000 rows and I was faced with this problem. I solved it with two changes:
The first is to change all the timeout configs in cassandra.yaml on every node:
# 3,600,000 is one hour in ms
read_request_timeout_in_ms: 3600000
range_request_timeout_in_ms: 3600000
write_request_timeout_in_ms: 3600000
counter_write_request_timeout_in_ms: 3600000
cas_contention_timeout_in_ms: 3600000
truncate_request_timeout_in_ms: 3600000
request_timeout_in_ms: 3600000
slow_query_log_timeout_in_ms: 3600000
Then restart Cassandra on all nodes.
The second is to run cqlsh with an explicit timeout, like below:
cqlsh --request-timeout=3600000 <myhost>