Primary shard is not active or isn't assigned is a known node?
Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license, keep the link to the original question, and attribute it to the original authors (not me): StackOverflow
Original question: http://stackoverflow.com/questions/27547091/
Asked by Prem Singh Bist
I am running Elasticsearch version 4.1 on Windows 8. I tried to index a document through Java. When running a JUnit test, the error below appears.
org.elasticsearch.action.UnavailableShardsException: [wms][3] Primary shard is not active or isn't assigned is a known node. Timeout: [1m], request: index {[wms][video][AUpdb-bMQ3rfSDgdctGY], source[{
"fleetNumber": "45",
"timestamp": "1245657888",
"geoTag": "73.0012312,-123.00909",
"videoName": "timestamp.mjpeg",
"content": "ASD123124NMMM"
}]}
at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction.retryBecauseUnavailable(TransportShardReplicationOperationAction.java:784)
at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction.doStart(TransportShardReplicationOperationAction.java:402)
at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction.onTimeout(TransportShardReplicationOperationAction.java:500)
at org.elasticsearch.cluster.ClusterStateObserver$ObserverClusterStateListener.onTimeout(ClusterStateObserver.java:239)
at org.elasticsearch.cluster.service.InternalClusterService$NotifyTimeout.run(InternalClusterService.java:497)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:722)
I cannot figure out what causes this error. When I delete data or an index, it works fine. What might be the possible cause?
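For context, this is roughly what such an indexing call looks like; it is only a sketch (the question does not include the actual code), assuming the pre-2.x Java TransportClient API that the stack trace suggests, with host, port, and the JSON body chosen for illustration:

import org.elasticsearch.action.index.IndexResponse;
import org.elasticsearch.client.Client;
import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.common.transport.InetSocketTransportAddress;

public class IndexVideoDocument {
    public static void main(String[] args) {
        // Connect to the node's transport port (9300 by default).
        Client client = new TransportClient()
                .addTransportAddress(new InetSocketTransportAddress("localhost", 9300));

        String json = "{\"fleetNumber\":\"45\",\"timestamp\":\"1245657888\","
                + "\"geoTag\":\"73.0012312,-123.00909\","
                + "\"videoName\":\"timestamp.mjpeg\",\"content\":\"ASD123124NMMM\"}";

        // Index into the wms index, video type. This is the request that fails with
        // UnavailableShardsException when the primary shard cannot be assigned.
        IndexResponse response = client.prepareIndex("wms", "video")
                .setSource(json)
                .execute()
                .actionGet();

        System.out.println("Indexed document id: " + response.getId());
        client.close();
    }
}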
Accepted answer by Alexandre Mélard
You should look at this page: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/index-modules-allocation.html
and this part in particular:
cluster.routing.allocation.disk.watermark.low controls the low watermark for disk usage. It defaults to 85%, meaning ES will not allocate new shards to nodes once they have more than 85% disk used. It can also be set to an absolute byte value (like 500mb) to prevent ES from allocating shards if less than the configured amount of space is available.
cluster.routing.allocation.disk.watermark.high controls the high watermark. It defaults to 90%, meaning ES will attempt to relocate shards to another node if the node disk usage rises above 90%. It can also be set to an absolute byte value (similar to the low watermark) to relocate shards once less than the configured amount of space is available on the node.
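For reference, a minimal sketch of how these thresholds could be overridden in elasticsearch.yml; the values below are examples, not the documented defaults, and should be adjusted to your environment:

# elasticsearch.yml (example values)
cluster.routing.allocation.disk.threshold_enabled: true
# keep allocating new shards until disk usage passes this mark
cluster.routing.allocation.disk.watermark.low: 90%
# start relocating shards away from a node above this mark
cluster.routing.allocation.disk.watermark.high: 95%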
Answer by Zeeshan
In my case the culprit was port 9300. It was blocked.
Elasticsearch binds to one port for HTTP and a separate port for the node/transport API. For each it tries the lowest available port first (9200 for HTTP, 9300 for transport), and if that one is already taken, it tries the next. If you run a single node on your machine, it will only bind to 9200 and 9300.
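To see which ports the local node has actually bound to, a quick check on Linux (assuming netstat is installed) is:

# list listening TCP sockets and filter for the Elasticsearch ports
netstat -tln | grep -E ':9200|:9300'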
So I unblocked port 9300 and I was good to go.
To unblock a port on Red Hat Linux:
sudo firewall-cmd --zone=public --add-port=9300/tcp --permanent
sudo firewall-cmd --reload
sudo iptables-save | grep 9300
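To confirm that the transport port is now reachable from the client machine, a simple probe (assuming nc is installed; replace your-es-host with your node's address) is:

nc -zv your-es-host 9300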
Answer by avp
I faced the exact same error. In my case I had multiple master and data nodes; the master nodes were added to the load balancer but the data nodes were not, so the masters were not able to communicate with the data nodes.
As soon as I added all the data nodes to the load balancer, the problem was fixed.
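A quick way to check that every master and data node has actually joined the cluster and is reachable is the standard cat and health APIs (run against any node's HTTP port):

curl 'localhost:9200/_cat/nodes?v'
curl 'localhost:9200/_cluster/health?pretty'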
Answer by avivamg
The problem: it seems that Elasticsearch stops indexing new data (so nothing new shows up in Kibana) once disk space is exceeded. You get org.elasticsearch.action.UnavailableShardsException and a timeout because your primary shard is not active. To strengthen this theory, run sudo df -h; you will probably see a high usage percentage on the data volume (for example /var/data) on your machine.
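Besides df -h, Elasticsearch can report per-node disk usage and shard counts itself through the cat allocation API:

curl 'localhost:9200/_cat/allocation?v'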
Explanation: according to the Elasticsearch documentation on disk-based shard allocation, Elasticsearch considers the available disk space on a node before deciding whether to allocate new shards to that node or to actively relocate shards away from that node. There are four settings to adjust in order to override the default disk-based shard allocation:
1. cluster.routing.allocation.disk.threshold_enabled: defaults to true. Set to false to disable the disk allocation decider.
2. cluster.routing.allocation.disk.watermark.low: controls the low watermark for disk usage. It defaults to 85%, meaning that Elasticsearch will not allocate shards to nodes that have more than 85% disk used. It can also be set to an absolute byte value (like 500mb) to prevent Elasticsearch from allocating shards if less than the specified amount of space is available. This setting has no effect on the primary shards of newly-created indices but will prevent their replicas from being allocated.
3. cluster.routing.allocation.disk.watermark.high: controls the high watermark. It defaults to 90%, meaning that Elasticsearch will attempt to relocate shards away from a node whose disk usage is above 90%. It can also be set to an absolute byte value (similarly to the low watermark) to relocate shards away from a node if it has less than the specified amount of free space. This setting affects the allocation of all shards, whether previously allocated or not.
4. cluster.routing.allocation.disk.watermark.flood_stage: controls the flood stage watermark. It defaults to 95%, meaning that Elasticsearch enforces a read-only index block (index.blocks.read_only_allow_delete) on every index that has one or more shards allocated on a node with at least one disk exceeding the flood stage. This is a last resort to prevent nodes from running out of disk space. The index block is automatically released once the disk utilization falls below the high watermark.
Solution: now let's perform an API call, edit the configuration, and raise the disk-based shard allocation limits (from the 90% default to 95%-97%):
curl -XPUT -H 'Content-Type: application/json' 'localhost:9200/_cluster/settings' \
-d '{ "transient": {
        "cluster.routing.allocation.disk.watermark.low": "95%",
        "cluster.routing.allocation.disk.watermark.high": "97%",
        "cluster.routing.allocation.disk.watermark.flood_stage": "98%",
        "cluster.info.update.interval": "1m"
    }}'
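To confirm that the transient settings were applied, read the cluster settings back:

curl 'localhost:9200/_cluster/settings?pretty'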