墨痕

2016-01-06 OPS

Elasticsearch Usage Notes

1. Elasticsearch write path
2. Elasticsearch shard write details
3. Elasticsearch read path
4. Elasticsearch search flow
5. Elasticsearch cluster configuration
6. Starting & stopping Elasticsearch
7. Fixing unassigned shards
8. Elasticsearch API
9. Elasticsearch tuning configuration
10. Curator
11. Elasticdump

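Elasticsearch write path

The client picks a node according to its configuration and sends it the write request. Whichever node receives the request, master or data, acts as the coordinating node for that request.
The coordinating node computes shard = hash(doc id) % <num of primary shards> to determine which primary shard the document belongs to, looks up which node holds that primary shard, and forwards the write request to that node.
That node writes the data to the primary shard and then sends it to the replica shards in parallel (writing into a shard involves more detail; see the next section).
Once every replica shard has written the data, the replicas report success to the node holding the primary. Only at this point does the write count as successful in Elasticsearch.
The node notifies the coordinating node that the write succeeded.
The coordinating node notifies the client that the write succeeded.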
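
The routing step above can be sketched in a few lines of Python. This is only an illustration of the formula shard = hash(doc id) % <num of primary shards>: the hash used here (CRC32) is a stand-in chosen because it is deterministic, not the hash Elasticsearch actually uses, and the names pick_primary_shard, route_write and shard_to_node are hypothetical.

```python
import zlib


def pick_primary_shard(doc_id: str, num_primary_shards: int) -> int:
    """Return the primary shard number for a document ID."""
    if num_primary_shards <= 0:
        raise ValueError("num_primary_shards must be positive")
    # Stand-in hash (assumption): CRC32 gives the same value on every run,
    # unlike Python's built-in hash() for strings.
    return zlib.crc32(doc_id.encode("utf-8")) % num_primary_shards


def route_write(doc_id: str, num_primary_shards: int, shard_to_node: dict):
    """Return (shard, node) that the coordinating node forwards the write to."""
    shard = pick_primary_shard(doc_id, num_primary_shards)
    return shard, shard_to_node[shard]


if __name__ == "__main__":
    # Hypothetical layout: 3 primary shards spread over two nodes.
    layout = {0: "node-a", 1: "node-b", 2: "node-a"}
    print(route_write("tweet-1", 3, layout))
```

Because the shard number comes from a modulo over the number of primary shards, changing that number would send existing document IDs to different shards, which is why the primary shard count is fixed when an index is created.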

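Elasticsearch shard write details

Data is written to the memory buffer (at this point ES cannot yet find it in a search) and, at the same time, to the translog buffer.
By default a refresh runs every 1s, moving data from the memory buffer into the filesystem cache as a new segment file (the data is now searchable).
# The default refresh interval can be changed
PUT /<index>/_settings
{ "refresh_interval": "1s" }
After a refresh the memory buffer is cleared, but the translog buffer is not and keeps accumulating. By default, data is written from the translog buffer to the translog on disk every 5s. Note that from ES 2.0 on, the default index.translog.durability of request additionally fsyncs the translog after every index, delete, update, or bulk request.
# The interval at which the translog is written to disk can be changed
PUT /<index>/_settings
{"index.translog.sync_interval": "5s"}
When the translog reaches a certain size, or every 30 minutes by default, a flush is performed, writing the data in the filesystem cache to disk.
To finish, once the data has been written to disk successfully, the current translog is cleared and deleted and a new translog is started.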
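
The lifecycle above (index, refresh, flush) can be modelled with a toy class. This is a simplified sketch, not how Elasticsearch is implemented: it folds the translog buffer and the on-disk translog into a single list, and every name in it (ToyShard and its attributes) is hypothetical.

```python
class ToyShard:
    """Toy model of the buffers a document passes through inside one shard."""

    def __init__(self):
        self.memory_buffer = []    # indexed, but not searchable yet
        self.translog = []         # operations not yet covered by a flush
        self.cached_segments = []  # searchable, held in the filesystem cache
        self.disk_segments = []    # searchable and persisted on disk

    def index(self, doc):
        # A write goes to the memory buffer and to the translog at the same time.
        self.memory_buffer.append(doc)
        self.translog.append(doc)

    def refresh(self):
        # Turn the memory buffer into a new segment; the translog is left alone.
        if self.memory_buffer:
            self.cached_segments.append(list(self.memory_buffer))
            self.memory_buffer.clear()

    def flush(self):
        # Persist all cached segments, then start over with an empty translog.
        self.refresh()
        self.disk_segments.extend(self.cached_segments)
        self.cached_segments.clear()
        self.translog.clear()

    def search(self):
        # Only documents that have made it into a segment are visible.
        segments = self.disk_segments + self.cached_segments
        return [doc for segment in segments for doc in segment]


if __name__ == "__main__":
    shard = ToyShard()
    shard.index("doc-1")
    print(shard.search())   # nothing visible before the refresh
    shard.refresh()
    print(shard.search())   # visible after the refresh
    shard.flush()
    print(shard.translog)   # translog emptied by the flush
```

The point of the model is the ordering: a document becomes searchable at refresh time, while the translog is what protects it until the next flush.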

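Elasticsearch read path

The client picks a node according to its configuration and sends it the read request. Whichever node receives the request, master or data, acts as the coordinating node for that request.
The coordinating node computes shard = hash(doc id) % <num of primary shards> to find the shard that holds the document, then uses round-robin to choose among the copies of that shard (the primary and its replicas) and reads from the node holding the chosen copy.
That node returns the document to the coordinating node.
The coordinating node returns the document to the client.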
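
The round-robin choice among shard copies described above can be illustrated as follows. This is a minimal sketch under the assumption that each shard number maps to a fixed list of nodes holding a copy; the class name CopySelector and the node names are made up for the example.

```python
import itertools


class CopySelector:
    """Hand out the nodes holding copies of a shard in round-robin order."""

    def __init__(self, copies_by_shard):
        # One endless iterator per shard, cycling over primary and replica nodes.
        self._cycles = {
            shard: itertools.cycle(nodes)
            for shard, nodes in copies_by_shard.items()
        }

    def next_node(self, shard):
        return next(self._cycles[shard])


if __name__ == "__main__":
    # Hypothetical layout: shard 0 has its primary on node-a and a replica on node-b.
    selector = CopySelector({0: ["node-a", "node-b"]})
    print([selector.next_node(0) for _ in range(4)])
```

Successive reads of the same shard alternate between its copies, which spreads read load across the primary and the replicas.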

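Elasticsearch search flow

The client picks a node according to its configuration and sends it the search request. Whichever node receives the request, master or data, acts as the coordinating node for that request.
The coordinating node forwards the request to one copy of every shard, either the primary shard or one of its replica shards.
query phase: each shard returns its search results (doc ids) to the coordinating node, which merges and sorts all of them to produce the final result.
fetch phase: using the doc ids in the final result, the coordinating node fetches the document data from the corresponding nodes and returns it to the client.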
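
The two phases above can be sketched as two small functions. This is an illustration only: it assumes each shard reports hits as (score, doc_id) pairs and that documents can be looked up in a plain dictionary, and the function names query_phase and fetch_phase are hypothetical.

```python
import heapq


def query_phase(shard_hits, size):
    """Merge per-shard (score, doc_id) hits into the global top `size` hits."""
    all_hits = [hit for hits in shard_hits.values() for hit in hits]
    # Highest score first; only IDs and scores travel in this phase.
    return heapq.nlargest(size, all_hits, key=lambda hit: hit[0])


def fetch_phase(top_hits, doc_store):
    """Load the full documents for the hits selected in the query phase."""
    return [doc_store[doc_id] for _score, doc_id in top_hits]


if __name__ == "__main__":
    hits = {0: [(0.9, "a"), (0.2, "b")], 1: [(0.7, "c")]}
    store = {"a": {"msg": "first"}, "b": {"msg": "second"}, "c": {"msg": "third"}}
    top = query_phase(hits, size=2)
    print(fetch_phase(top, store))
```

Splitting the work this way means full documents are fetched only for the hits that survive the merge, not for every match on every shard.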

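Elasticsearch cluster configuration

All nodes in a cluster share the same cluster.name. Elasticsearch can discover nodes via multicast, and Elasticsearch instances with the same cluster.name automatically form a cluster. To speed up discovery and avoid problems caused by changes in network topology, however, the other nodes are usually listed explicitly on the master node with discovery.zen.ping.unicast.hosts.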

$> vim /usr/local/elasticsearch/config/elasticsearch.yml

path.data: /data
path.logs: /var/wwwlog/elasticsearch
path.plugins: /usr/local/elasticsearch/plugins
network.host: 0.0.0.0
http.port: 9200
bootstrap.mlockall: true
indices.fielddata.cache.size: 75%
indices.breaker.fielddata.limit: 85%
threadpool.search.queue_size: 10000

# Cluster configuration: master node
cluster.name: elasticsearch_cluster_mogl
node.name: "master_node_10.0.6.6"
node.master: true
node.data: true
discovery.zen.ping.multicast.enabled: true
discovery.zen.ping.unicast.hosts: ["10.0.6.5", "10.0.1.155"]
cluster.routing.allocation.disk.threshold_enabled: false
cluster.routing.allocation.disk.watermark.low: 90%
cluster.routing.allocation.disk.watermark.high: 95%

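Starting & stopping Elasticsearch

Starting

/usr/local/elasticsearch/bin/elasticsearch -d

Stopping
A node can be stopped with the kill command or through the ES API. For a cluster (more than one node), stopping through the API is preferable.

$> kill <PID>

# Stop the cluster through the ES API (the _shutdown endpoint exists in ES 1.x; it was removed in ES 2.0)

# Shut down all nodes
$> curl -XPOST 'http://localhost:9200/_cluster/nodes/_shutdown'

# Shut down the local node
$> curl -XPOST 'http://localhost:9200/_cluster/nodes/_local/_shutdown'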

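Cluster restart
When Elasticsearch needs an upgrade or a configuration change, the cluster has to be restarted. To keep the whole cluster serving requests throughout, do a rolling restart, restarting one node at a time. Restarting nodes one by one, however, makes Elasticsearch reallocate shards, which puts heavy pressure on IO and bandwidth and makes the restart take a very long time (shards have to be recomputed, and so on). Automatic shard allocation should therefore be disabled before a rolling restart.
1. Disable automatic shard allocation
  curl -XPUT 'http://localhost:9200/_cluster/settings' -d '{
  "transient" : {
  "cluster.routing.allocation.enable" : "none"
  }
  }'
2. On each node of the cluster in turn: stop Elasticsearch, change the configuration or upgrade Elasticsearch, start Elasticsearch
3. Once all nodes have been reconfigured and started successfully, remember to re-enable automatic shard allocation
  curl -XPUT 'http://localhost:9200/_cluster/settings' -d '{
  "transient" : {
  "cluster.routing.allocation.enable" : "all"
  }
  }'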

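Cluster recovery tuning
If the whole cluster has been stopped for some reason and has to be started again, nodes joining one after another cause shards to be moved around repeatedly. Cluster recovery should be tuned to avoid this; see the linked document for details.

# Add to the configuration of every node (assuming 10 nodes)

# Recovery is allowed only after N nodes in the cluster have started
gateway.recover_after_nodes: 8

# Timeout for the initial recovery, counted from the moment the N nodes configured above have started
gateway.recover_after_time: 5m

# Number of nodes expected in the cluster. As soon as these N nodes have started (and recover_after_nodes is also satisfied), recovery begins immediately without waiting for recover_after_time
gateway.expected_nodes: 10

#minimum_master_nodes: 2

With the configuration above, the cluster waits until at least 8 nodes are online. Data recovery then starts either 5 minutes later or as soon as 10 nodes are online, whichever comes first.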
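
The interplay of the three gateway settings can be written out as a small decision function. This is a sketch of the rule as described in the paragraph above, not Elasticsearch's own code; the function name should_start_recovery and its parameter names are hypothetical, and the defaults mirror the example values (8 nodes, 5 minutes, 10 nodes).

```python
def should_start_recovery(nodes_online, seconds_since_threshold,
                          recover_after_nodes=8,
                          recover_after_seconds=300,
                          expected_nodes=10):
    """Decide whether recovery may begin.

    seconds_since_threshold is the time elapsed since recover_after_nodes
    nodes came online.
    """
    if nodes_online < recover_after_nodes:
        return False   # too few nodes: never recover yet
    if nodes_online >= expected_nodes:
        return True    # everyone is here: start immediately
    # In between: wait for the timeout to expire.
    return seconds_since_threshold >= recover_after_seconds


if __name__ == "__main__":
    print(should_start_recovery(7, 1000))  # False: below recover_after_nodes
    print(should_start_recovery(8, 60))    # False: still inside the 5 minutes
    print(should_start_recovery(8, 300))   # True: timeout reached
    print(should_start_recovery(10, 0))    # True: expected_nodes reached
```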

Fixing unassigned shards

When the cluster has unassigned shards, you can try to repair them so the cluster returns to a healthy state.
First look up the node's unique ID

curl 'http://localhost:9200/_nodes/process?pretty'

{
  "cluster_name" : "elk-cluster",
  "nodes" : {
    "bHimoXRrRSSob5D6txh5Ww" : {
      "name" : "192.168.241.66",
      "transport_address" : "10.255.2.66:9300",
      "host" : "10.255.2.66",
      "ip" : "10.255.2.66",
      "version" : "2.3.5",
      "build" : "90f439f",
...

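Then run the following script to attempt the repair

# Note: allow_primary also forces allocation of primary shards and may lose the data of a shard whose copies are all gone
for index in $(curl -s 'http://localhost:9200/_cat/shards' | grep UNASSIGNED | awk '{print $1}' | sort -u); do
    # Compare the index column exactly so that an index whose name merely contains $index is not matched
    for shard in $(curl -s 'http://localhost:9200/_cat/shards' | awk -v idx="$index" '/UNASSIGNED/ && $1 == idx {print $2}' | sort -u); do
        echo "$index $shard"
        curl -XPOST 'http://localhost:9200/_cluster/reroute' -d '{
            "commands" : [ {
                  "allocate" : {
                      "index" : "'"$index"'",
                      "shard" : "'"$shard"'",
                      "node" : "bHimoXRrRSSob5D6txh5Ww",
                      "allow_primary" : true
                  }
                }
            ]
        }'
        # Give the cluster a moment between reroute commands
        sleep 5
    done
done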
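
The selection step of the script (finding which index/shard pairs are unassigned) can also be done in Python, which makes it easy to check before sending any reroute command. This sketch assumes the default _cat/shards column order (index, shard, prirep, state, ...); the function name unassigned_shards and the sample output are made up for the example.

```python
def unassigned_shards(cat_shards_output):
    """Return sorted, de-duplicated (index, shard) pairs that are UNASSIGNED."""
    pairs = set()
    for line in cat_shards_output.splitlines():
        fields = line.split()
        # Assumed column order: index, shard, prirep, state, ...
        if len(fields) >= 4 and fields[3] == "UNASSIGNED":
            pairs.add((fields[0], int(fields[1])))
    return sorted(pairs)


if __name__ == "__main__":
    # Hypothetical output of: curl -s 'http://localhost:9200/_cat/shards'
    sample = """\
logs-2016.01.01 0 p STARTED 120 1.2mb 10.0.6.6 node-1
logs-2016.01.01 0 r UNASSIGNED
logs-2016.01.01 1 p UNASSIGNED
logs-2016.01.01 1 r UNASSIGNED
"""
    for index, shard in unassigned_shards(sample):
        print(index, shard)
```

Matching on the state column rather than grepping the whole line avoids false positives from index or node names that happen to contain the word UNASSIGNED.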

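Elasticsearch API

Checking cluster health
curl -XGET 'http://localhost:9200/_cluster/health?pretty'
Pay particular attention to status
- green: all primary and replica shards are available
- yellow: all primary shards are available, but some replica shards are not. A cluster with only one node is yellow (when indices have replicas configured), because with a single node the replica shards cannot be allocated.
- red: some primary shards are unavailable. On startup Elasticsearch checks the primary shards, which can take a long time when there is a lot of data. Once all primary shards have been checked and loaded without problems, the status changes automatically. It is a good idea to query the status while Elasticsearch is starting so you can follow how the cluster is coming up.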
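
The three status values follow directly from which shards are assigned. The rule can be sketched as below, assuming each shard is described by a hypothetical pair (is_primary, is_assigned); the function name cluster_status is made up for the example.

```python
def cluster_status(shards):
    """Derive green/yellow/red from (is_primary, is_assigned) pairs."""
    shards = list(shards)
    # Any missing primary makes the cluster red.
    if any(is_primary and not assigned for is_primary, assigned in shards):
        return "red"
    # All primaries are fine, but at least one replica is missing.
    if any(not assigned for _is_primary, assigned in shards):
        return "yellow"
    return "green"


if __name__ == "__main__":
    # Single node with one replica configured: primary assigned, replica not.
    print(cluster_status([(True, True), (False, False)]))
```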
Listing index sizes and states
Show the size and state (open/close) of every index
curl 'http://localhost:9200/_cat/indices?bytes=kb'

Deleting indices

$> curl -XDELETE 'http://localhost:9200/logstash-2015.*'

Inserting data

$> curl -XPUT 'http://localhost:9200/twitter/tweet/1' -d '{
	"user" : "kimchy",
	"post_date" : "2009-11-15T14:12:12",
	"message" : "trying out Elastic Search"
}'

Moving an index shard

curl -XPOST 127.0.0.1:9200/_cluster/reroute -d '{
    "commands": [{
        "move": {
            "index": "index-name-2016.12.29",
            "shard": 1,
            "from_node": "10.201.3.33",
            "to_node": "10.201.3.30"
        }
      }
    ]
}'

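Elasticsearch tuning configuration

ES_MIN_MEM
ES_MAX_MEM

Edit the file /usr/local/elasticsearch/bin/elasticsearch.in.sh
Within a 32G limit, make the heap as large as practical, and keep the two values identical.

indices.fielddata.cache.size: 75%
indices.breaker.fielddata.limit: 85%

indices.fielddata.cache.size
Sets the size of the fielddata cache. When sorting or aggregating on a field, the fields involved are loaded into memory for faster access. Loading fields into memory is very expensive, so the fielddata cache should be large enough for the results to stay cached. By default the ES cache size is unbounded; the reason to set a size is to keep the cached data from growing until it causes an OOM.
indices.breaker.fielddata.limit
Caps how much fielddata may be loaded, to guard against queries whose fielddata is too large. If the data to be loaded into memory is larger than indices.fielddata.cache.size but smaller than indices.breaker.fielddata.limit, ES accepts the query and caches the result. If it is larger than indices.breaker.fielddata.limit, ES rejects the query and throws an exception.

The relationship between indices.fielddata.cache.size and indices.breaker.fielddata.limit is similar to that between soft nofile and hard nofile.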
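
The soft-limit/hard-limit relationship can be made concrete with a toy decision function. This is only a model of the behaviour described above, with the limits expressed as percentages of the heap; the function name fielddata_decision and the three result labels are hypothetical, and the eviction of older cache entries in the middle case is an assumption of this sketch.

```python
def fielddata_decision(required_bytes, heap_bytes,
                       cache_percent=75, breaker_percent=85):
    """Classify a fielddata load against the cache size and the breaker limit."""
    # Integer arithmetic avoids floating-point surprises at the boundaries.
    if required_bytes * 100 > heap_bytes * breaker_percent:
        return "reject"            # hard limit: the circuit breaker trips
    if required_bytes * 100 > heap_bytes * cache_percent:
        return "accept_and_evict"  # soft limit: accepted, older entries must go (assumption)
    return "accept"


if __name__ == "__main__":
    heap = 1000  # hypothetical heap size in bytes, for illustration
    for needed in (500, 800, 900):
        print(needed, fielddata_decision(needed, heap))
```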

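threadpool.search.queue_size: 10000

Controls the size of the queue of pending search requests. Increase it when Kibana needs to run many queries at once or queries over large amounts of data.

bootstrap.mlockall: true

Keeps the JVM from being written to swap, which would degrade ES performance.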

Curator

Use Curator to close/open/delete old indices.
Old indices are rarely used yet take up a lot of resources, so closing or deleting them helps optimize the Elasticsearch cluster.

Installation
pip install elasticsearch-curator

Usage

curator --timeout 36000 --host localhost close indices --time-unit days --timestring '%Y.%m.%d' --prefix test1-
curator --timeout 36000 --host localhost close indices --older-than 30 --time-unit days --timestring '%Y.%m.%d' --prefix test2_

curator --timeout 36000 --host localhost optimize --max_num_segments 1 indices --older-than 1 --newer-than 7 --time-unit days --timestring '%Y.%m.%d'

The first command closes all indices with the prefix test1-
The second closes indices with the prefix test2_ that are older than 30 days
The third optimizes indices older than 1 day and newer than 7 days down to 1 segment each
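
The kind of selection done by --prefix, --timestring and --older-than can be sketched in Python. This is not Curator's code: it assumes the index name is the prefix followed directly by the date, and that "older than N days" means strictly before today minus N days; the function name indices_older_than is hypothetical.

```python
from datetime import date, datetime, timedelta


def indices_older_than(index_names, prefix, days, today,
                       timestring="%Y.%m.%d"):
    """Return the indices with `prefix` whose date suffix is older than `days`."""
    cutoff = today - timedelta(days=days)
    selected = []
    for name in index_names:
        if not name.startswith(prefix):
            continue
        try:
            index_day = datetime.strptime(name[len(prefix):], timestring).date()
        except ValueError:
            continue  # suffix is not a date in the expected format
        if index_day < cutoff:  # assumption: strictly older than the cutoff
            selected.append(name)
    return selected


if __name__ == "__main__":
    names = ["test2_2015.12.01", "test2_2016.01.30", "test1-2015.12.01"]
    print(indices_older_than(names, "test2_", 30, today=date(2016, 1, 31)))
```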

Elasticdump

Elasticdump is used to migrate and back up data in Elasticsearch.

Installing NodeJS
Elasticdump is written in NodeJS, so it depends on a NodeJS environment, which has to be installed first.

curl --silent --location https://rpm.nodesource.com/setup_4.x | bash -
yum install -y gcc-c++ make nodejs

export NODE_PATH=/usr/lib/node_modules

Installing Elasticdump
npm install elasticdump@2.1.0 -g

Usage

Backup
Back up to a file and compress it

elasticdump --input=http://localhost:9200/ --output=$ | gzip > /data/elasticsearch_json.gz

Exporting data
An Elasticsearch export has to be split into two parts, mapping and data.

elasticdump --input=http://127.0.0.1:9200/.kibana --output=/kibana_mapping.json --type=mapping
elasticdump --input=http://127.0.0.1:9200/.kibana --output=/kibana_data.json --type=data

Importing data

elasticdump --input=kibana_mapping.json --output=http://127.0.0.1:9200/.kibana --type=mapping
elasticdump --input=kibana_data.json --output=http://127.0.0.1:9200/.kibana --type=data

About SSL certificates
When Elasticsearch has to be accessed over https and the SSL certificate is self-signed, elasticdump warns and fails. This comes from how NodeJS treats self-signed certificates; the check can be disabled
export NODE_TLS_REJECT_UNAUTHORIZED=0

EFK/ELK