Elasticsearch File Storage

How are the files of an Elasticsearch index stored on disk?
The main goal here is to find out at what granularity the FST files are created.

First, use Kibana to pick a shard of some index. Here we use the index logstash-2023.05.30 as the example.

Check how its shards are distributed:

GET /_cat/shards/logstash-2023.05.30?v


index               shard prirep state   docs    store   ip             node
logstash-2023.05.30 3     p      STARTED 1520736 408.1mb 10.138.40.73   10.138.40.73-node1
logstash-2023.05.30 5     p      STARTED 1520888 409.9mb 10.138.40.74   10.138.40.74-node1
logstash-2023.05.30 6     p      STARTED 1518331 408.2mb 10.138.40.221  10.138.40.221-node1
logstash-2023.05.30 4     p      STARTED 1518186 409.3mb 10.138.204.194 10.138.204.194-node1
logstash-2023.05.30 1     p      STARTED 1519231 408.8mb 10.138.40.220  10.138.40.220-node1
logstash-2023.05.30 2     p      STARTED 1519970 409.9mb 10.138.204.195 10.138.204.195-node1
logstash-2023.05.30 0     p      STARTED 1520024 410.6mb 10.138.204.193 10.138.204.193-node1

We analyze shard 0, which lives on 10.138.204.193.

To locate the storage directory, we first need the index ID (UUID):

GET /logstash-2023.05.30/_settings

{
  "logstash-2023.05.30" : {
    "settings" : {
      "index" : {
        "codec" : "best_compression",
        "routing" : {
          "allocation" : {
            "include" : {
              "_tier_preference" : "data_content"
            }
          }
        },
        "refresh_interval" : "60s",
        "number_of_shards" : "7",
        "provided_name" : "logstash-2023.05.30",
        "creation_date" : "1685376005206",
        "number_of_replicas" : "0",
        "uuid" : "FYWtFGTIS2CLB8yJhFXG9g", // this is the index ID
        "version" : {
          "created" : "7130499"
        }
      }
    }
  }
}

Log in to the machine and find the directory that stores the index files:

/data3/10.138.204.193-node1/nodes/0/indices/FYWtFGTIS2CLB8yJhFXG9g
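The path above follows a fixed pattern on this cluster: data path, then nodes/0/indices, then the index UUID, with one subdirectory per shard that contains the Lucene index directory. A minimal Python sketch of that pattern is shown below. The helper name shard_index_dir and the node_ordinal parameter are made up for illustration, and the layout is taken only from the 7.13.4 node inspected in this post, so other versions may differ.

```python
from pathlib import PurePosixPath


def shard_index_dir(data_path: str, index_uuid: str, shard_id: int,
                    node_ordinal: int = 0) -> PurePosixPath:
    """Build the Lucene index directory of one shard.

    Assumes the on-disk layout observed in this post:
    <data_path>/nodes/<ordinal>/indices/<index_uuid>/<shard_id>/index
    """
    return (PurePosixPath(data_path) / "nodes" / str(node_ordinal)
            / "indices" / index_uuid / str(shard_id) / "index")


# Values copied from the example in this post.
print(shard_index_dir("/data3/10.138.204.193-node1",
                      "FYWtFGTIS2CLB8yJhFXG9g", 0))
```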

List the files under that directory:

root@prd-paas-es-01:/data3/10.138.204.193-node1/nodes/0/indices/FYWtFGTIS2CLB8yJhFXG9g# tree -C -s
.
├── [       4096]  0
│   ├── [      20480]  index
│   │   ├── [        158]  _17f.fdm
│   │   ├── [   25578562]  _17f.fdt
│   │   ├── [       1939]  _17f.fdx
│   │   ├── [       4636]  _17f.fnm
│   │   ├── [    7981735]  _17f.kdd
│   │   ├── [      20898]  _17f.kdi
│   │   ├── [        716]  _17f.kdm
│   │   ├── [    7945983]  _17f_Lucene80_0.dvd
│   │   ├── [       3916]  _17f_Lucene80_0.dvm
│   │   ├── [    6230127]  _17f_Lucene84_0.doc
│   │   ├── [    3875001]  _17f_Lucene84_0.pos
│   │   ├── [    7448815]  _17f_Lucene84_0.tim
│   │   ├── [     108786]  _17f_Lucene84_0.tip
│   │   ├── [       1637]  _17f_Lucene84_0.tmd
│   │   ├── [        593]  _17f.si
│   │   ├── [        158]  _3uv.fdm
│   │   ├── [   33652243]  _3uv.fdt
│   │   ├── [       2555]  _3uv.fdx
│   │   ├── [       4636]  _3uv.fnm
│   │   ├── [   10520395]  _3uv.kdd
│   │   ├── [      27689]  _3uv.kdi
│   │   ├── [        716]  _3uv.kdm
│   │   ├── [   10573208]  _3uv_Lucene80_0.dvd
│   │   ├── [       3916]  _3uv_Lucene80_0.dvm
│   │   ├── [    8298061]  _3uv_Lucene84_0.doc
│   │   ├── [    5154427]  _3uv_Lucene84_0.pos
│   │   ├── [    9716222]  _3uv_Lucene84_0.tim
│   │   ├── [     142063]  _3uv_Lucene84_0.tip
│   │   ├── [       1620]  _3uv_Lucene84_0.tmd
│   │   ├── [        593]  _3uv.si
│   │   ├── [        158]  _5bg.fdm
│   │   ├── [   16433011]  _5bg.fdt
│   │   ├── [       1259]  _5bg.fdx
│   │   ├── [       4636]  _5bg.fnm
│   │   ├── [    5158094]  _5bg.kdd
│   │   ├── [      13396]  _5bg.kdi
│   │   ├── [        716]  _5bg.kdm
│   │   ├── [    5140762]  _5bg_Lucene80_0.dvd
│   │   ├── [       3916]  _5bg_Lucene80_0.dvm
│   │   ├── [    4005897]  _5bg_Lucene84_0.doc
│   │   ├── [    2583880]  _5bg_Lucene84_0.pos
│   │   ├── [    4873082]  _5bg_Lucene84_0.tim
│   │   ├── [      70979]  _5bg_Lucene84_0.tip
│   │   ├── [       1593]  _5bg_Lucene84_0.tmd
│   │   ├── [        593]  _5bg.si
│   │   ├── [        158]  _60h.fdm
│   │   ├── [   24664753]  _60h.fdt
│   │   ├── [       1886]  _60h.fdx
│   │   ├── [       4636]  _60h.fnm
│   │   ├── [    7640438]  _60h.kdd
│   │   ├── [      19996]  _60h.kdi
│   │   ├── [        716]  _60h.kdm
│   │   ├── [    7754954]  _60h_Lucene80_0.dvd
│   │   ├── [       3916]  _60h_Lucene80_0.dvm
│   │   ├── [    6147241]  _60h_Lucene84_0.doc
│   │   ├── [    3998559]  _60h_Lucene84_0.pos
│   │   ├── [    7254035]  _60h_Lucene84_0.tim
│   │   ├── [     105673]  _60h_Lucene84_0.tip
│   │   ├── [       1719]  _60h_Lucene84_0.tmd
│   │   ├── [        593]  _60h.si
│   │   ├── [        200]  _7jq.fdm
│   │   ├── [   63208093]  _7jq.fdt
│   │   ├── [       4692]  _7jq.fdx
│   │   ├── [       4636]  _7jq.fnm
│   │   ├── [   19306117]  _7jq.kdd
│   │   ├── [      51562]  _7jq.kdi
│   │   ├── [        716]  _7jq.kdm
│   │   ├── [   20228561]  _7jq_Lucene80_0.dvd
│   │   ├── [       3916]  _7jq_Lucene80_0.dvm
│   │   ├── [   15606568]  _7jq_Lucene84_0.doc
│   │   ├── [    9581341]  _7jq_Lucene84_0.pos
│   │   ├── [   17383473]  _7jq_Lucene84_0.tim
│   │   ├── [     272615]  _7jq_Lucene84_0.tip
│   │   ├── [       1592]  _7jq_Lucene84_0.tmd
│   │   ├── [        593]  _7jq.si
│   │   ├── [        437]  _82w.cfe
│   │   ├── [    4489379]  _82w.cfs
│   │   ├── [        408]  _82w.si
│   │   ├── [        437]  _87w.cfe
│   │   ├── [    4932636]  _87w.cfs
│   │   ├── [        408]  _87w.si
│   │   ├── [        437]  _8ao.cfe
│   │   ├── [   13905317]  _8ao.cfs
│   │   ├── [        408]  _8ao.si
│   │   ├── [        437]  _8ls.cfe
│   │   ├── [   20181047]  _8ls.cfs
│   │   ├── [        408]  _8ls.si
│   │   ├── [        437]  _8nq.cfe
│   │   ├── [    1234712]  _8nq.cfs
│   │   ├── [        408]  _8nq.si
│   │   ├── [        437]  _8oa.cfe
│   │   ├── [     872798]  _8oa.cfs
│   │   ├── [        408]  _8oa.si
│   │   ├── [        437]  _8pp.cfe
│   │   ├── [    1593677]  _8pp.cfs
│   │   ├── [        408]  _8pp.si
│   │   ├── [        437]  _8r5.cfe
│   │   ├── [     914008]  _8r5.cfs
│   │   ├── [        408]  _8r5.si
│   │   ├── [        437]  _8rf.cfe
│   │   ├── [     940473]  _8rf.cfs
│   │   ├── [        408]  _8rf.si
│   │   ├── [        437]  _8rz.cfe
│   │   ├── [    1315312]  _8rz.cfs
│   │   ├── [        408]  _8rz.si
│   │   ├── [        437]  _8s9.cfe
│   │   ├── [    1121692]  _8s9.cfs
│   │   ├── [        408]  _8s9.si
│   │   ├── [        437]  _8sk.cfe
│   │   ├── [     243476]  _8sk.cfs
│   │   ├── [        408]  _8sk.si
│   │   ├── [       1678]  segments_6
│   │   └── [          0]  write.lock
│   ├── [       4096]  _state
│   │   ├── [        186]  retention-leases-2865.st
│   │   └── [        125]  state-0.st
│   └── [       4096]  translog
│       ├── [         55]  translog-29.tlog
│       └── [         88]  translog.ckp
└── [       4096]  _state
    └── [       1230]  state-2.st

5 directories, 118 files
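Every per-segment file in the listing starts with the segment name (_17f, _82w, and so on), optionally followed by a codec suffix such as _Lucene84_0, and then the extension. As a rough sketch, the file names can be grouped by that prefix to see which segments are stored as separate files and which as compound files. The regular expression and the function name group_by_segment are my own illustration, not part of any Elasticsearch or Lucene tooling, and the sample list is a small subset of the listing above.

```python
import re
from collections import defaultdict

# "_17f.fdt" -> ("_17f", "fdt"); "_17f_Lucene84_0.tip" -> ("_17f", "tip")
SEGMENT_FILE = re.compile(r"^(_[0-9a-z]+)(?:_[^.]+)?\.(\w+)$")


def group_by_segment(filenames):
    """Map each segment name to the set of file extensions it owns."""
    groups = defaultdict(set)
    for name in filenames:
        match = SEGMENT_FILE.match(name)
        if match:  # skips segments_N and write.lock
            groups[match.group(1)].add(match.group(2))
    return dict(groups)


sample = [
    "_17f.fdt", "_17f_Lucene84_0.tim", "_17f_Lucene84_0.tip", "_17f.si",
    "_82w.cfe", "_82w.cfs", "_82w.si",
    "segments_6", "write.lock",
]

for segment, extensions in group_by_segment(sample).items():
    kind = "compound" if "cfs" in extensions else "separate files"
    print(segment, kind, sorted(extensions))
```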

With the file listing in hand, let's look at the segment information:

GET /logstash-2023.05.30/_segments

// For readability, only the segments of shard 0 are shown here
{
	"_shards": {
		"total": 7,
		"successful": 7,
		"failed": 0
	},
	"indices": {
		"logstash-2023.05.30": {
			"shards": {
				"0": [
					{
						"routing": {
							"state": "STARTED",
							"primary": true,
							"node": "4hEWcF8hRFWTEkQxlKQmqg"
						},
						"num_committed_segments": 17,
						"num_search_segments": 17,
						"segments": {
							"_17f": {
								"generation": 1563,
								"num_docs": 210331,
								"deleted_docs": 0,
								"size_in_bytes": 59203502,
								"memory_in_bytes": 5140,
								"committed": true,
								"search": true,
								"version": "8.8.2",
								"compound": false,
								"attributes": {
									"Lucene87StoredFieldsFormat.mode": "BEST_COMPRESSION"
								}
							},
							"_3uv": {
								"generation": 4999,
								"num_docs": 278411,
								"deleted_docs": 0,
								"size_in_bytes": 78098502,
								"memory_in_bytes": 5140,
								"committed": true,
								"search": true,
								"version": "8.8.2",
								"compound": false,
								"attributes": {
									"Lucene87StoredFieldsFormat.mode": "BEST_COMPRESSION"
								}
							},
							"_5bg": {
								"generation": 6892,
								"num_docs": 132645,
								"deleted_docs": 0,
								"size_in_bytes": 38291972,
								"memory_in_bytes": 5140,
								"committed": true,
								"search": true,
								"version": "8.8.2",
								"compound": false,
								"attributes": {
									"Lucene87StoredFieldsFormat.mode": "BEST_COMPRESSION"
								}
							},
							"_60h": {
								"generation": 7793,
								"num_docs": 199809,
								"deleted_docs": 0,
								"size_in_bytes": 57599273,
								"memory_in_bytes": 5140,
								"committed": true,
								"search": true,
								"version": "8.8.2",
								"compound": false,
								"attributes": {
									"Lucene87StoredFieldsFormat.mode": "BEST_COMPRESSION"
								}
							},
							"_7jq": {
								"generation": 9782,
								"num_docs": 520420,
								"deleted_docs": 0,
								"size_in_bytes": 145654675,
								"memory_in_bytes": 5204,
								"committed": true,
								"search": true,
								"version": "8.8.2",
								"compound": false,
								"attributes": {
									"Lucene87StoredFieldsFormat.mode": "BEST_COMPRESSION"
								}
							},
							"_82w": {
								"generation": 10472,
								"num_docs": 15416,
								"deleted_docs": 0,
								"size_in_bytes": 4490224,
								"memory_in_bytes": 5140,
								"committed": true,
								"search": true,
								"version": "8.8.2",
								"compound": true,
								"attributes": {
									"Lucene87StoredFieldsFormat.mode": "BEST_COMPRESSION"
								}
							},
							"_87w": {
								"generation": 10652,
								"num_docs": 16837,
								"deleted_docs": 0,
								"size_in_bytes": 4933481,
								"memory_in_bytes": 5140,
								"committed": true,
								"search": true,
								"version": "8.8.2",
								"compound": true,
								"attributes": {
									"Lucene87StoredFieldsFormat.mode": "BEST_COMPRESSION"
								}
							},
							"_8ao": {
								"generation": 10752,
								"num_docs": 48855,
								"deleted_docs": 0,
								"size_in_bytes": 13906162,
								"memory_in_bytes": 5140,
								"committed": true,
								"search": true,
								"version": "8.8.2",
								"compound": true,
								"attributes": {
									"Lucene87StoredFieldsFormat.mode": "BEST_COMPRESSION"
								}
							},
							"_8ls": {
								"generation": 11152,
								"num_docs": 70903,
								"deleted_docs": 0,
								"size_in_bytes": 20181892,
								"memory_in_bytes": 5140,
								"committed": true,
								"search": true,
								"version": "8.8.2",
								"compound": true,
								"attributes": {
									"Lucene87StoredFieldsFormat.mode": "BEST_COMPRESSION"
								}
							},
							"_8nq": {
								"generation": 11222,
								"num_docs": 3954,
								"deleted_docs": 0,
								"size_in_bytes": 1235557,
								"memory_in_bytes": 6924,
								"committed": true,
								"search": true,
								"version": "8.8.2",
								"compound": true,
								"attributes": {
									"Lucene87StoredFieldsFormat.mode": "BEST_COMPRESSION"
								}
							},
							"_8oa": {
								"generation": 11242,
								"num_docs": 2785,
								"deleted_docs": 0,
								"size_in_bytes": 873643,
								"memory_in_bytes": 6820,
								"committed": true,
								"search": true,
								"version": "8.8.2",
								"compound": true,
								"attributes": {
									"Lucene87StoredFieldsFormat.mode": "BEST_COMPRESSION"
								}
							},
							"_8pp": {
								"generation": 11293,
								"num_docs": 5194,
								"deleted_docs": 0,
								"size_in_bytes": 1594522,
								"memory_in_bytes": 7060,
								"committed": true,
								"search": true,
								"version": "8.8.2",
								"compound": true,
								"attributes": {
									"Lucene87StoredFieldsFormat.mode": "BEST_COMPRESSION"
								}
							},
							"_8r5": {
								"generation": 11345,
								"num_docs": 2936,
								"deleted_docs": 0,
								"size_in_bytes": 914853,
								"memory_in_bytes": 6748,
								"committed": true,
								"search": true,
								"version": "8.8.2",
								"compound": true,
								"attributes": {
									"Lucene87StoredFieldsFormat.mode": "BEST_COMPRESSION"
								}
							},
							"_8rf": {
								"generation": 11355,
								"num_docs": 2920,
								"deleted_docs": 0,
								"size_in_bytes": 941318,
								"memory_in_bytes": 6836,
								"committed": true,
								"search": true,
								"version": "8.8.2",
								"compound": true,
								"attributes": {
									"Lucene87StoredFieldsFormat.mode": "BEST_COMPRESSION"
								}
							},
							"_8rz": {
								"generation": 11375,
								"num_docs": 4304,
								"deleted_docs": 0,
								"size_in_bytes": 1316157,
								"memory_in_bytes": 6820,
								"committed": true,
								"search": true,
								"version": "8.8.2",
								"compound": true,
								"attributes": {
									"Lucene87StoredFieldsFormat.mode": "BEST_COMPRESSION"
								}
							},
							"_8s9": {
								"generation": 11385,
								"num_docs": 3647,
								"deleted_docs": 0,
								"size_in_bytes": 1122537,
								"memory_in_bytes": 6892,
								"committed": true,
								"search": true,
								"version": "8.8.2",
								"compound": true,
								"attributes": {
									"Lucene87StoredFieldsFormat.mode": "BEST_COMPRESSION"
								}
							},
							"_8sk": {
								"generation": 11396,
								"num_docs": 657,
								"deleted_docs": 0,
								"size_in_bytes": 244321,
								"memory_in_bytes": 7620,
								"committed": true,
								"search": true,
								"version": "8.8.2",
								"compound": true,
								"attributes": {
									"Lucene87StoredFieldsFormat.mode": "BEST_COMPRESSION"
								}
							}
						}
					}
				]
			}
		}
	}
}

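Comparing the segment list with the files in the shard directory shows a one-to-one correspondence: each of the 17 segments maps to the group of files that share its name as a prefix. The five segments with "compound": false (_17f, _3uv, _5bg, _60h, _7jq) are each stored as 15 separate files, while the twelve segments with "compound": true are each packed into a .cfs/.cfe pair plus a .si file.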
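The segment names and the "generation" values in the _segments output are also directly related: the name is the generation number written in base 36 with a leading underscore. The short Python check below illustrates this with values copied from the output above. The function name segment_generation is just a label chosen for this sketch.

```python
def segment_generation(segment_name: str) -> int:
    """Decode a Lucene segment name such as "_17f" into its generation."""
    return int(segment_name.lstrip("_"), 36)


# Generations reported by GET /logstash-2023.05.30/_segments for shard 0.
reported = {"_17f": 1563, "_3uv": 4999, "_8sk": 11396}

for name, generation in reported.items():
    print(name, segment_generation(name), segment_generation(name) == generation)
```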

Next, check the Elasticsearch version and the corresponding Lucene version:

GET /

{
  "name" : "10.138.204.193-node1",
  "cluster_name" : "elasticsearch",
  "cluster_uuid" : "XWDyVuo6TgK4yUp2XWD3lw",
  "version" : {
    "number" : "7.13.4",
    "build_flavor" : "default",
    "build_type" : "docker",
    "build_hash" : "c5f60e894ca0c61cdbae4f5a686d9f08bcefc942",
    "build_date" : "2021-07-14T18:33:36.673943207Z",
    "build_snapshot" : false,
    "lucene_version" : "8.8.2",
    "minimum_wire_compatibility_version" : "6.8.0",
    "minimum_index_compatibility_version" : "6.0.0-beta1"
  },
  "tagline" : "You Know, for Search"
}

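So what do the files with the various extensions in the shard directory actually mean? Let's take a look.

[Screenshot: the file-extension summary table from the Lucene 8.8.2 codec documentation]

Screenshot source: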
https://lucene.apache.org/core/8_8_2/core/org/apache/lucene/codecs/lucene87/package-summary.html#package.description

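The table shows that the FST-related file extensions are .tip (the term index, which holds the FST) and .tim (the term dictionary that the term index points into). Both file names carry the segment name as a prefix, for example _17f_Lucene84_0.tip, so the FST is created per segment (within a segment, the term index is built per indexed field). For compound segments these files are packed inside the .cfs file rather than stored separately.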
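For quick reference, the extensions that appear in the directory listing earlier can be put into a small lookup table. The descriptions below are my own short summaries rather than quotations from the Lucene documentation, so treat them as a rough guide and check the linked page for the authoritative wording. The names EXTENSIONS and describe are made up for this sketch.

```python
# Short, informal summaries of the extensions seen in this shard directory.
EXTENSIONS = {
    "si": "segment info",
    "cfs": "compound file (data of a packed segment)",
    "cfe": "compound file entry table",
    "fnm": "field infos",
    "fdt": "stored field data",
    "fdx": "stored field index",
    "fdm": "stored field metadata",
    "tim": "term dictionary",
    "tip": "term index (FST)",
    "tmd": "term metadata",
    "doc": "postings: doc IDs and frequencies",
    "pos": "postings: positions",
    "dvd": "doc values data",
    "dvm": "doc values metadata",
    "kdd": "point values data",
    "kdi": "point values index",
    "kdm": "point values metadata",
}


def describe(filename: str) -> str:
    """Return a short description for a Lucene file name, by extension."""
    extension = filename.rsplit(".", 1)[-1]
    return EXTENSIONS.get(extension, "unknown")


for name in ("_17f_Lucene84_0.tip", "_17f_Lucene84_0.tim", "_82w.cfs"):
    print(name, "->", describe(name))
```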
