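An Analysis of the Docker Image Storage Format

Newer versions of Docker store images in a fairly convoluted way: there are many IDs and directories, and the layout is not intuitive. This article analyzes in some detail how images are stored locally and in a registry. The tests use Docker 20.10.9 with the overlay2 storage driver.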

root@ubuntu:/home# docker pull ubuntu
Using default tag: latest
latest: Pulling from library/ubuntu
a39c84e173f0: Pull complete 
Digest: sha256:626ffe58f6e7566e00254b638eb7e0f3b11d4da9675088f4781a50ae288f3322
Status: Downloaded newer image for ubuntu:latest
docker.io/library/ubuntu:latest

root@ubuntu:/home# docker history ubuntu
IMAGE          CREATED        CREATED BY                                      SIZE      COMMENT
d5ca7a445605   34 hours ago   /bin/sh -c #(nop)  CMD ["bash"]                 0B        
<missing>      34 hours ago   /bin/sh -c #(nop) ADD file:ff4909f2124325dac…   65.6MB   

root@ubuntu:/home# docker image inspect ubuntu
   "GraphDriver": {
            "Data": {
                "MergedDir": "/var/lib/docker/overlay2/492c0894bbb23c44e42da84b6eeefb75a92895ff84b5360257e02cd8e1fa2bae/merged",
                "UpperDir": "/var/lib/docker/overlay2/492c0894bbb23c44e42da84b6eeefb75a92895ff84b5360257e02cd8e1fa2bae/diff",
                "WorkDir": "/var/lib/docker/overlay2/492c0894bbb23c44e42da84b6eeefb75a92895ff84b5360257e02cd8e1fa2bae/work"
            },
            "Name": "overlay2"
        },
        "RootFS": {
            "Type": "layers",
            "Layers": [
                "sha256:350f36b271dee3d47478fbcd72b98fed5bbcc369632f2d115c3cb62d784edaec"
            ]
        },

To make the layer hierarchy easier to analyze, we add two files on top of ubuntu and build a new image, myubuntu, with docker build -t myubuntu .

## myubuntu.Dockerfile ##
FROM ubuntu:latest
ADD one.txt /tmp/
ADD two.txt /home/

Overview

Image metadata is stored in /var/lib/docker/image/overlay2

root@ubuntu:/var/lib/docker/image/overlay2# ls -ls
total 16
4 drwx------ 4 root root 4096 Oct 17 10:55 distribution
4 drwx------ 4 root root 4096 Oct 17 10:21 imagedb
4 drwx------ 5 root root 4096 Oct 19 15:33 layerdb
4 -rw------- 1 root root  379 Oct 19 15:43 repositories.json

root@ubuntu:/var/lib/docker/image/overlay2# cat repositories.json |jq .
{
  "Repositories": {
    "myubuntu": {
      "myubuntu:latest": "sha256:0a95d0866cb574d513d22e62014898de8c0108df9c1ec25a75a0dc2cbbaed16f"
    },
    "ubuntu": {
      "ubuntu:latest": "sha256:d5ca7a4456053674d490803005766890dd19e3f7e789a48737c0d462da531f5d",
      "ubuntu@sha256:626ffe58f6e7566e00254b638eb7e0f3b11d4da9675088f4781a50ae288f3322": "sha256:d5ca7a4456053674d490803005766890dd19e3f7e789a48737c0d462da531f5d"
    }
  }
}

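  • distribution is used when talking to a registry; docker pull and docker push use the metadata stored in it.
  • imagedb stores image metadata.
  • layerdb stores layer metadata.
  • repositories.json records each image's repository and tag information. ubuntu has two entries; the ubuntu@sha256 entry is keyed by the digest of the manifest file in the registry. myubuntu is a new image that adds two layers on top of ubuntu; because it has not been pushed to a registry, it has only one entry.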
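
To make the two kinds of entries concrete, here is a minimal Python sketch that splits the references in repositories.json into tag entries and digest entries. The helper name split_references is ours, not part of Docker, and SAMPLE is a copy of the file shown above.

```python
import json

def split_references(repositories_json):
    """Split repositories.json entries into (tags, digests) dicts.

    Hypothetical helper for illustration; both dicts map a reference
    string to the image ID it points at.
    """
    tags, digests = {}, {}
    for refs in json.loads(repositories_json)["Repositories"].values():
        for ref, image_id in refs.items():
            # "name@sha256:..." is a digest reference; anything else is a tag.
            target = digests if "@" in ref else tags
            target[ref] = image_id
    return tags, digests

SAMPLE = """
{
  "Repositories": {
    "myubuntu": {
      "myubuntu:latest": "sha256:0a95d0866cb574d513d22e62014898de8c0108df9c1ec25a75a0dc2cbbaed16f"
    },
    "ubuntu": {
      "ubuntu:latest": "sha256:d5ca7a4456053674d490803005766890dd19e3f7e789a48737c0d462da531f5d",
      "ubuntu@sha256:626ffe58f6e7566e00254b638eb7e0f3b11d4da9675088f4781a50ae288f3322": "sha256:d5ca7a4456053674d490803005766890dd19e3f7e789a48737c0d462da531f5d"
    }
  }
}
"""

tags, digests = split_references(SAMPLE)
print(sorted(tags))    # the tag references
print(len(digests))    # number of digest references
```

Both ubuntu entries resolve to the same image ID, which is why a tag reference and a digest reference can coexist for one image.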

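The imagedb directory

The imagedb directory stores image metadata.

  • content mainly holds the JSON files with image metadata.
  • metadata stores the ImageID of each image's parent image. d5ca7a... has no parent image, so it has no entry under this directory.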
├── content
│   └── sha256
│       ├── 0a95d0866cb574d513d22e62014898de8c0108df9c1ec25a75a0dc2cbbaed16f
│       ├── 7b8fd4f52df342ac54019ffe8db2275f44e7c61f88130def5b5baa8ba85572b7
│       └── d5ca7a4456053674d490803005766890dd19e3f7e789a48737c0d462da531f5d
└── metadata
    └── sha256
        ├── 0a95d0866cb574d513d22e62014898de8c0108df9c1ec25a75a0dc2cbbaed16f
        │   ├── lastUpdated
        │   └── parent
        └── 7b8fd4f52df342ac54019ffe8db2275f44e7c61f88130def5b5baa8ba85572b7
            └── parent

Image Json

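  • Every image has a corresponding JSON file that describes its basic information: creation time, author, entrypoint, default arguments, network and volume configuration, and so on. The file also records the image's layers as diff_ids, along with the history of each layer.
  • The JSON file is immutable: changing it changes the ImageID, because the ImageID is sha256sum(ImageJson).
  • For example, the image ID of ubuntu:latest is d5ca..., and its JSON file is $DOCKER_DIR/image/overlay2/imagedb/content/sha256/d5ca7a4456053674d490803005766..., with the following content: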
  ...
  "created": "2021-10-16T01:47:45.87597179Z",
  "docker_version": "20.10.7",
  "history": [
    {
      "created": "2021-10-16T01:47:45.455040439Z",
      "created_by": "/bin/sh -c #(nop) ADD file:ff4909f2124325dac58d43c617132325934ed48a5ab4c534d05f931fcf700a2f in / "
    },
    {
      "created": "2021-10-16T01:47:45.87597179Z",
      "created_by": "/bin/sh -c #(nop)  CMD [\"bash\"]",
      "empty_layer": true
    }
  ],
  "os": "linux",
  "rootfs": {
    "type": "layers",
    "diff_ids": [
      "sha256:350f36b271dee3d47478fbcd72b98fed5bbcc369632f2d115c3cb62d784edaec"
    ]
  },
  "variant": "v8"
}

Image ID

The image ID is derived from the Image Json file: it is sha256sum(ImageJson).

root@ubuntu:/home# sha256sum /var/lib/docker/image/overlay2/imagedb/content/sha256/d5ca7a4456053674d490803005766890dd19e3f7e789a48737c0d462da531f5d 
d5ca7a4456053674d490803005766890dd19e3f7e789a48737c0d462da531f5d  /var/lib/docker/image/overlay2/imagedb/content/sha256/d5ca7a4456053674d490803005766890dd19e3f7e789a48737c0d462da531f5d
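
The same relationship can be sketched in Python. The function name image_id and the tiny config document below are made up for illustration; a real Image Json is much larger. The point is that the ID depends on the exact bytes, so even an extra space produces a different ImageID.

```python
import hashlib

def image_id(config_bytes):
    """Return the ImageID for the raw bytes of an Image Json file."""
    return "sha256:" + hashlib.sha256(config_bytes).hexdigest()

# Made-up stand-in for an Image Json file (not a real Docker config).
original = b'{"os":"linux","rootfs":{"type":"layers","diff_ids":[]}}'
# Semantically the same JSON, but with one extra space after a colon.
reformatted = original.replace(b'"os":', b'"os": ')

print(image_id(original))
print(image_id(original) == image_id(reformatted))
```

To check a real image, read the file under imagedb/content/sha256 in binary mode and pass its bytes to image_id; the result should equal the file name.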

The layerdb directory

Layer DiffID

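A layer's DiffID is the sha256sum of that layer's uncompressed tar file. Packing and unpacking must therefore be reproducible, otherwise the layer's DiffID will come out wrong (tar-split can be used to preserve the tar headers). In the example above the ubuntu image has a single layer, so its diff_ids list contains only "sha256:350f36b271dee3d47478fbcd72b98fed5bbcc369632f2d115c3cb62d784edaec" (the CMD entry in history is an empty layer and takes no disk space). The tar file of each layer can be obtained with docker save ubuntu -o ubuntu.tar: after extracting ubuntu.tar, each layer directory contains a packed layer.tar, which lets us verify how the DiffID is computed.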

root@ubuntu:/home/ubuntutar/3354e7edac2389f01e81eb3372e6c8239af07d389d703b0b08ea347b218ff6d6# sha256sum layer.tar 
350f36b271dee3d47478fbcd72b98fed5bbcc369632f2d115c3cb62d784edaec  layer.tar
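
The reproducibility requirement is easy to demonstrate with a small Python sketch. It builds two in-memory tar archives that contain the same file but differ in the recorded mtime, and shows that their DiffIDs differ. The helper names, the file name and the payload are illustrative assumptions; Docker itself relies on tar-split rather than on rebuilding archives like this.

```python
import hashlib
import io
import tarfile

def make_layer_tar(name, payload, mtime):
    """Build an uncompressed tar with one file and a fixed mtime."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.GNU_FORMAT) as tar:
        info = tarfile.TarInfo(name)
        info.size = len(payload)
        info.mtime = mtime          # header metadata is part of the hashed bytes
        tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()

def diff_id(layer_tar):
    """DiffID = sha256 over the uncompressed tar bytes."""
    return "sha256:" + hashlib.sha256(layer_tar).hexdigest()

same_a = diff_id(make_layer_tar("tmp/one.txt", b"one\n", mtime=0))
same_b = diff_id(make_layer_tar("tmp/one.txt", b"one\n", mtime=0))
other = diff_id(make_layer_tar("tmp/one.txt", b"one\n", mtime=1))

print(same_a == same_b)   # identical headers and content
print(same_a == other)    # same content, different header
```

Only byte-identical tar streams share a DiffID, which is why the tar headers have to be preserved when a layer is unpacked and later repacked.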

Layer ChainID

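A ChainID identifies a stack of image layers, and its value is computed from DiffIDs. The layer directories keyed by ChainID live under /var/lib/docker/image/overlay2/layerdb/sha256: each subdirectory name there is a ChainID, and its contents record the layer's place in the stack, such as the layer's parent and the cache-id of the directory that holds the actual layer data. layerdb itself stores only layer metadata, not the layer data.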

ChainID(L₀) =  DiffID(L₀)
ChainID(L₀|...|Lₙ₋₁|Lₙ) = Digest(ChainID(L₀|...|Lₙ₋₁) + " " + DiffID(Lₙ))

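That is, the ChainID of the bottom layer equals its DiffID, while the ChainID of any other layer is the digest (sha256) of its parent's ChainID, a space, and its own DiffID.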
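
The recurrence can be sketched in a few lines of Python. Digest is assumed here to be SHA-256 over the ASCII string, with the result prefixed by sha256:, and chain_ids is our own helper name. The input is the diff_ids list of the myubuntu image built earlier.

```python
import hashlib

def chain_ids(diff_ids):
    """Return the ChainID of every layer, bottom layer first."""
    chain = []
    for diff in diff_ids:
        if not chain:
            # ChainID(L0) = DiffID(L0)
            chain.append(diff)
        else:
            # ChainID(L0|...|Ln) = Digest(ChainID(L0|...|Ln-1) + " " + DiffID(Ln))
            data = (chain[-1] + " " + diff).encode("ascii")
            chain.append("sha256:" + hashlib.sha256(data).hexdigest())
    return chain

MYUBUNTU_DIFF_IDS = [
    "sha256:350f36b271dee3d47478fbcd72b98fed5bbcc369632f2d115c3cb62d784edaec",
    "sha256:7c7eb5781271639891432f506fce3b30b74c63f0b145ad7746a7e01284e4f7a2",
    "sha256:a58f164385b2d99773a41596a257920d90c6900c9f74d6a22a633d78f9c8424e",
]

for chain_id in chain_ids(MYUBUNTU_DIFF_IDS):
    print(chain_id)
```

If the assumption about Digest holds, the printed values should match the directory names under layerdb/sha256 for this image.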

root@ubuntu:/home/ssj/dockerfiles# docker images
REPOSITORY   TAG       IMAGE ID       CREATED         SIZE
myubuntu     latest    0a95d0866cb5   2 seconds ago   65.6MB
ubuntu       latest    d5ca7a445605   3 days ago      65.6MB

root@ubuntu:/var/lib/docker/image/overlay2/imagedb# docker history myubuntu
IMAGE          CREATED      CREATED BY                                      SIZE      COMMENT
0a95d0866cb5   4 days ago   /bin/sh -c #(nop) ADD file:8d08d699a25a24091…   4B        
7b8fd4f52df3   4 days ago   /bin/sh -c #(nop) ADD file:5ff37a8c444d9cf17…   4B        
d5ca7a445605   8 days ago   /bin/sh -c #(nop)  CMD ["bash"]                 0B        
<missing>      8 days ago   /bin/sh -c #(nop) ADD file:ff4909f2124325dac…   65.6MB 

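Looking at the Image Json file, myubuntu's diff_ids has two more layers than ubuntu's. The imagedb directory also contains two more image metadata files, corresponding to the two newly added layers (the intermediate image 7b8f... and the final image 0a95...).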

root@ubuntu:/var/lib/docker/image/overlay2/imagedb/content/sha256# cat 0a95d0866cb574d513d22e62014898de8c0108df9c1ec25a75a0dc2cbbaed16f
  "rootfs": {
    "type": "layers",
    "diff_ids": [
      "sha256:350f36b271dee3d47478fbcd72b98fed5bbcc369632f2d115c3cb62d784edaec",
      "sha256:7c7eb5781271639891432f506fce3b30b74c63f0b145ad7746a7e01284e4f7a2",
      "sha256:a58f164385b2d99773a41596a257920d90c6900c9f74d6a22a633d78f9c8424e"
    ]
  },
  
root@ubuntu:/var/lib/docker/image/overlay2/imagedb/content/sha256# ls -ls
total 12
4 -rw------- 1 root root 1927 Oct 19 15:43 0a95d0866cb574d513d22e62014898de8c0108df9c1ec25a75a0dc2cbbaed16f
4 -rw------- 1 root root 1689 Oct 19 15:43 7b8fd4f52df342ac54019ffe8db2275f44e7c61f88130def5b5baa8ba85572b7
4 -rw------- 1 root root 1475 Oct 17 11:40 d5ca7a4456053674d490803005766890dd19e3f7e789a48737c0d462da531f5d

Besides 350f..., which belongs to ubuntu, 7c7e... is the one.txt layer and a58f... is the two.txt layer.

root@ubuntu:/var/lib/docker/image/overlay2/layerdb/sha256# ls -ls
total 12
4 drwx------ 2 root root 4096 Oct 19 15:43 06e00d189a99510f2ee2bfc4b6eed7b4d119adc4514acbfc13efc16e6b482a3d
4 drwx------ 2 root root 4096 Oct 17 11:40 350f36b271dee3d47478fbcd72b98fed5bbcc369632f2d115c3cb62d784edaec
4 drwx------ 2 root root 4096 Oct 19 15:43 850bf45b4ce3aa79e125f8bf8142bc760506a854e8ac2c42b5fc343be8099097
ChainID(350f) = DiffID(350f) = 350f...
ChainID(7c7e) = Digest(ChainID(350f) + " " + DiffID(7c7e)) = 06e0...
ChainID(a58f) = Digest(ChainID(7c7e) + " " + DiffID(a58f)) = 850b...
# echo -n 'sha256:350f36b271dee3d47478fbcd72b98fed5bbcc369632f2d115c3cb62d784edaec sha256:7c7eb5781271639891432f506fce3b30b74c63f0b145ad7746a7e01284e4f7a2' > /tmp/chain1 
# sha256sum /tmp/chain1 
06e00d189a99510f2ee2bfc4b6eed7b4d119adc4514acbfc13efc16e6b482a3d  /tmp/chain1

# echo -n 'sha256:06e00d189a99510f2ee2bfc4b6eed7b4d119adc4514acbfc13efc16e6b482a3d sha256:a58f164385b2d99773a41596a257920d90c6900c9f74d6a22a633d78f9c8424e' > /tmp/chain2
# sha256sum /tmp/chain2 
850bf45b4ce3aa79e125f8bf8142bc760506a854e8ac2c42b5fc343be8099097  /tmp/chain2

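Now look inside the ChainID directories. Every layer except the ubuntu base layer has a parent file, whose value is the ChainID of the parent layer (for the layer directly above the base this equals the base layer's diff_id). diff holds the layer's diff_id. cache-id names the directory that actually stores the layer, /var/lib/docker/overlay2/{cache-id}. size is the size of the files this layer adds. tar-split.json.gz is the tar-split metadata used to repack this layer's tar (see https://github.com/vbatts/tar-split).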

root@ubuntu:/var/lib/docker/image/overlay2/layerdb/sha256# tree
.
├── 06e00d189a99510f2ee2bfc4b6eed7b4d119adc4514acbfc13efc16e6b482a3d
│   ├── cache-id
│   ├── diff
│   ├── parent
│   ├── size
│   └── tar-split.json.gz
├── 350f36b271dee3d47478fbcd72b98fed5bbcc369632f2d115c3cb62d784edaec
│   ├── cache-id
│   ├── diff
│   ├── size
│   └── tar-split.json.gz
└── 850bf45b4ce3aa79e125f8bf8142bc760506a854e8ac2c42b5fc343be8099097
    ├── cache-id
    ├── diff
    ├── parent
    ├── size
    └── tar-split.json.gz
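
As a rough illustration, the small text files of one layerdb entry can be read with a helper like the one below. read_layer_entry is a hypothetical name, and the demo runs against a throwaway directory with placeholder values (350f-example, 492c-example) instead of the real /var/lib/docker tree, which requires root to read.

```python
import tempfile
from pathlib import Path

def read_layer_entry(layer_dir):
    """Read the text files of one layerdb/sha256/<ChainID> directory."""
    entry = {}
    for name in ("cache-id", "diff", "parent", "size"):
        path = Path(layer_dir) / name
        # The base layer has no parent file, so missing files become None.
        entry[name] = path.read_text().strip() if path.is_file() else None
    if entry["size"] is not None:
        entry["size"] = int(entry["size"])
    return entry

# Demo on a temporary directory that mimics a base-layer entry.
with tempfile.TemporaryDirectory() as tmp:
    layer = Path(tmp) / "350f-example"
    layer.mkdir()
    (layer / "cache-id").write_text("492c-example")
    (layer / "diff").write_text("sha256:350f-example")
    (layer / "size").write_text("65593591")
    base_entry = read_layer_entry(layer)

print(base_entry)
```

Following parent from the top layer down to an entry whose parent is None walks the whole layer stack.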

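The overlay2 directory

CacheID

CacheID is a randomly generated ID, so it is different every time. It maps to the directory /var/lib/docker/overlay2/${CacheID}, which holds the layer's files along with links to the layers below it. Let's verify: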

root@ubuntu:/var/lib/docker/image/overlay2/layerdb/sha256# cat 06e00d189a99510f2ee2bfc4b6eed7b4d119adc4514acbfc13efc16e6b482a3d/cache-id 
743685bc1d83ec4165681442dcb5f19107406be3b28b479860a5ff13414ed497

root@ubuntu:/var/lib/docker/overlay2/743685bc1d83ec4165681442dcb5f19107406be3b28b479860a5ff13414ed497# tree
.
├── committed
├── diff
│   └── tmp
│       └── one.txt
├── link
├── lower
└── work

root@ubuntu:/var/lib/docker/overlay2/743685bc1d83ec4165681442dcb5f19107406be3b28b479860a5ff13414ed497# cat link 
EV2GF3SYGZFG7HYF7THVYGLOBQ

root@ubuntu:/var/lib/docker/overlay2/743685bc1d83ec4165681442dcb5f19107406be3b28b479860a5ff13414ed497# cat lower 
l/F3YS2JA2OLBGKAAGJCQRSGCQZ3

root@ubuntu:/var/lib/docker/overlay2# ls -ls l/
total 12
4 lrwxrwxrwx 1 root root 72 Oct 19 15:43 EV2GF3SYGZFG7HYF7THVYGLOBQ -> ../743685bc1d83ec4165681442dcb5f19107406be3b28b479860a5ff13414ed497/diff
4 lrwxrwxrwx 1 root root 72 Oct 17 11:40 F3YS2JA2OLBGKAAGJCQRSGCQZ3 -> ../492c0894bbb23c44e42da84b6eeefb75a92895ff84b5360257e02cd8e1fa2bae/diff
4 lrwxrwxrwx 1 root root 72 Oct 19 15:43 P63ZKOO3TBJQCZE65OB7QEMKYZ -> ../c4ba312d8d454a3b0e7f74d985b086d688cacc5e00a995db482cf030f10729d5/diff

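The diff directory holds this layer's files. link contains the short name of the symlink under l/ that points to this layer's diff directory, and lower contains the short links to the diff directories of the layers beneath it.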
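
The lookup from a layer to the diff directories beneath it can be sketched as follows. This assumes the lower file holds one or more l/<short-id> entries separated by colons, relative to the overlay2 root. The function name and the demo directory names (base, top, BASESHORT) are made up, and the demo builds a miniature layout in a temporary directory rather than touching /var/lib/docker.

```python
import os
import tempfile
from pathlib import Path

def lower_diff_dirs(overlay_root, cache_id):
    """Resolve the short links in <cache_id>/lower to real diff directories."""
    lower_file = Path(overlay_root) / cache_id / "lower"
    if not lower_file.is_file():
        return []                      # base layer: nothing underneath
    dirs = []
    for short_link in lower_file.read_text().strip().split(":"):
        # Each entry looks like "l/<SHORT_ID>" and is a symlink to ../<id>/diff.
        dirs.append(os.path.realpath(os.path.join(overlay_root, short_link)))
    return dirs

# Miniature overlay2 layout: "top" sits on "base".
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "base", "diff"))
    os.makedirs(os.path.join(root, "top", "diff"))
    os.makedirs(os.path.join(root, "l"))
    os.symlink("../base/diff", os.path.join(root, "l", "BASESHORT"))
    Path(root, "top", "lower").write_text("l/BASESHORT")
    resolved = lower_diff_dirs(root, "top")
    expected = os.path.realpath(os.path.join(root, "base", "diff"))
    base_lowers = lower_diff_dirs(root, "base")

print(resolved == [expected], base_lowers)
```

The short names presumably exist to keep the mount option string small when many lower directories are stacked; that motivation is our reading, not something the transcript shows.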

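Image size

The size file in each layerdb entry gives the size of that layer, but the sum differs slightly from what docker images shows. For myubuntu the layers add up to 4 + 65593591 + 4 = 65593599 bytes, which is 65593599 / 1024 / 1024 = 62.55 MiB, while docker images shows 65.6MB. The reason is that docker images uses decimal units: 65593599 / 1000 / 1000 = 65.59, displayed as 65.6MB.

Also, because each layer records file changes relative to the previous layer, deleting files or packages in a later layer does not make the new image smaller. The size only shrinks if the image is flattened, for example by running docker export on a container created from it and importing the result as a new image with docker import.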
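
The arithmetic is quick to reproduce in Python, using the three layer sizes quoted above:

```python
# Layer sizes in bytes, as read from the layerdb size files.
layer_sizes = [4, 65593591, 4]

total = sum(layer_sizes)
mib = total / 1024 ** 2   # binary units (MiB)
mb = total / 1000 ** 2    # decimal units (MB), as shown by docker images

print(f"{total} bytes = {mib:.2f} MiB = {mb:.1f} MB")
# prints "65593599 bytes = 62.55 MiB = 65.6 MB"
```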

The distribution directory

This directory stores the mapping between layer DiffIDs and digests, where the digest is the name of the layer's directory in the registry.

root@ubuntu:/var/lib/docker/image/overlay2/distribution# tree 
.
├── diffid-by-digest
│   └── sha256
│       └── a39c84e173f038958d338f55a9e8ee64bb6643e8ac6ae98e08ca65146e668d86
└── v2metadata-by-diffid
    └── sha256
        ├── 350f36b271dee3d47478fbcd72b98fed5bbcc369632f2d115c3cb62d784edaec

v2metadata-by-diffid stores, for each layer DiffID, information about the layer in the registry, including its Digest and SourceRepository.

root@ubuntu:/var/lib/docker/image/overlay2/distribution# cat v2metadata-by-diffid/sha256/350f36b271dee3d47478fbcd72b98fed5bbcc369632f2d115c3cb62d784edaec |jq .
[
  {
    "Digest": "sha256:a39c84e173f038958d338f55a9e8ee64bb6643e8ac6ae98e08ca65146e668d86",
    "SourceRepository": "docker.io/library/ubuntu",
    "HMAC": ""
  }
]

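diffid-by-digest is the reverse mapping: the file name is the layer digest in the registry, and the content is the layer DiffID. Because our new image myubuntu has not been pushed to a registry, there are no entries for its layers here.

Next we create a local registry and give myubuntu an extra tag with docker tag myubuntu 127.0.0.1:5000/myubuntu. repositories.json gains a new entry, but the distribution directory has not changed yet.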
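
The relationship between the two directories can be sketched as inverting one mapping into the other. In the snippet below the dictionary stands in for the files under v2metadata-by-diffid (file name to file content, using the record shown above), and invert_v2metadata is a hypothetical helper, not something Docker exposes.

```python
import json

def invert_v2metadata(v2metadata_by_diffid):
    """Build a registry-digest -> DiffID map from v2metadata-by-diffid contents."""
    by_digest = {}
    for diff_id, raw_json in v2metadata_by_diffid.items():
        # Each file holds a JSON list with one record per known registry copy.
        for record in json.loads(raw_json):
            by_digest[record["Digest"]] = diff_id
    return by_digest

V2METADATA = {
    "sha256:350f36b271dee3d47478fbcd72b98fed5bbcc369632f2d115c3cb62d784edaec": """
    [
      {
        "Digest": "sha256:a39c84e173f038958d338f55a9e8ee64bb6643e8ac6ae98e08ca65146e668d86",
        "SourceRepository": "docker.io/library/ubuntu",
        "HMAC": ""
      }
    ]
    """,
}

diffid_by_digest = invert_v2metadata(V2METADATA)
print(diffid_by_digest)
```

The resulting key a39c... is the compressed layer's digest, the same value docker pull printed as a39c84e173f0 at the start of this article.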

root@ubuntu:/var/lib/docker/image/overlay2/distribution# cat ../repositories.json |jq .
{
  "Repositories": {
    "127.0.0.1:5000/myubuntu": {
      "127.0.0.1:5000/myubuntu:latest": "sha256:0a95d0866cb574d513d22e62014898de8c0108df9c1ec25a75a0dc2cbbaed16f"
    },
    "myubuntu": {
      "myubuntu:latest": "sha256:0a95d0866cb574d513d22e62014898de8c0108df9c1ec25a75a0dc2cbbaed16f"
    },
    "registry": {
      "registry:2": "sha256:979f2f24c32b2553fa72c6589287d88c57241a139992bf73e3feadd7cf607cf8",
      "registry@sha256:265d4a5ed8bf0df27d1107edb00b70e658ee9aa5acb3f37336c5a17db634481e": "sha256:979f2f24c32b2553fa72c6589287d88c57241a139992bf73e3feadd7cf607cf8"
    },
    "ubuntu": {
      "ubuntu:latest": "sha256:d5ca7a4456053674d490803005766890dd19e3f7e789a48737c0d462da531f5d",
      "ubuntu@sha256:626ffe58f6e7566e00254b638eb7e0f3b11d4da9675088f4781a50ae288f3322": "sha256:d5ca7a4456053674d490803005766890dd19e3f7e789a48737c0d462da531f5d"
    }
  }
}

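After running docker push 127.0.0.1:5000/myubuntu, each of the two distribution directories gains two entries, corresponding to myubuntu's two DiffIDs 7c7e... and a58f.... repositories.json also gains a digest entry under 127.0.0.1:5000/myubuntu; this is the image's digest in the registry. The digest printed by the push, sha256:bb3c..., is the digest of the manifest file in the registry, which the next section analyzes.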

root@ubuntu:/home/ssj# docker push 127.0.0.1:5000/myubuntu
Using default tag: latest
The push refers to repository [127.0.0.1:5000/myubuntu]
a58f164385b2: Pushed 
7c7eb5781271: Pushed 
350f36b271de: Pushed 
latest: digest: sha256:bb3c3d9d84b6b39e08f9c18ddba9572d2168bf7e7eb656b7bbc1458cbe07220d size: 943

root@ubuntu:/var/lib/docker/image/overlay2/distribution# tree
.
├── diffid-by-digest
│   └── sha256
│       ├── a362e9896a95c6c3b46f3f958cf378bb818552890669d83e44cc86f2a2c4a726
│       ├── fba52c366bc01edecaf465ef739f5230b3034d01f8fcfe16f005929b2fd0e432
           ......
└── v2metadata-by-diffid
    └── sha256
       ├── 7c7eb5781271639891432f506fce3b30b74c63f0b145ad7746a7e01284e4f7a2
       ├── a58f164385b2d99773a41596a257920d90c6900c9f74d6a22a633d78f9c8424e
         ......
         
root@ubuntu:/var/lib/docker/image/overlay2/distribution# cat ../repositories.json |jq .
{
  "Repositories": {
    "127.0.0.1:5000/myubuntu": {
      "127.0.0.1:5000/myubuntu:latest": "sha256:0a95d0866cb574d513d22e62014898de8c0108df9c1ec25a75a0dc2cbbaed16f",
      "127.0.0.1:5000/myubuntu@sha256:bb3c3d9d84b6b39e08f9c18ddba9572d2168bf7e7eb656b7bbc1458cbe07220d": "sha256:0a95d0866cb574d513d22e62014898de8c0108df9c1ec25a75a0dc2cbbaed16f"
    },
   ......

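Registry storage

Layers are stored in the registry gzip-compressed. myubuntu is 65.6MB locally but about 27MB once pushed to the registry, a decent compression ratio. The registry's storage directory is shown below; it is split into two main directories, blobs and repositories.

  • repositories: stores image metadata.
  • blobs: stores the gzip-compressed archives of the image layers, i.e. layer.tar.gz.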
root@ubuntu:/home/registry/docker/registry/v2# tree
.
├── blobs
│   └── sha256
│       ├── 0a
│       │   └── 0a95d0866cb574d513d22e62014898de8c0108df9c1ec25a75a0dc2cbbaed16f
│       │       └── data
│       ├── a3
│       │   ├── a362e9896a95c6c3b46f3f958cf378bb818552890669d83e44cc86f2a2c4a726
│       │   │   └── data
│       │   └── a39c84e173f038958d338f55a9e8ee64bb6643e8ac6ae98e08ca65146e668d86
│       │       └── data
│       ├── bb
│       │   └── bb3c3d9d84b6b39e08f9c18ddba9572d2168bf7e7eb656b7bbc1458cbe07220d
│       │       └── data
│       └── fb
│           └── fba52c366bc01edecaf465ef739f5230b3034d01f8fcfe16f005929b2fd0e432
│               └── data
└── repositories
    └── myubuntu
        ├── _layers
        │   └── sha256
        │       ├── 0a95d0866cb574d513d22e62014898de8c0108df9c1ec25a75a0dc2cbbaed16f
        │       │   └── link
        │       ├── a362e9896a95c6c3b46f3f958cf378bb818552890669d83e44cc86f2a2c4a726
        │       │   └── link
        │       ├── a39c84e173f038958d338f55a9e8ee64bb6643e8ac6ae98e08ca65146e668d86
        │       │   └── link
        │       └── fba52c366bc01edecaf465ef739f5230b3034d01f8fcfe16f005929b2fd0e432
        │           └── link
        ├── _manifests
        │   ├── revisions
        │   │   └── sha256
        │   │       └── bb3c3d9d84b6b39e08f9c18ddba9572d2168bf7e7eb656b7bbc1458cbe07220d
        │   │           └── link
        │   └── tags
        │       └── latest
        │           ├── current
        │           │   └── link
        │           └── index
        │               └── sha256
        │                   └── bb3c3d9d84b6b39e08f9c18ddba9572d2168bf7e7eb656b7bbc1458cbe07220d
        │                       └── link
        └── _uploads

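The repositories directory

The first-level subdirectories of repositories are image names, here myubuntu. Each contains three subdirectories: _manifests, _layers, and _uploads.

  • _manifests: metadata about the image, such as the sha256 of the manifest file for each tag. The content of bb3c.../link is bb3c... itself, a value obtained by running sha256sum over the manifest file. The manifest's fields are documented at https://github.com/opencontainers/image-spec/blob/main/manifest.md#image-manifest-property-descriptions. The digest of each entry in the manifest's layers is the sha256sum of that layer's gzip-compressed file.
  • _layers: the subdirectory names under sha256 are the ImageID and the digests of the gzip-compressed layers; each link file contains the digest itself.
  • _uploads: empty here, so we skip it.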
root@ubuntu:/home/registry/docker/registry/v2/repositories/myubuntu# sha256sum ../../blobs/sha256/bb/bb3c3d9d84b6b39e08f9c18ddba9572d2168bf7e7eb656b7bbc1458cbe07220d/data 
bb3c3d9d84b6b39e08f9c18ddba9572d2168bf7e7eb656b7bbc1458cbe07220d  ../../blobs/sha256/bb/bb3c3d9d84b6b39e08f9c18ddba9572d2168bf7e7eb656b7bbc1458cbe07220d/data

root@ubuntu:/home/registry/docker/registry/v2/repositories/myubuntu# cat ../../blobs/sha256/bb/bb3c3d9d84b6b39e08f9c18ddba9572d2168bf7e7eb656b7bbc1458cbe07220d/data 
{
   "schemaVersion": 2,
   "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
   "config": {
      "mediaType": "application/vnd.docker.container.image.v1+json",
      "size": 1927,
      "digest": "sha256:0a95d0866cb574d513d22e62014898de8c0108df9c1ec25a75a0dc2cbbaed16f"
   }, # ImageID
   "layers": [
      {
         "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
         "size": 27170900, # size of the compressed blobs/sha256/xx/xxyy/data file
         "digest": "sha256:a39c84e173f038958d338f55a9e8ee64bb6643e8ac6ae98e08ca65146e668d86"
      },
      {
         "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
         "size": 142,
         "digest": "sha256:fba52c366bc01edecaf465ef739f5230b3034d01f8fcfe16f005929b2fd0e432" # file name under the diffid-by-digest directory
      },
      {
         "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
         "size": 140,
         "digest": "sha256:a362e9896a95c6c3b46f3f958cf378bb818552890669d83e44cc86f2a2c4a726"
      }
   ]
}
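
A short Python sketch shows how the pieces of this manifest fit together. MANIFEST repeats the document above with the inline # annotations removed so that it parses as JSON, and summarize_manifest is a hypothetical helper name.

```python
import json

MANIFEST = """
{
   "schemaVersion": 2,
   "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
   "config": {
      "mediaType": "application/vnd.docker.container.image.v1+json",
      "size": 1927,
      "digest": "sha256:0a95d0866cb574d513d22e62014898de8c0108df9c1ec25a75a0dc2cbbaed16f"
   },
   "layers": [
      {"mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", "size": 27170900,
       "digest": "sha256:a39c84e173f038958d338f55a9e8ee64bb6643e8ac6ae98e08ca65146e668d86"},
      {"mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", "size": 142,
       "digest": "sha256:fba52c366bc01edecaf465ef739f5230b3034d01f8fcfe16f005929b2fd0e432"},
      {"mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", "size": 140,
       "digest": "sha256:a362e9896a95c6c3b46f3f958cf378bb818552890669d83e44cc86f2a2c4a726"}
   ]
}
"""

def summarize_manifest(text):
    """Extract the ImageID, layer digests and total compressed size."""
    manifest = json.loads(text)
    layers = manifest["layers"]
    return {
        "image_id": manifest["config"]["digest"],          # digest of the Image Json blob
        "layer_digests": [layer["digest"] for layer in layers],
        "compressed_size": sum(layer["size"] for layer in layers),
    }

summary = summarize_manifest(MANIFEST)
print(summary["image_id"])
print(summary["compressed_size"])   # 27170900 + 142 + 140 bytes
```

The compressed total of roughly 27MB is consistent with the size quoted earlier for the pushed image.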

The blobs directory

Besides the compressed layer files, blobs also stores the Manifest Json and Image Json files (uncompressed).

root@ubuntu:/home/registry/docker/registry/v2/blobs# file sha256/*/*/data
sha256/0a/0a95d0866cb574d513d22e62014898de8c0108df9c1ec25a75a0dc2cbbaed16f/data: JSON data # Image Json
sha256/a3/a362e9896a95c6c3b46f3f958cf378bb818552890669d83e44cc86f2a2c4a726/data: gzip compressed data, original size modulo 2^32 2560
sha256/a3/a39c84e173f038958d338f55a9e8ee64bb6643e8ac6ae98e08ca65146e668d86/data: gzip compressed data, original size modulo 2^32 68047360
sha256/bb/bb3c3d9d84b6b39e08f9c18ddba9572d2168bf7e7eb656b7bbc1458cbe07220d/data: JSON data # Manifest Json
sha256/fb/fba52c366bc01edecaf465ef739f5230b3034d01f8fcfe16f005929b2fd0e432/data: gzip compressed data, original size modulo 2^32 2560

Pick one layer's compressed data file and list its contents:

root@ubuntu:/home/registry/docker/registry/v2/blobs# ls -ls sha256/a3/a362e9896a95c6c3b46f3f958cf378bb818552890669d83e44cc86f2a2c4a726/data 
4 -rw-r--r-- 1 root root 140 Oct 24 10:12 sha256/a3/a362e9896a95c6c3b46f3f958cf378bb818552890669d83e44cc86f2a2c4a726/data

root@ubuntu:/home/registry/docker/registry/v2/blobs# tar -tvf sha256/a3/a362e9896a95c6c3b46f3f958cf378bb818552890669d83e44cc86f2a2c4a726/data 
drwxr-xr-x 0/0               0 2021-10-19 15:43 home/
-rw-r--r-- 0/0               4 2021-10-19 15:33 home/two.txt

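Also note that the IDs docker pull prints at the start of each line are the digests of the layers' compressed files in the registry, not layer DiffIDs, and the sizes shown are likewise those of the compressed files stored in the registry.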
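
The difference between the two identifiers can be reproduced with a few lines of Python. The byte string below is only a stand-in for a real layer.tar, and sha256_id is our own helper name. The digest is computed over the gzip-compressed bytes, the DiffID over the uncompressed bytes, and decompressing the blob recovers the DiffID.

```python
import gzip
import hashlib

def sha256_id(data):
    """Format a SHA-256 hash the way Docker prints digests."""
    return "sha256:" + hashlib.sha256(data).hexdigest()

# Stand-in for the uncompressed layer.tar bytes (not a real tar archive).
layer_tar = b"stand-in for uncompressed layer.tar bytes\n" * 100

blob = gzip.compress(layer_tar, mtime=0)      # what a registry would store
diff_id = sha256_id(layer_tar)                # local DiffID
digest = sha256_id(blob)                      # registry layer digest
recovered = sha256_id(gzip.decompress(blob))  # DiffID recomputed after pull

print(digest != diff_id, recovered == diff_id, len(blob) < len(layer_tar))
```

This is why the distribution directory has to keep an explicit DiffID to digest mapping: neither value can be derived from the other without the layer data.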

References

  • https://github.com/opencontainers/image-spec/blob/main/config.md
  • https://github.com/opencontainers/image-spec/blob/main/manifest.md
  • https://programmer.group/docker-learning-image-s-local-storage-architecture.html
