Author: Sijin Yu
The 13 papers covered in this article are:
[1] Tao Chen, Chenhui Wang, Hongming Shan. BerDiff: Conditional Bernoulli Diffusion Model for Medical Image Segmentation. MICCAI, 2023.
[5] Xinrong Hu, Yu-Jen Chen, Tsung-Yi Ho, Yiyu Shi. Conditional Diffusion Models for Weakly Supervised Medical Image Segmentation. MICCAI, 2023.
[8] Tianxu Lv, Yuan Liu, Kai Miao, Lihua Li, Xiang Pan. Diffusion Kinetic Model for Breast Cancer Segmentation in Incomplete DCE-MRI. MICCAI, 2023.
[11] G. Jignesh Chowdary, Zhaozheng Yin. Diffusion Transformer U-Net for Medical Image Segmentation. MICCAI, 2023.
[17] Xinyi Yu, Guanbin Li, Wei Lou, Siqi Liu, Xiang Wan, Yan Chen, and Haofeng Li. Diffusion-Based Data Augmentation for Nuclei Image Segmentation. MICCAI, 2023.
[20] Héctor Carrión and Narges Norouzi. FEDD - Fair, Efficient, and Diverse Diffusion-Based Lesion Segmentation and Malignancy Classification. MICCAI, 2023.
[22] Mengxue Sun, Wenhui Huang, and Yuanjie Zheng. Instance-Aware Diffusion Model for Gland Segmentation in Colon Histology Images. MICCAI, 2023.
[23] Jianfeng Zhao and Shuo Li. Learning Reliability of Multi-modality Medical Images for Tumor Segmentation via Evidence-Identified Denoising Diffusion Probabilistic Models. MICCAI, 2023.
[27] Jiacheng Wang, Jing Yang, Qichao Zhou, Liansheng Wang. Medical Boundary Diffusion Model for Skin Lesion Segmentation. MICCAI, 2023.
[30] Junde Wu, Rao Fu, Huihui Fang, Yu Zhang, Yehui Yang, Haoyi Xiong, Huiying Liu, and Yanwu Xu. MedSegDiff: Medical Image Segmentation with Diffusion Probabilistic Model. MIDL, 2023.
[33] Junde Wu, Rao Fu, Huihui Fang, Yu Zhang, and Yanwu Xu. MedSegDiff-V2: Diffusion based Medical Image Segmentation with Transformer. arXiv preprint arXiv:2301.11798, 2023.
[35] Boah Kim, Yujin Oh, Jong Chul Ye. Diffusion Adversarial Representation Learning for Self-supervised Vessel Segmentation. ICLR, 2023.
[36] Jiarui Xu, Sifei Liu, Arash Vahdat, Wonmin Byeon, Xiaolong Wang, Shalini De Mello. Open-Vocabulary Panoptic Segmentation with Text-to-Image Diffusion Models. CVPR, 2023.
[1] Tao Chen, Chenhui Wang, Hongming Shan. BerDiff: Conditional Bernoulli Diffusion Model for Medical Image Segmentation. MICCAI, 2023.
$x\in\mathbb R^{H\times W\times C}$ denotes the input image, where $H\times W$ is the resolution and $C$ is the number of channels.
The ground-truth mask is denoted $y_0\in\{0,1\}^{H\times W}$, where $0$ marks the background and $1$ marks the region of interest (ROI).
The Bernoulli diffusion model can be written as:
$$p_{\theta}(y_0|x)=\int p_{\theta}(y_{0:T}|x)\,\text dy_{1:T}$$
where the initial Bernoulli noise is
$$y_T\sim\mathcal B(y_T;\frac12\cdot\mathbf 1)$$
The Bernoulli forward process is a Markov chain:
$$q(y_{1:T}|y_0):=\prod_{t=1}^T q(y_t|y_{t-1})$$
$$q(y_t|y_{t-1}):=\mathcal B(y_t;(1-\beta_t)y_{t-1}+\beta_t/2)$$
Let $\alpha_t=1-\beta_t$ and $\bar\alpha_t=\prod_{\tau=1}^t\alpha_{\tau}$. The distribution of the sample $y_t$ at any time step $t$ is then:
$$q(y_t|y_0)=\mathcal B(y_t;\bar\alpha_t y_0+(1-\bar\alpha_t)/2)$$
To keep the objective differentiable, sample $\epsilon\sim\mathcal B(\epsilon;\frac{1-\bar\alpha_t}{2}\cdot\mathbf 1)$ and set $y_t=y_0\otimes\epsilon$, where $\otimes$ denotes the XOR operation. This is equivalent to sampling from $q(y_t|y_0)$ directly: a pixel with $y_0=1$ stays $1$ with probability $1-\frac{1-\bar\alpha_t}{2}=\bar\alpha_t+\frac{1-\bar\alpha_t}{2}$, and a pixel with $y_0=0$ becomes $1$ with probability $\frac{1-\bar\alpha_t}{2}$.
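The XOR form of the forward process can be checked numerically. The following is a minimal sketch on a flattened toy mask, not the authors' implementation; the constant noise schedule `betas`, the mask size, and the helper names `alpha_bar` and `sample_y_t` are assumptions made for illustration.

```python
import random

def alpha_bar(betas, t):
    """Cumulative product of (1 - beta_tau) for tau = 1..t (t is 1-indexed)."""
    prod = 1.0
    for beta in betas[:t]:
        prod *= 1.0 - beta
    return prod

def sample_y_t(y0, a_bar, rng):
    """Draw y_t ~ q(y_t | y_0) by XOR-ing y_0 with Bernoulli noise."""
    flip_prob = (1.0 - a_bar) / 2.0
    eps = [1 if rng.random() < flip_prob else 0 for _ in y0]
    return [y ^ e for y, e in zip(y0, eps)]

rng = random.Random(0)
betas = [0.02] * 50                  # hypothetical constant schedule
a_bar = alpha_bar(betas, 50)
y0 = [1] * 20000 + [0] * 20000       # flattened toy mask: foreground, then background
yt = sample_y_t(y0, a_bar, rng)

# Empirical frequency of ones among foreground and background pixels.
freq_fg = sum(yt[:20000]) / 20000
freq_bg = sum(yt[20000:]) / 20000
```

If the equivalence holds, `freq_fg` should be close to $\bar\alpha_t+(1-\bar\alpha_t)/2$ and `freq_bg` close to $(1-\bar\alpha_t)/2$, which are the Bernoulli parameters given by $q(y_t|y_0)$.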
The Bernoulli posterior can be written as:
$$q(y_{t-1}|y_t,y_0)=\mathcal B(y_{t-1};\theta_{post}(y_t,y_0))$$
where $\theta_{post}(\cdot)$ is:
$$\theta_{post}(y_t,y_0)=Norm\left(\left[\alpha_t[1-y_t,y_t]+\frac{1-\alpha_t}{2}\right]\odot\left[\bar\alpha_{t-1}[1-y_0,y_0]+\frac{1-\bar\alpha_{t-1}}{2}\right]\right)$$
where $\odot$ denotes element-wise multiplication and $Norm(\cdot)$ denotes normalization along the channel dimension.
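For a single pixel, the posterior parameter is a two-channel product followed by a normalization. The sketch below assumes the two channels are ordered as [probability of 0, probability of 1] and that the Bernoulli parameter is the normalized second channel; the function name `theta_post` and the example values are illustrative.

```python
def theta_post(y_t, y_0, alpha_t, alpha_bar_prev):
    """Bernoulli parameter of q(y_{t-1} | y_t, y_0) for one pixel."""
    # Channel 0 corresponds to y_{t-1} = 0, channel 1 to y_{t-1} = 1.
    likelihood = [alpha_t * (1 - y_t) + (1 - alpha_t) / 2,
                  alpha_t * y_t + (1 - alpha_t) / 2]
    prior = [alpha_bar_prev * (1 - y_0) + (1 - alpha_bar_prev) / 2,
             alpha_bar_prev * y_0 + (1 - alpha_bar_prev) / 2]
    unnormalized = [l * p for l, p in zip(likelihood, prior)]
    # Norm(.): divide by the channel sum so the two entries add up to 1.
    return unnormalized[1] / (unnormalized[0] + unnormalized[1])

p_same = theta_post(y_t=1, y_0=1, alpha_t=0.9, alpha_bar_prev=0.5)
p_flip = theta_post(y_t=0, y_0=0, alpha_t=0.9, alpha_bar_prev=0.5)
```

Because `y_0` only enters through linear terms, the same function also accepts a soft estimate of $y_0$ in $[0,1]$.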
The Bernoulli reverse process can be written as:
$$p_{\theta}(y_{0:T}|x):=p(y_T)\prod_{t=1}^Tp_{\theta}(y_{t-1}|y_t,x)$$
$$p_{\theta}(y_{t-1}|y_t,x):=\mathcal B(y_{t-1};\hat\mu(y_t,t,x))$$
Here the estimated Bernoulli noise $\hat\epsilon(y_t,t,x)$ of $y_t$ is passed through a calibration function $\mathcal F_C$ to parameterize the estimated mean $\hat\mu(y_t,t,x)$:
$$\hat\mu(y_t,t,x)=\mathcal F_C(y_t,\hat\epsilon(y_t,t,x))=\theta_{post}(y_t,|y_t-\hat\epsilon(y_t,t,x)|)$$
The KL loss pulls the reverse process toward the posterior:
$$\mathcal L_{KL}=\mathbb E_{q(x,y_0)}\mathbb E_{q(y_t|y_0)}\left[D_{KL}[q(y_{t-1}|y_t,y_0)||p_\theta(y_{t-1}|y_t,x)]\right]$$
The binary cross-entropy loss pulls the noise estimate $\hat\epsilon$ toward the sampled Bernoulli noise $\epsilon$:
$$\mathcal L_{BCE}=-\mathbb E_{(\epsilon,\hat\epsilon)}\sum_{(i,j)}^{H,W}[\epsilon_{i,j}\log\hat\epsilon_{i,j}+(1-\epsilon_{i,j})\log(1-\hat\epsilon_{i,j})]$$
Finally,
$$\mathcal L_{total}=\mathcal L_{KL}+\lambda_{BCE}\mathcal L_{BCE}$$
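For Bernoulli distributions both terms have a closed form per pixel, so the total loss for one training sample can be sketched as follows. This is a hedged illustration rather than the paper's code: the clamping constant `EPS`, the default `lambda_bce=1.0`, and the flattened list inputs are assumptions, and the outer expectations are left to the training loop.

```python
import math

EPS = 1e-12  # numerical guard against log(0); an assumption of this sketch

def clamp(p):
    return min(max(p, EPS), 1.0 - EPS)

def bernoulli_kl(p, q):
    """KL( B(p) || B(q) ) for one pixel."""
    p, q = clamp(p), clamp(q)
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))

def bce(eps, eps_hat):
    """Binary cross-entropy between sampled noise and predicted noise, summed over pixels."""
    total = 0.0
    for e, e_hat in zip(eps, eps_hat):
        e_hat = clamp(e_hat)
        total -= e * math.log(e_hat) + (1 - e) * math.log(1 - e_hat)
    return total

def total_loss(theta_posterior, mu_hat, eps, eps_hat, lambda_bce=1.0):
    """L_KL + lambda_BCE * L_BCE for a single sample."""
    kl = sum(bernoulli_kl(p, q) for p, q in zip(theta_posterior, mu_hat))
    return kl + lambda_bce * bce(eps, eps_hat)

loss_perfect = total_loss([0.2, 0.7], [0.2, 0.7], [1, 0], [1.0, 0.0])
loss_wrong = total_loss([0.2, 0.7], [0.7, 0.2], [1, 0], [0.3, 0.6])
```

Here `theta_posterior` holds the per-pixel values of $\theta_{post}(y_t,y_0)$ and `mu_hat` the per-pixel values of $\hat\mu(y_t,t,x)$.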
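Table 1 shows the effect of different loss functions and objective functions.
Table 2 shows the effect of using Gaussian noise versus Bernoulli noise.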
[5] Xinrong Hu, Yu-Jen Chen, Tsung-Yi Ho, Yiyu Shi. Conditional Diffusion Models for Weakly Supervised Medical Image Segmentation. MICCAI, 2023.
Suppose a sample $x_0$ is drawn from the distribution $D(x|y)$, where $x_0$ is an image.
$y$ is the condition. It can take many forms, such as the modality, style, or resolution of the image. In this work, $y\in\{y_0,y_1\}$ is a binary image-level label (for example, whether a brain CT scan contains a tumor).
The condition $y$ enters the model through a learnable embedding $e=f(y)$, $f:\mathbb R\to\mathbb R^n$.
The forward process can be written as a Markov chain:
$$q(x_t|x_{t-1},y):=\mathcal N(x_t|y;\sqrt{1-\beta_t}\,x_{t-1}|y,\beta_t\cdot\mathbf 1)$$
Let $\alpha_t:=1-\beta_t$ and $\bar\alpha_t:=\prod_{\tau=1}^t\alpha_{\tau}$. Given $x_0$, $x_t$ can be obtained directly:
$$q(x_t|x_0,y):=\mathcal N(x_t|y;\sqrt{\bar\alpha_t}\,x_0|y,(1-\bar\alpha_t)\cdot\mathbf 1)$$
A U-Net $\epsilon_{\theta}(x,t,y)$ is trained to approximate the reverse process:
$$p_{\theta}(x_{t-1}|x_t,y):=\mathcal N(x_{t-1};\mu_{\theta}(x_t,t,y),\Sigma_{\theta}(x_t,t,y))$$
Reparameterize $x_t$ with standard Gaussian noise $\epsilon\sim\mathcal N(\mathbf 0,\mathbf 1)$:
$$x_t=\sqrt{\bar\alpha_t}x_0+\sqrt{1-\bar\alpha_t}\epsilon$$
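The reparameterization means $x_t$ can be drawn in one shot from $x_0$ without simulating the chain. A minimal sketch on a flattened toy image follows; the value of `a_bar_t`, the constant image, and the helper name `q_sample` are assumptions for illustration.

```python
import math
import random

def q_sample(x0, a_bar_t, rng):
    """Draw x_t = sqrt(a_bar_t) * x_0 + sqrt(1 - a_bar_t) * eps with eps ~ N(0, 1)."""
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    xt = [math.sqrt(a_bar_t) * x + math.sqrt(1.0 - a_bar_t) * e
          for x, e in zip(x0, eps)]
    return xt, eps

rng = random.Random(0)
a_bar_t = 0.36                 # hypothetical cumulative alpha at step t
x0 = [2.0] * 50000             # flattened toy image with constant intensity
xt, eps = q_sample(x0, a_bar_t, rng)

# Empirical mean and variance of the noised pixels.
mean_xt = sum(xt) / len(xt)
var_xt = sum((v - mean_xt) ** 2 for v in xt) / len(xt)
```

With these values the pixels of `xt` should have mean close to $\sqrt{\bar\alpha_t}\,x_0$ and variance close to $1-\bar\alpha_t$, matching $q(x_t|x_0,y)$. The returned `eps` is the regression target of the noise-prediction loss.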
The loss function is:
$$L:=\mathbb E_{x_0,\epsilon}||\epsilon-\epsilon_{\theta}(x_t,t,y)||$$
$x_{t-1}$ can be recovered from $x_t$ via:
$$x_{t-1}(x_t,t,y)=\sqrt{\bar\alpha_{t-1}}\left(\frac{x_t-\sqrt{1-\bar\alpha_t}\hat\epsilon_{\theta}(x_t,y)}{\sqrt{\bar\alpha_t}}\right)+\sqrt{1-\bar\alpha_{t-1}}\hat\epsilon_{\theta}(x_t,y)$$
The partial derivative of $x_{t-1}(x_t,t,y)$ with respect to $y$, $\frac{\partial x_{t-1}}{\partial y}$, can be computed as:
$$\left.\frac{\partial x_{t-1}(x_t,t,y)}{\partial y}\right|_{y=y_1}=\lim_{\tau\to1}\frac{x_{t-1}(x_t,t,f(y_1))-x_{t-1}(x_t,t,\tau f(y_1)+(1-\tau)f(y_0))}{1-\tau}$$
In the experiments, $\tau=0.95$ is used.
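The finite-difference quotient can be sketched by running the denoising step twice, once with the embedding $f(y_1)$ and once with the interpolated embedding. In the sketch below, `toy_eps` is a hypothetical linear stand-in for the trained U-Net, and the embeddings and $\bar\alpha$ values are made up for illustration; only the structure of the computation follows the formulas above.

```python
import math

def denoise_step(x_t, eps_hat, a_bar_t, a_bar_prev):
    """Recover x_{t-1} from x_t and the predicted noise, pixel by pixel."""
    out = []
    for x, e in zip(x_t, eps_hat):
        x0_pred = (x - math.sqrt(1.0 - a_bar_t) * e) / math.sqrt(a_bar_t)
        out.append(math.sqrt(a_bar_prev) * x0_pred + math.sqrt(1.0 - a_bar_prev) * e)
    return out

def toy_eps(x_t, emb):
    """Hypothetical noise predictor: linear in the pixel and in the condition embedding."""
    shift = sum(emb)
    return [0.1 * x + shift for x in x_t]

def condition_sensitivity(x_t, emb_y1, emb_y0, a_bar_t, a_bar_prev, tau=0.95):
    """Finite-difference estimate of d x_{t-1} / d y at y = y_1."""
    mixed = [tau * e1 + (1.0 - tau) * e0 for e1, e0 in zip(emb_y1, emb_y0)]
    x_prev_y1 = denoise_step(x_t, toy_eps(x_t, emb_y1), a_bar_t, a_bar_prev)
    x_prev_mix = denoise_step(x_t, toy_eps(x_t, mixed), a_bar_t, a_bar_prev)
    return [(a - b) / (1.0 - tau) for a, b in zip(x_prev_y1, x_prev_mix)]

x_t = [0.5, -1.0, 2.0]
emb_y1, emb_y0 = [0.3, 0.1], [-0.2, 0.0]
grad = condition_sensitivity(x_t, emb_y1, emb_y0, a_bar_t=0.5, a_bar_prev=0.6)
```

With a real network the sensitivity varies per pixel, which is what makes it usable as a localization signal; the linear toy predictor gives the same value at every pixel and is only there to make the sketch checkable.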
Table 1 compares methods on BraTS, and Table 2 compares methods on CHAOS.
The figure shows ablation results for the main hyperparameters.
[8] Tianxu Lv, Yuan Liu, Kai Miao, Lihua Li, Xiang Pan. Diffusion Kinetic Model for Breast Cancer Segmentation in Incomplete DCE-MRI. MICCAI, 2023.
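What is DCE-MRI, and what are pre-contrast images and post-contrast images?
ChatGPT: In dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI), "pre-contrast images" and "post-contrast images" refer to different acquisition phases of the MRI scan, specifically when a contrast agent is used.
- Pre-contrast images: acquired before the contrast agent is injected. They serve as a baseline and anatomical reference against which the images acquired after injection are compared, and they help characterize the tissue before contrast enhancement.
- Post-contrast images: acquired after the contrast agent has been injected into the patient's bloodstream. The contrast agent highlights blood vessels and the distribution of blood flow within tissues. Post-contrast images show how the agent is taken up and distributed in tissue, which helps assess perfusion, vascularization, and the presence of lesions or abnormalities. They are used to track changes in tissue characteristics and to evaluate the effect of contrast enhancement.
DCE-MRI is commonly used in medical imaging to evaluate a range of conditions, including tumor characterization, vascularization assessment, and tissue perfusion assessment. Pre-contrast and post-contrast images both play a key role in providing information for diagnosis and treatment planning.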
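In the figure above, the left side shows the relationship between the hemodynamic response function and the denoising diffusion process, and the right side shows how voxel-level intensity changes over time.
[⚠️ Note: here $x_0, x_1, \cdots, x_k$ is the sequence of images acquired during the DCE-MRI scan, $x_0\to x_t$ is the forward process of the diffusion model, and $x_t\to x_{t-1}\to\cdots\to x_k$ is the reverse process. In other words, the diffusion model's task is: given the pre-contrast image $x_0$, generate the post-contrast image $x_k$.]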
The model consists of a Diffusion Module (a in the figure) and a Segmentation Module (b in the figure).
The Diffusion Module (DM) is a classic DDPM: the forward process goes from the pre-contrast image $x_0$ to Gaussian noise $x_t$, and the reverse process goes from Gaussian noise $x_t$ back to $x_0$. Once the DM is pre-trained, its intermediate hidden outputs $f_{dm}$ carry information about the hemodynamic response function.
The Segmentation Module consists of four KineticBlocks followed by four UpBlocks.
Each KineticBlock takes both the DM hidden output $f_{dm}$ and the output $f_{sm}$ of the previous KineticBlock as input, and merges them with a Fusion Layer:
$$\hat f=Fusion(f_{dm},f_{sm})=Concat(ReLU(BN(W*f_{dm}));f_{sm})$$
$f_i$ denotes the feature map of the $i$-th stage of the DM.
[11] G. Jignesh Chowdary, Zhaozheng Yin. Diffusion Transformer U-Net for Medical Image Segmentation. MICCAI, 2023.
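The final input is reshaped to the same shape as $f_M$.
The U-Net is built from Multi-sized Transformer blocks.
The input first passes through Transformers with multi-sized windows along $K$ parallel branches; the sum of all branches is then fed into the shifted-window stage to produce the output.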
[17] Xinyi Yu, Guanbin Li, Wei Lou, Siqi Liu, Xiang Wan, Yan Chen, and Haofeng Li. Diffusion-Based Data Augmentation for Nuclei Image Segmentation. MICCAI, 2023.
The generative model consists of two steps:
A Nuclei Structure consists of two parts: pixel-level semantics and a distance transform.
A Nuclei Structure is therefore a three-channel image of the same size as the original image.
[20] Héctor Carrión and Narges Norouzi. FEDD - Fair, Efficient, and Diverse Diffusion-Based Lesion Segmentation and Malignancy Classification. MICCAI, 2023.
An embedding is taken from a designated layer of the U-Net in the DM; it is upsampled for segmentation and downsampled for classification.
[22] Mengxue Sun, Wenhui Huang, and Yuanjie Zheng. Instance-Aware Diffusion Model for Gland Segmentation in Colon Histology Images. MICCAI, 2023.
[23] Jianfeng Zhao and Shuo Li. Learning Reliability of Multi-modality Medical Images for Tumor Segmentation via Evidence-Identified Denoising Diffusion Probabilistic Models. MICCAI, 2023.
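The EI-DDPM model consists of three parts:
A DDPM generates the segmentation map, conditioned on the image of a single modality.
The method described in the following post is used
https://blog.csdn.net/yusijinfs/article/details/134427358
to fuse the segmentation results of the four modalities T1, T2, Flair, and T1ce.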
[27] Jiacheng Wang, Jing Yang, Qichao Zhou, Liansheng Wang. Medical Boundary Diffusion Model for Skin Lesion Segmentation. MICCAI, 2023.
The parameters of the DM are frozen.
Different segmentation results arise from different Gaussian-noise initializations of the sample.
Let $y_0$ denote the ground-truth segmentation map.
The initial noise of the DM is $y_T^*\sim\mathcal N(0,\mathbf I)$.
Suppose an image is segmented $n$ times; the initial noise of the $i$-th run is $y_T^{*,i}$.
Running the DM on each of the $n$ initial noises $\{y_T^{*,i}\}_{i=1}^n$ yields the means $\{\mu^{*,i}\}_{i=1}^n$ and variances $\{\Sigma^{*,i}\}_{i=1}^n$.
The $i$-th segmentation map is computed as $y^{*,i}=\mu^{*,i}+\exp(\frac12\Sigma^{*,i})\mathcal N(0,\mathbf I)$.
The uncertainty is computed as:
$$\delta=\sqrt{\frac{1}{n}\sum^n_{i=1}\left(\mu^{*,i}-\frac{1}{n}\sum_{j=1}^{n}\mu^{*,j}\right)^2}$$
The final segmentation map is obtained by max voting: $y^*=(\sum_{i=1}^ny^{*,i})\geq\tau$, where $\tau$ is the voting threshold.
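The uncertainty map and the voting step are both per-pixel reductions over the $n$ runs. The sketch below works on flattened maps and assumes that each $y^{*,i}$ has already been binarized before voting, which the text above does not state; the function names and toy inputs are illustrative.

```python
import math

def pixelwise_uncertainty(mus):
    """Standard deviation of the n mean maps at each pixel (the delta map)."""
    n = len(mus)
    deltas = []
    for pixel_values in zip(*mus):
        mean = sum(pixel_values) / n
        deltas.append(math.sqrt(sum((m - mean) ** 2 for m in pixel_values) / n))
    return deltas

def vote(masks, tau):
    """Mark a pixel as foreground when at least tau of the n binary masks agree."""
    return [1 if sum(pixel_values) >= tau else 0 for pixel_values in zip(*masks)]

mus = [[0.0, 1.0], [1.0, 1.0]]                 # two runs, two pixels
delta = pixelwise_uncertainty(mus)             # [0.5, 0.0]

masks = [[1, 0, 1], [1, 0, 0], [0, 0, 1]]      # three runs, three pixels
final_mask = vote(masks, tau=2)                # [1, 0, 1]
```

Pixels where the runs disagree get a larger `delta`, so the map highlights regions such as ambiguous lesion boundaries.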
[30] Junde Wu, Rao Fu, Huihui Fang, Yu Zhang, Yehui Yang, Haoyi Xiong, Huiying Liu, and Yanwu Xu. MedSegDiff: Medical Image Segmentation with Diffusion Probabilistic Model. MIDL, 2023.
The diffusion model generates the segmentation.
The original image serves as the condition for the diffusion.
The attention mechanism is described as:
$$\mathcal A(m_I^k,m_x^k)=(LN(m_I^k)\otimes LN(m_x^k))\otimes m_I^k$$
Given a feature map $m\in\mathbb R^{H\times W\times C}$.
Compute its 2D fast Fourier transform (FFT):
$$M=\mathcal F[m]\in\mathbb C^{H\times W\times C}$$
Learn an attentive map $A$ in the frequency domain:
$$M'=A\otimes M$$
$$m'=\mathcal F^{-1}[M']$$
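The three steps (transform, element-wise reweighting, inverse transform) can be sketched on a single-channel toy feature map. The code below uses a naive 2D discrete Fourier transform in place of an FFT, fixes the map `A` by hand instead of learning it, and keeps only the real part of the result; all three are simplifying assumptions of this sketch.

```python
import cmath

def dft2(m, inverse=False):
    """Naive 2D discrete Fourier transform of an H x W nested list."""
    H, W = len(m), len(m[0])
    sign = 1 if inverse else -1
    out = [[0j] * W for _ in range(H)]
    for u in range(H):
        for v in range(W):
            acc = 0j
            for h in range(H):
                for w in range(W):
                    angle = sign * 2 * cmath.pi * (u * h / H + v * w / W)
                    acc += m[h][w] * cmath.exp(1j * angle)
            out[u][v] = acc / (H * W) if inverse else acc
    return out

def frequency_filter(m, A):
    """m' = F^{-1}[A * F[m]] with element-wise multiplication in the frequency domain."""
    M = dft2(m)
    M_prime = [[a * z for a, z in zip(row_a, row_m)] for row_a, row_m in zip(A, M)]
    return [[z.real for z in row] for row in dft2(M_prime, inverse=True)]

m = [[1.0, 2.0], [3.0, 4.0]]
identity = frequency_filter(m, [[1, 1], [1, 1]])    # all frequencies kept
dc_only = frequency_filter(m, [[1, 0], [0, 0]])     # only the zero frequency kept
```

Keeping every frequency reproduces `m`, while keeping only the zero-frequency entry replaces every pixel with the mean of `m`; a learned `A` sits between these extremes and decides per frequency how much of the feature map to pass through.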
[33] Junde Wu, Rao Fu, Huihui Fang, Yu Zhang, and Yanwu Xu. MedSegDiff-V2: Diffusion based Medical Image Segmentation with Transformer. arXiv preprint arXiv:2301.11798, 2023.
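Medical image segmentation is essential for diagnosis and surgical planning; it needs better consistency and accuracy, which automated methods can provide.
Deep learning has driven progress in medical image segmentation, but integrating newer models such as diffusion probabilistic models (DPMs) with existing methods remains challenging.
There is a need to bridge the gap between Transformer-based models and DPMs for effective medical image segmentation.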
[35] Boah Kim, Yujin Oh, Jong Chul Ye. Diffusion Adversarial Representation Learning for Self-supervised Vessel Segmentation. ICLR, 2023.
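Vessel segmentation in medical images is one of the important tasks in the diagnosis of vascular diseases and in treatment planning.
Although learning-based segmentation methods have been studied extensively, supervised methods require large amounts of ground-truth labels, and cluttered background structures make it hard for neural networks to segment vessels in an unsupervised manner.
To address this, we introduce a novel Diffusion Adversarial Representation Learning (DARL) model, which combines a denoising diffusion probabilistic model with adversarial learning, and apply it to vessel segmentation.
In particular, for self-supervised vessel segmentation, DARL learns the background signal with a diffusion module, which lets the generation module effectively provide vessel representations.
In addition, through adversarial learning based on the proposed Switchable Spatial-Adaptive Denormalization, our model estimates synthetic fake vessel images as well as vessel segmentation masks, which further helps the model capture vessel-related semantic information.
Once trained, the proposed model generates a segmentation mask in a single step and can be applied to general vascular-structure segmentation in coronary angiography and retinal images.
Experimental results on various datasets show that our method significantly outperforms existing unsupervised and self-supervised vessel segmentation methods.
Each data sample consists of two images: $x_0^a$ is the angiography and $x_0^b$ is the background. During acquisition, $x_0^b$ is captured first, the patient is then injected with a contrast agent, and $x_0^a$ is captured afterwards; patient movement in between can leave the two images misaligned.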
The generation module consists of $N$ ResnetBlocks. The computation in each ResnetBlock is switchable (it depends on whether the current path is A or B). Let the feature map be $v\in\mathbb R^{B\times C\times H\times W}$, where $B, C, H, W$ are the batch size, number of channels, height, and width. The switchable layer computes the following:
On path A, where no mask $s$ is given:
$$v=\text{IN}(v)$$
where $\text{IN}(\cdot)$ is instance normalization.
On path B, where the mask $s$ is given:
$$v=\text{SPADE}(v,s)$$
where $\text{SPADE}(\cdot,\cdot)$ is defined as:
$$v_{b,c,h,w}=\gamma_{c,h,w}(s^f)\frac{v_{b,c,h,w}-\mu_c}{\sigma_c}+\beta_{c,h,w}(s^f)$$
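The switch between the two paths can be sketched for one channel of one sample, flattened to a list of $H\times W$ values. In this simplified setting the normalization statistics of IN and SPADE coincide. The functions $\gamma(\cdot)$ and $\beta(\cdot)$ are learned in the real layer; here they are replaced by hypothetical per-pixel affine functions of the mask with made-up weights `w_gamma` and `w_beta`.

```python
import math

def normalize(v, eps=1e-5):
    """Zero-mean, unit-variance normalization over the spatial positions of one channel."""
    mean = sum(v) / len(v)
    var = sum((x - mean) ** 2 for x in v) / len(v)
    return [(x - mean) / math.sqrt(var + eps) for x in v]

def switchable_layer(v, s=None, w_gamma=0.5, w_beta=0.2):
    """Path A (s is None): plain normalization. Path B: mask-modulated normalization."""
    v_hat = normalize(v)
    if s is None:
        return v_hat
    # Hypothetical stand-ins for the learned gamma(s) and beta(s):
    # gamma = 1 + w_gamma * s and beta = w_beta * s at each pixel.
    return [(1.0 + w_gamma * m) * x + w_beta * m for x, m in zip(v_hat, s)]

v = [1.0, 2.0, 3.0, 4.0]
out_a = switchable_layer(v)                        # path A, no mask
out_b = switchable_layer(v, s=[0, 0, 0, 1])        # path B, mask on the last pixel
```

Only the pixel covered by the mask is rescaled and shifted on path B, which is how the mask injects spatial vessel information into the generated image while the same block weights serve both paths.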
Finally, the model generates outputs as follows:
Path A: given the noisy angiography $x^a_{t_a}$, the diffusion module computes the latent $\epsilon_\theta(x_{t_a}^a,t_a)$, and the generation module $G$ produces the segmentation mask $\hat s^v$:
$$\hat s^v=G(\epsilon_\theta(x_{t_a}^a,t_a);0)$$
Path B: given the noisy background $x_{t_b}^b$, the diffusion module computes the latent $\epsilon_\theta(x_{t_b}^b,t_b)$; together with the foreground mask $s^f$, the generation module $G$ produces the angiography $\hat x^a$:
$$\hat x^a=G(\epsilon_\theta(x_{t_b}^b,t_b);s^f)$$
Training is illustrated in the figure above and uses three loss functions: $\mathcal L_{adv}$, $\mathcal L_{diff}$, and $\mathcal L_{cyc}$.
Adversarial loss $\mathcal L_{adv}$
This loss trains the generator and the discriminators simultaneously.
For training the generator:
$$\mathcal L_{adv}^G(\epsilon_\theta,G,D_s,D_a)=\mathbb E_{x^a}[(D_s(G(\epsilon_\theta(x^a);0))-1)^2]+\mathbb E_{x^b,s^f}[(D_a(G(\epsilon_\theta(x^b);s^f))-1)^2]$$
For training the discriminators:
$$\mathcal L_{adv}^{D_s}(\epsilon_\theta,G,D_s)=\frac12\mathbb E_{s^f}[(D_s(s^f)-1)^2]+\frac12\mathbb E_{x^a}[(D_s(G(\epsilon_\theta(x^a);0)))^2]$$
$$\mathcal L_{adv}^{D_a}(\epsilon_\theta,G,D_a)=\frac12\mathbb E_{x^a_0}[(D_a(x_0^a)-1)^2]+\frac12\mathbb E_{x^b,s^f}[(D_a(G(\epsilon_\theta(x^b);s^f)))^2]$$
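These are least-squares adversarial objectives: a discriminator is pushed to score real inputs as 1 and generated inputs as 0, while the generator is pushed to make the scores of its outputs equal 1. The sketch below treats the discriminators as black boxes and works directly on lists of their scalar scores, with expectations replaced by batch means; the function names and example scores are illustrative.

```python
def mean(values):
    return sum(values) / len(values)

def discriminator_loss(scores_real, scores_fake):
    """0.5 * E[(D(real) - 1)^2] + 0.5 * E[D(fake)^2]."""
    real_term = mean([(d - 1.0) ** 2 for d in scores_real])
    fake_term = mean([d ** 2 for d in scores_fake])
    return 0.5 * real_term + 0.5 * fake_term

def generator_loss(scores_fake_first, scores_fake_second):
    """Sum of E[(D(fake) - 1)^2] over the two kinds of generated samples."""
    first = mean([(d - 1.0) ** 2 for d in scores_fake_first])
    second = mean([(d - 1.0) ** 2 for d in scores_fake_second])
    return first + second

loss_d_ideal = discriminator_loss([1.0, 1.0], [0.0, 0.0])    # discriminator is never fooled
loss_d_unsure = discriminator_loss([0.5], [0.5])             # discriminator cannot tell
loss_g_ideal = generator_loss([1.0], [1.0])                  # generator fools both discriminators
```

The same `discriminator_loss` form applies to both $D_s$ and $D_a$; only the real and generated inputs whose scores are passed in differ.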
Diffusion loss $\mathcal L_{diff}$
This loss trains the diffusion model.
$$\mathcal L_{diff}(\epsilon_\theta)=\mathbb E_{t,x_0,\epsilon}[||\epsilon-\epsilon_\theta(\sqrt{\alpha_t}x_0+\sqrt{1-\alpha_t}\epsilon,t)||^2]$$
Cyclic reconstruction loss $\mathcal L_{cyc}$
This loss ensures that when the $\hat x^a$ generated from $s^f$ is fed back to generate $\hat s^f$, the reconstruction $\hat s^f$ is consistent with $s^f$.
$$\mathcal L_{cyc}(\epsilon_\theta,G)=\mathbb E_{x^b,s^f}[||G(\epsilon_\theta(G(\epsilon_\theta(x^b);s^f));0)-s^f||]$$
Finally, there are two total losses:
Total diffusion/generation loss:
$$\mathcal L^G(\epsilon_\theta,G,D_s,D_a)=\mathcal L_{diff}(\epsilon_\theta)+\alpha\mathcal L_{adv}^G(\epsilon_\theta,G,D_s,D_a)+\beta\mathcal L_{cyc}(\epsilon_\theta,G)$$
Total discriminator loss:
$$\mathcal L^D(\epsilon_\theta,G,D_s,D_a)=\mathcal L_{adv}^{D_a}(\epsilon_\theta,G,D_a)+\mathcal L_{adv}^{D_s}(\epsilon_\theta,G,D_s)$$
[36] Jiarui Xu, Sifei Liu, Arash Vahdat, Wonmin Byeon, Xiaolong Wang, Shalini De Mello. Open-Vocabulary Panoptic Segmentation with Text-to-Image Diffusion Models. CVPR, 2023.
Training:
Testing:
[1] Tao Chen, Chenhui Wang, Hongming Shan. BerDiff: Conditional Bernoulli Diffusion Model for Medical Image Segmentation. MICCAI, 2023.
[2] Armato III, S.G., McLennan, G., Bidaut, L., McNitt-Gray, M.F., Meyer, C.R., Reeves, A.P., Zhao, B., Aberle, D.R., Henschke, C.I., Hoffman, E.A., et al. The lung image database consortium (LIDC) and image database resource initiative (IDRI): a completed reference database of lung nodules on CT scans. Medical physics, 2011.
[3] Clark, K., Vendt, B., Smith, K., Freymann, J., Kirby, J., Koppel, P., Moore, S., Phillips, S., Maffitt, D., Pringle, M., et al. The cancer imaging archive (TCIA): maintaining and operating a public information repository. Journal of digital imaging, 2013.
[4] Baid, U., Ghodasara, S., Mohan, S., Bilello, M., Calabrese, E., Colak, E., Farahani, K., Kalpathy-Cramer, J., Kitamura, F.C., Pati, S., et al. The RSNA-ASNR-MICCAI BraTS 2021 benchmark on brain tumor segmentation and radiogenomic classification. arXiv:2107.02314, 2021.
[5] Xinrong Hu, Yu-Jen Chen, Tsung-Yi Ho, Yiyu Shi. Conditional Diffusion Models for Weakly Supervised Medical Image Segmentation. MICCAI, 2023.
[6] Bakas, S., Akbari, H., Sotiras, A., Bilello, M., Rozycki, M., Kirby, J.S., Freymann, J.B., Farahani, K., Davatzikos, C. Advancing the cancer genome atlas glioma mri collections with expert segmentation labels and radiomic features. Scientific data, 2017.
[7] Kavur, A.E., Gezer, N.S., Barış, M., Aslan, S., Conze, P.H., Groza, V., Pham, D.D., Chatterjee, S., Ernst, P., Özkan, S., Baydar, B., Lachinov, D., Han, S., Pauli, J., Isensee, F., Perkonigg, M., Sathish, R., Rajan, R., Sheet, D., Dovletov, G., Speck, O., Nürnberger, A., Maier-Hein, K.H., Bozdağı Akar, G., Ünal, G., Dicle, O., Selver, M.A. CHAOS Challenge - combined (CT-MR) healthy abdominal organ segmentation. Medical Image Analysis, 2021.
[8] Tianxu Lv, Yuan Liu, Kai Miao, Lihua Li, Xiang Pan. Diffusion Kinetic Model for Breast Cancer Segmentation in Incomplete DCE-MRI. MICCAI, 2023.
[9] Newitt, D., Hylton, N. Single site breast DCE-MRI data and segmentations from patients undergoing neoadjuvant chemotherapy. Cancer Imaging Arch, 2016.
[10] Hyun-Jic Oh, Won-Ki Jeong. DiffMix: Diffusion Model-Based Data Synthesis for Nuclei Segmentation and Classification in Imbalanced Pathology Image Datasets. MICCAI, 2023.
[11] G. Jignesh Chowdary, Zhaozheng Yin. Diffusion Transformer U-Net for Medical Image Segmentation. MICCAI, 2023.
[12] Jha, D., et al. Kvasir-SEG: a segmented polyp dataset. Springer, Cham, 2020.
[13] Bernal, J., Sánchez, F.J., Fernández-Esparrach, G., Gil, D., Rodríguez, C., Vilariño, F. Wm-dova maps for accurate polyp highlighting in colonoscopy: validation vs. saliency maps from physicians. Comput. Med. Imaging Graph, 2015.
[14] Codella, N.C., et al. Skin lesion analysis toward melanoma detection: a challenge at the 2017 international symposium on biomedical imaging (ISBI), hosted by the international skin imaging collaboration (isic). ISBI, 2018.
[15] Tschandl, P., Rosendahl, C., Kittler, H. The ham10000 dataset, a large collection of multi-source dermatoscopic images of common pigmented skin lesions. Scientific data, 2018.
[16] Orlando, J.I., et al. Refuge challenge: a unified framework for evaluating automated methods for glaucoma assessment from fundus photographs. Med. Image Anal, 2020.
[17] Xinyi Yu, Guanbin Li, Wei Lou, Siqi Liu, Xiang Wan, Yan Chen, and Haofeng Li. Diffusion-Based Data Augmentation for Nuclei Image Segmentation. MICCAI, 2023.
[18] Kumar, N., et al. A multi-organ nucleus segmentation challenge. IEEE Trans. Med. Imaging, 2019.
[19] Kumar, N., Verma, R., Sharma, S., Bhargava, S., Vahadane, A., Sethi, A. A dataset and a technique for generalized nuclear segmentation for computational pathology. IEEE Trans. Med. Imaging, 2017.
[20] Héctor Carrión and Narges Norouzi. FEDD - Fair, Efficient, and Diverse Diffusion-Based Lesion Segmentation and Malignancy Classification. MICCAI, 2023.
[21] Daneshjou, R., et al. Disparities in dermatology AI performance on a diverse, curated clinical image set. Sci. Adv, 2022.
[22] Mengxue Sun, Wenhui Huang, and Yuanjie Zheng. Instance-Aware Diffusion Model for Gland Segmentation in Colon Histology Images. MICCAI, 2023.
[23] Jianfeng Zhao and Shuo Li. Learning Reliability of Multi-modality Medical Images for Tumor Segmentation via Evidence-Identified Denoising Diffusion Probabilistic Models. MICCAI, 2023.
[24] Baid, U., et al. The rsna-asnr-miccai brats 2021 benchmark on brain tumor segmentation and radiogenomic classification. arXiv preprint arXiv:2107.02314, 2021.
[25] Bakas, S., et al. Advancing the cancer genome atlas glioma MRI collections with expert segmentation labels and radiomic features. Sci. Data, 2017.
[26] Menze, B.H., et al. The multimodal brain tumor image segmentation benchmark (brats). IEEE Trans. Med. Imaging, 2014.
[27] Jiacheng Wang, Jing Yang, Qichao Zhou, Liansheng Wang. Medical Boundary Diffusion Model for Skin Lesion Segmentation. MICCAI, 2023.
[28] Gutman, D., et al. Skin lesion analysis toward melanoma detection: A challenge at the international symposium on biomedical imaging (ISBI) 2016, hosted by the international skin imaging collaboration (ISIC). arXiv preprint arXiv:1605.01397, 2016.
[29] Mendonça, T., Ferreira, P.M., Marques, J.S., Marcal, A.R., Rozeira, J. PH 2-A dermoscopic image database for research and benchmarking. In: 2013 35th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), 2013.
[30] Junde Wu, Rao Fu, Huihui Fang, Yu Zhang, Yehui Yang, Haoyi Xiong, Huiying Liu, and Yanwu Xu. MedSegDiff: Medical Image Segmentation with Diffusion Probabilistic Model. MIDL, 2023.
[31] Fang, H., Li, F., Fu, H., Sun, X., Cao, X., Son, J., Yu, S., Zhang, M., Yuan, C., Bian, C., et al. Refuge2 challenge: Treasure for multi-domain learning in glaucoma assessment. arXiv preprint arXiv:2202.08994, 2022.
[32] Pedraza, L., Vargas, C., Narváez, F., Durán, O., Muñoz, E., Romero, E. An open access thyroid ultrasound image database. In: 10th International symposium on medical information processing and analysis, 2015.
[33] Junde Wu, Rao Fu, Huihui Fang, Yu Zhang, and Yanwu Xu. MedSegDiff-V2: Diffusion based Medical Image Segmentation with Transformer. arXiv preprint arXiv:2301.11798, 2023.
[34] Ji, Y., Bai, H., Yang, J., Ge, C., Zhu, Y., Zhang, R., Li, Z., Zhang, L., Ma, W., Wan, X., et al. Amos: A large-scale abdominal multi-organ benchmark for versatile medical image segmentation. arXiv preprint arXiv:2206.08023, 2022.
[35] Boah Kim, Yujin Oh, Jong Chul Ye. Diffusion Adversarial Representation Learning for Self-supervised Vessel Segmentation. ICLR, 2023.