[ADNI] Data Preprocessing (2): Obtaining Subject Slices

ADNI Series

1. [ADNI] Data Preprocessing (1): SPM, CAT12

2. [ADNI] Data Preprocessing (2): Obtaining subject slices

3. [ADNI] Data Preprocessing (3): CNNs

4. [ADNI] Data Preprocessing (4): Get top k slices according to CNNs

5. [ADNI] Data Preprocessing (5): Get top k slices (pMCI_sMCI) according to CNNs

6. [ADNI] Data Preprocessing (6): ADNI_slice_dataloader ||| show image


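ADNI (Alzheimer's Disease Neuroimaging Initiative) is an authoritative data center for Alzheimer's disease research, established in 2004 with joint funding from the U.S. National Institutes of Health and the National Institute on Aging. ADNI draws on a broad pool of participants, most of whom are still being followed, and its data are diverse: you can download either the raw data or data that have already undergone some standardized processing. Before training a deep learning model, the data need to be processed to suit the model. ADNI provides data in several modalities, such as MRI and PET. This post focuses on MRI images and covers the following points: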

=== Part 1: Obtaining the slices for each subject ===

1) Download the ADNI-825 data (all MRI) from the ADNI database. It contains five classes: AD, NC, sMCI, pMCI, and uMCI.

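2) Use SPM (with the CAT12 toolbox) to register the MRI images based on the AAL template. After this step each subject has four kinds of data: the original image, gray matter, white matter, and cerebrospinal fluid (CSF). This post works mainly with gray matter. How to process the data with SPM will be written up separately when time allows.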

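3) [ get_data_root.py ] Collect the paths of all .nii files in the 825 dataset by tissue type: CSF, gray_matter, white_matter, and original. The script writes .txt files (for example NC_gray_matter.txt) into the "825_Subject_NC" directory; copy them to the parent directory so that later scripts can use them.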

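4) [ get_slices_root.m ] Run this script in the root directory. It creates the output folders (for example NC_gray_matter_Slices) and slices each subject's gray-matter volume (.nii, 121*145*121) along the X, Y, and Z axes in turn, with a configurable stride. With a stride of 1 this yields 121 + 145 + 121 = 387 slices per subject.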

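5) [ get_slice_path.py ] Run this script in the root directory. It collects the parent directory of every slice and stores the result in the corresponding file, for example "AD_gray_matter_Slices_path.txt". The slice paths for subject S001 look like this: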

{ .\AD_gray_matter_Slices\sub1\S001\XSlice  

.\AD_gray_matter_Slices\sub1\S001\YSlice 

.\AD_gray_matter_Slices\sub1\S001\ZSlice }

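6) [ cal_entropy_slices.m ] Run "cal_entropy_slices.m" in the root directory. It computes the information entropy of every slice in each slice folder and sorts the slices by entropy in descending order.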

7) [ delete_slice_N.m ] Depending on your needs, keep the N slices with the highest entropy (for example N=32) and delete the rest.
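The slice count in step 4 can be checked with a few lines of Python. This is only a sketch: the volume is a synthetic NumPy array standing in for a gray-matter .nii file, and `slice_volume` is a hypothetical helper, not part of the original scripts.

```python
import numpy as np

def slice_volume(volume, stride=1):
    """Return a dict mapping axis name to the list of 2-D slices along that axis."""
    x, y, z = volume.shape
    return {
        "X": [volume[i, :, :] for i in range(0, x, stride)],
        "Y": [volume[:, j, :] for j in range(0, y, stride)],
        "Z": [volume[:, :, k] for k in range(0, z, stride)],
    }

# Synthetic stand-in for one gray-matter volume (a real one would be loaded from .nii).
volume = np.zeros((121, 145, 121), dtype=np.float32)
slices = slice_volume(volume, stride=1)
total = sum(len(v) for v in slices.values())
print(total)  # 121 + 145 + 121 = 387
```

With `stride=2` the same helper returns 61 + 73 + 61 = 195 slices, which is where the "roughly 387/2" figure for the stride-2 dataset comes from.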

At this point you have the slices for each subject. Following the steps above, I produced the following datasets:

AD_NC_ALL_SLICE  ## all slices of each subject (387)
AD_NC_except_entropy_zero  ## each subject's slices, excluding those with zero entropy
AD_NC_keep_SliceNum_33  ## the 33 highest-entropy slices of each subject
AD_NC_Slices_keep_SliceNum_81  ## the 81 highest-entropy slices of each subject
AD_NC_slice_stride_2_all_slice  ## one slice every 2 frames, i.e. roughly 387/2 slices
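The entropy-based selection behind the keep_SliceNum_* folders can be sketched in Python as follows. The assumption here is that the entropy is the Shannon entropy of the 256-bin gray-level histogram of an 8-bit image; `shannon_entropy` and `top_n_by_entropy` are hypothetical helpers, not the MATLAB code used in this post.

```python
import numpy as np

def shannon_entropy(image_u8):
    """Shannon entropy (in bits) of the 256-bin gray-level histogram of a uint8 image."""
    counts = np.bincount(image_u8.ravel(), minlength=256)
    probs = counts[counts > 0] / image_u8.size
    return float(-np.sum(probs * np.log2(probs)))

def top_n_by_entropy(named_slices, n):
    """Return the names of the n highest-entropy slices, highest first."""
    scored = [(shannon_entropy(img), name) for name, img in named_slices.items()]
    scored.sort(key=lambda t: t[0], reverse=True)
    return [name for _, name in scored[:n]]

# Three toy 4x4 "slices": constant, two gray levels, four gray levels.
named_slices = {
    "blank": np.zeros((4, 4), dtype=np.uint8),
    "two": np.array([[0, 0, 255, 255]] * 4, dtype=np.uint8),
    "four": np.tile(np.array([0, 64, 128, 255], dtype=np.uint8), (4, 1)),
}
kept = top_n_by_entropy(named_slices, 2)
print(kept)  # ['four', 'two']
```

A constant slice has zero entropy, so it is the first to be dropped; this is the same reasoning behind the except_entropy_zero dataset.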

=== Part 2 ===

1) [ step01_specified_subject_move_to_fold.py ] Using the path files produced by [ get_slice_path.py ] in step 5 of Part 1 (for example "NC_gray_matter_Slices_path.txt"), move the corresponding slices into the specified directory.

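2) [ step02_specified_subject_get_slice_train_val_test.py ] Using the slices and .txt files from Part 1, randomly select subjects according to the train : validation : test ratio and move their slices into the specified directories. Note that all slices from one subject must go into the same split (train, validation, or test). Each class then gets three .txt files recording the slice names of its subjects, as shown below: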

Randomly selecting the subjects for train, validation, and test:

import random

train_percentage = 0.75
val_percentage = 0.2
test_percentage = 0.05

	AD_subject_num = 199
	NC_subject_num = 230
	train_slice_num = 0
	val_slice_num = 0
	test_slice_num = 0

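	# label, slice_list, train_num/val_num/test_num and the *_slice_path variables
	# are defined elsewhere in the script (not shown in this excerpt).
	if label == "AD":
		subject_num = AD_subject_num
	elif label == "NC":
		subject_num = NC_subject_num
	else:
		raise ValueError("unsupported label: {}".format(label))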

	rondom_list = random.sample(range(1, subject_num+1), subject_num)

	# train: [0, train_num-1]
	print("===")
	print("{}, train_slice_path = {}".format(label, train_slice_path))
	with open(train_slice_path, "a") as train_txt:
		for i in range(train_num):
			rondom_id = rondom_list[i]
			rondom_id = "S" + str("%.3d"%rondom_id)
			
			for slice_item in slice_list:
				slice_name = slice_item.split(".")[0]
				subject_name = slice_name.split("_")[1]
				# print(subject_name)
				# print("rondom_id = {}".format(rondom_id))
				# print("subject_name = {}".format(subject_name))
				if (rondom_id == subject_name):
					# print(slice_item)
					train_txt.writelines(slice_item + "\n")
					train_slice_num = train_slice_num + 1

	# val: [train_num, train_num + val_num - 1]
	print("===")
	print("{}, val_slice_path = {}".format(label, val_slice_path))
	with open(val_slice_path, "a") as val_txt:
		for i in range(val_num):
			index = i + train_num
			rondom_id = rondom_list[index]
			rondom_id = "S" + str("%.3d"%rondom_id)
			for slice_item in slice_list:
				slice_name = slice_item.split(".")[0]
				subject_name = slice_name.split("_")[1]
				# print(subject_name)	
				if (rondom_id == subject_name):
					# print(slice_item)
					val_txt.writelines(slice_item + "\n")
					val_slice_num = val_slice_num + 1

	## test: [train_num + val_num, train_num + val_num + test_num - 1]
	print("===")
	print("{}, test_slice_path = {}".format(label, test_slice_path))
	with open(test_slice_path, "a") as test_txt:
		for i in range(test_num):
			index = i + train_num + val_num
			rondom_id = rondom_list[index]
			rondom_id = "S" + str("%.3d"%rondom_id)
			for slice_item in slice_list:
				slice_name = slice_item.split(".")[0]
				subject_name = slice_name.split("_")[1]
				# print(subject_name)	
				if (rondom_id == subject_name):
					# print(slice_item)
					test_txt.writelines(slice_item + "\n")
					test_slice_num = test_slice_num + 1
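The subject-level split above can be condensed into a short sketch. The excerpt does not show how train_num, val_num, and test_num are computed, so this version assumes that train and validation sizes are truncated products and that the test split takes the remainder. `split_subjects`, `subject_of`, and `assign_slices` are hypothetical names.

```python
import random

def subject_of(slice_name):
    """Extract the subject ID from a name such as 'GMAD14905_S067.jpg' -> 'S067'."""
    return slice_name.split(".")[0].split("_")[1]

def split_subjects(subject_num, train_pct=0.75, val_pct=0.2, seed=None):
    """Shuffle subject IDs and cut them into train/val/test lists.

    Assumption: the test split takes whatever is left after train and val.
    """
    rng = random.Random(seed)
    ids = ["S%03d" % i for i in rng.sample(range(1, subject_num + 1), subject_num)]
    train_num = int(subject_num * train_pct)
    val_num = int(subject_num * val_pct)
    return ids[:train_num], ids[train_num:train_num + val_num], ids[train_num + val_num:]

def assign_slices(slice_list, train_ids, val_ids, test_ids):
    """Group slice names by split, looking each subject up once in a dict."""
    split_of = {}
    for name, ids in (("train", train_ids), ("val", val_ids), ("test", test_ids)):
        for sid in ids:
            split_of[sid] = name
    groups = {"train": [], "val": [], "test": []}
    for slice_name in slice_list:
        split = split_of.get(subject_of(slice_name))
        if split is not None:
            groups[split].append(slice_name)
    return groups

train_ids, val_ids, test_ids = split_subjects(199, seed=0)
print(len(train_ids), len(val_ids), len(test_ids))  # 149 39 11
```

Because the split is made on subject IDs and slices are assigned afterwards, slices from one subject cannot end up in two different splits. The dictionary lookup also avoids rescanning the whole slice list once per subject.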

The selected slice names are recorded in the following files:

## label = AD
ADGM_test_single_subject_data_fold_01_train_val_test_entropy_keep_SliceNum_33.txt
ADGM_train_single_subject_data_fold_01_train_val_test_entropy_keep_SliceNum_33.txt
ADGM_val_single_subject_data_fold_01_train_val_test_entropy_keep_SliceNum_33.txt

## label = NC
NCGM_test_single_subject_data_fold_01_train_val_test_entropy_keep_SliceNum_33.txt
NCGM_train_single_subject_data_fold_01_train_val_test_entropy_keep_SliceNum_33.txt
NCGM_val_single_subject_data_fold_01_train_val_test_entropy_keep_SliceNum_33.txt
GMAD14905_S067.jpg  ## slice file name; all slices of subject_id = S067 are placed in the same directory
GMAD14921_S067.jpg
GMAD14891_S067.jpg
GMAD14855_S067.jpg
GMAD14902_S067.jpg
GMAD14864_S067.jpg
GMAD14858_S067.jpg
GMAD14867_S067.jpg

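The [ step02_specified_subject_get_slice_train_val_test.py ] script has a second function: it copies each slice into its target directory, so that the images end up in the layout deep learning frameworks expect (here three directories: train, validation, and test). The code is as follows: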

import os
import shutil

def move_slice_to_train_val_test_fold(slice_path_txt_list, slice_txt_path, target_path):
	# slice_path_txt_list: directory that currently holds the slices
	# slice_txt_path: .txt file listing one slice file name per line
	# target_path: destination directory (train, validation or test)
	if not os.path.exists(slice_txt_path):
		print("slice txt does not exist: {}".format(slice_txt_path))
		return
	with open(slice_txt_path, "r") as slice_list_file:
		for slice_name in slice_list_file:
			slice_name = slice_name.strip()
			if not slice_name.endswith(".jpg"):
				continue	# skip blank lines and non-jpg entries
			slice_pos = os.path.join(slice_path_txt_list, slice_name)
			target_slice_pos = os.path.join(target_path, slice_name)
			try:
				# copyfile leaves the source in place (copy, not move)
				shutil.copyfile(slice_pos, target_slice_pos)
			except OSError as err:
				print("failed to copy {}: {}".format(slice_pos, err))

The data are now ready. A deep learning model can be fitted to them for binary classification (AD vs. NC). The directory structure is as follows:

 

reserch@lab406-research:~/documents/deeplearning/alzheimers_disease/ADNI-825-Slice/experiments_FineTunning/single_subject_data_fold_01_train_val_test_entropy_keep_SliceNum_33$ tree -L 2
.
├── test
│   ├── AD  # 1089
│   └── NC  # 1188
├── train
│   ├── AD  # 14751
│   └── NC  # 17028
└── validation
    ├── AD  # 3862
    └── NC  # 4554
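The per-folder counts shown in the tree can be reproduced with a small helper like the one below. It is a sketch that assumes the train/validation/test layout above with .jpg slices; `count_images` is a hypothetical name. As a side observation, several of the listed counts are multiples of 99 (for example 14751 = 149 × 99 and 17028 = 172 × 99), which would be consistent with 99 slices per subject, but that is an inference from the numbers rather than something the scripts state.

```python
import os

def count_images(root, splits=("train", "validation", "test"), classes=("AD", "NC")):
    """Count .jpg files in every <root>/<split>/<class> folder; missing folders count as 0."""
    counts = {}
    for split in splits:
        for cls in classes:
            folder = os.path.join(root, split, cls)
            names = os.listdir(folder) if os.path.isdir(folder) else []
            counts[(split, cls)] = sum(1 for n in names if n.lower().endswith(".jpg"))
    return counts
```

Calling `count_images(".")` from the experiment directory returns a dictionary keyed by (split, class), which makes it easy to compare against the numbers in the tree listing.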

=== Source Code ===

 

【get_data_root.py】

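Purpose: recursively collect the paths of all .nii files under a class root directory and write them to per-tissue .txt lists (for example NC_gray_matter.txt).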

#!/usr/bin/python
# -*- coding: utf-8 -*-
import os
import re
import time
import datetime

def write_nii_addr(root_path, file_path, original_doc, gray_matter_doc, white_matter_doc, CSF_doc, lable):
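	### Parameters
	# root_path: root directory of one class, e.g. 825_Subject_NC; it does not change during recursion and is where the .txt files are written
	# file_path: changes as the recursion descends, until .nii files are found
	# original_doc, gray_matter_doc, white_matter_doc, CSF_doc: per-tissue folders created by create_modal_file (not used in this function)
	# lable: class label of the data: AD, NC, pMCI, sMCI or uMCI

	# walk everything under file_path, including subdirectories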
	files = os.listdir(file_path)
	for file_name in files:
		_file_path = os.path.join(file_path, file_name)
		if os.path.isdir(_file_path):
			write_nii_addr(root_path, _file_path, original_doc, gray_matter_doc, white_matter_doc, CSF_doc, lable)
		else:
			# splitext is safe for names with no dot or with several dots
			pre_fix, postfix = os.path.splitext(file_name)
			if (postfix == ".nii"):
				# gray_matter
				if (re.match('mwp1', pre_fix)):
					_name = lable + "_gray_matter.txt"
					# print("[xx] = {}".format(file_path))
					with open(os.path.join(root_path, _name),"a") as f:
						f.writelines(_file_path+"\n")

				# white_matter
				elif (re.match('mwp2', pre_fix)):
					_name = lable + "_white_matter.txt"
					with open(os.path.join(root_path, _name),"a") as f:
						f.writelines(_file_path+"\n")

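				# CSF (this script treats the 'wm' prefix as CSF; in CAT12's usual naming 'wm*' is the
				# normalized bias-corrected image and CSF would be 'mwp3*', so verify against your own output)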
				elif (re.match('wm', pre_fix)):
					_name = lable + "_CSF.txt"
					with open(os.path.join(root_path, _name),"a") as f:
						f.writelines(_file_path+"\n")

				# original
				else:
					_name = lable + "_original.txt"
					with open(os.path.join(root_path, _name),"a") as f:
						f.writelines(_file_path+"\n")
				
				
				# print(os.path.join(file_path))

def create_modal_file(root_path, lable):
	# Folders: original image, gray matter, white matter, CSF
	# documents: original, gray_matter, white_matter, CSF
	# original: ADNI_002_S_0619_MR_MPR-R__GradWarp__N3_Br_20070411125307309_S15145_I48616.nii
	# gray matter: mwp1ADNI_002_S_0619_MR_MPR-R__GradWarp__N3_Br_20070411125307309_S15145_I48616.nii
	# white matter: mwp2ADNI_002_S_0619_MR_MPR-R__GradWarp__N3_Br_20070411125307309_S15145_I48616.nii
	# CSF: wmADNI_002_S_0619_MR_MPR-R__GradWarp__N3_Br_20070411125307309_S15145_I48616.nii
	original_doc = os.path.join(root_path, "original")
	gray_matter_doc = os.path.join(root_path, "gray_matter")
	white_matter_doc = os.path.join(root_path, "white_matter")
	CSF_doc = os.path.join(root_path, "CSF")

	if not os.path.exists(original_doc):
		print("Create file original_doc = {}".format(original_doc))
		os.makedirs(original_doc)

	if not os.path.exists(gray_matter_doc):
		print("Create file gray_matter_doc = {}".format(gray_matter_doc))
		os.makedirs(gray_matter_doc)

	if not os.path.exists(white_matter_doc):
		print("Create file white_matter_doc = {}".format(white_matter_doc))
		os.makedirs(white_matter_doc)

	if not os.path.exists(CSF_doc):
		print("Create file CSF_doc = {}".format(CSF_doc))
		os.makedirs(CSF_doc)


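	# Back up all .txt files currently in the root directory, then remove them from it
	import shutil
	backup_file = os.path.join(root_path, "backup")
	# zero-padded date (YYYYMMDD) avoids ambiguous stamps such as 2018111
	date = datetime.datetime.now().strftime("%Y%m%d")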
	if not os.path.exists(backup_file):
		print("Create file backup_file = {}".format(backup_file))
		os.makedirs(backup_file)
	files = os.listdir(root_path)
	for file in files:
		print("[backup] file = {}".format(file))
		if not os.path.isdir(os.path.join(root_path, file)):	# listdir returns bare names, so join with root_path
			if (len(file.split('.'))>1):
				if (file.split('.')[1] == "txt"):
					old_name = file
					new_name = date + "_" + str(file)
					print("old_name = {}".format(old_name))
					print("new_name = {}".format(new_name))
					os.rename(os.path.join(root_path, old_name), os.path.join(root_path, new_name))
					source_dir = os.path.join(root_path, new_name)
					target_dir = os.path.join(root_path, "backup")
					shutil.copy(source_dir, target_dir)
					os.remove(source_dir)

	# main logic
	file_path = root_path
	write_nii_addr(root_path, file_path, original_doc, gray_matter_doc, white_matter_doc, CSF_doc, lable)


# Recursively walk every file under the root directory
if __name__=="__main__":
	# root_path_AD = r'.\825_Subject_AD'
	# create_modal_file(root_path_AD, "AD")

	root_path_NC = r'.\825_Subject_NC'
	create_modal_file(root_path_NC, "NC")

	# root_path_pMCI = r'.\825_subject_pMCI'
	# create_modal_file(root_path_pMCI, "pMCI")

	# root_path_sMCI = r'.\825_Subject_sMCI'
	# create_modal_file(root_path_sMCI, "sMCI")

	# root_path_uMCI = r'.\825_Subject_uMCI'
	# create_modal_file(root_path_uMCI, "uMCI")
	
	
### run it 
### python .\get_data_root.py > result.txt
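For comparison, the same prefix-based classification can be written with os.walk instead of manual recursion. This is a sketch, not a replacement for the script above: `classify_nii` and `collect_nii_paths` are hypothetical names, and the prefix table simply mirrors the mapping the script uses (including its treatment of the 'wm' prefix as CSF).

```python
import os

# (prefix, tissue) pairs, checked in order; mirrors the mapping used in get_data_root.py.
PREFIX_TO_TISSUE = [("mwp1", "gray_matter"), ("mwp2", "white_matter"), ("wm", "CSF")]

def classify_nii(file_name):
    """Return the tissue label for a .nii file name, or None for any other file."""
    stem, ext = os.path.splitext(file_name)
    if ext != ".nii":
        return None
    for prefix, tissue in PREFIX_TO_TISSUE:
        if stem.startswith(prefix):
            return tissue
    return "original"

def collect_nii_paths(root):
    """Walk root recursively and group .nii paths by tissue label."""
    result = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in sorted(files):
            tissue = classify_nii(name)
            if tissue is not None:
                result.setdefault(tissue, []).append(os.path.join(dirpath, name))
    return result
```

Returning a dictionary instead of appending to files keeps the traversal separate from the output step, so the lists can be written once at the end without the backup-and-delete bookkeeping.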

【get_slices_root.m】

function num_list = get_slices_root(filelistname, save_path, slice_stride)
%% Parameters
% filelistname: ./AD_gray_matter.txt (this file is in the root directory)
% save_path: ./AD_gray_matter_Slices/ (the output directory)
% slice_stride: stride between slices (1 keeps every slice)

%%
    % clc
    % clear all
    % filelistname = '/home/kyuuki/Documents/py/pMCIlist.txt';
    % save_path = '/home/kyuuki/Documents/MriSlices2/pMCI/';
    % AD_gray_file_list = './AD_gray_matter.txt';
    % NC_gray_file_list = './NC_gray_matter.txt';
    % all_filelistname = [AD_gray_file_list, NC_gray_file_list];
    % for i=1:length(all_filelistname)
    %     all_filelistname(i)
    % end
    
    % filelistname = './AD_gray_matter.txt';
    % save_path = './AD_gray_matter_Slices/';
    
    %% Example calls
    % num_list = get_slices_root('./AD_gray_matter.txt', './AD_gray_matter_Slices/', 1)
    % num_list = get_slices_root('./NC_gray_matter.txt', './NC_gray_matter_Slices/', 1)
    % num_list = get_slices_root('./AD_original.txt', './AD_original_Slices/', 1)
    % num_list = get_slices_root('./NC_original.txt', './NC_original_Slices/', 1)
    % num_list = get_slices_root('./AD_white_matter.txt', './AD_white_matter_Slices/', 1)
    % num_list = get_slices_root('./NC_white_matter.txt', './NC_white_matter_Slices/', 1)

    %%
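    % tline = fgetl(fid): read one line from the file and strip the trailing newline
    % feof(fid): returns true when the file pointer fid has reached the end of the file, false otherwise
        % while ~feof(fid): keep looping until the end of the file is reached
        % while feof(fid): the condition is false until the end of the file, so the loop body does not run on a non-empty file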

    fpn = fopen(filelistname,'rt');
    sbj_conter = 0;
    filelist = {};
    while feof(fpn) ~= 1
        sbj_conter = sbj_conter + 1;
        tline = fgetl(fpn);
        % disp(tline);
        filelist{sbj_conter, 1} = tline;
    end

    fclose(fpn);

    %%
    for i = 1 : sbj_conter
        %% 5-fold
        if i <= sbj_conter * 0.2
            sub = 1;
        else
            if i <= sbj_conter * 0.4
                sub = 2;
            else
                if i <= sbj_conter * 0.6
                    sub = 3;
                else
                    if i <= sbj_conter * 0.8
                        sub = 4;
                    else 
                        sub = 5;
                    end
                end
            end
        end
        subfold = strcat('sub',num2str(sub),'\');
        %%           
        %slice_conter = 0;
        file = filelist{i ,1};
        % file = strcat('..\', file, '\');
        disp(sprintf('[%d]...... %s', i, file));
        try
            nii = load_nii(file);
        catch
            disp('load_nii() failure...Using load_untouch_nii()')
            nii = load_untouch_nii(file);  % assign the result; otherwise nii is undefined or stale below
        end
        
        %mkdir(save_path,strcat('S',num2str(sbj_conter,'%03d')))
        sbj_fold = strcat(save_path, subfold, strcat('S',num2str(i,'%03d')), '\');    
        img = nii.img;
        % img = mapmm(img);
        [x,y,z] = size(img);
        % Z
        mkdir(sbj_fold,'ZSlice');
        ZSlicepath = strcat(sbj_fold,'ZSlice','\');
        for j = 1:slice_stride:z
            slice = img(:,:,j);
            slice_path = strcat(ZSlicepath,'slice_Z',num2str(j),'.jpg');
            if exist(slice_path)>0
                disp(fprintf('[exist - delete] slice_path = %s  \r\n', slice_path));
                delete(slice_path);
            end
            imwrite(slice, slice_path, 'Quality', 100)
        end
        %% Y
        mkdir(sbj_fold,'YSlice');
        YSlicepath = strcat(sbj_fold,'YSlice','\');
        for j = 1:slice_stride:y
            slice = reshape(img(:,j,:),[x,z]);
            slice_path = strcat(YSlicepath,'slice_Y',num2str(j),'.jpg');
            if exist(slice_path)>0
                disp(fprintf('[exist - delete] slice_path = %s  \r\n', slice_path));
                delete(slice_path);
            end
            imwrite(slice, slice_path, 'Quality', 100)
        end    
        %% X
        mkdir(sbj_fold,'XSlice');
        XSlicepath = strcat(sbj_fold,'XSlice','\');
        for j = 1:slice_stride:x
            slice = reshape(img(j,:,:),[y,z]);
            slice_path = strcat(XSlicepath,'slice_X',num2str(j),'.jpg');
            if exist(slice_path)>0
                disp(fprintf('[exist - delete] slice_path = %s  \r\n', slice_path));
                delete(slice_path);
            end
            imwrite(slice, slice_path, 'Quality', 100)
        end
    end
    
    num_list = sbj_conter;
end

【get_slice_path.py】

#!/usr/bin/python
# -*- coding: utf-8 -*-
import os
import re
import datetime


def write_nii_addr(root_path, save_file, last_root_path):
	### Parameters
	# root_path: directory currently being scanned; the recursion descends into its subdirectories
	# save_file: open file handle that receives the selected slice-folder paths
	# last_root_path: parent directory, used to restore root_path after each recursive call

	# walk everything under root_path, including subdirectories
	files = os.listdir(root_path)
	# _file_path = root_path
	for file_name in files:
		# print("file_name = {}".format(file_name))
		next_root_path = os.path.join(root_path, file_name)
		# print("next_root_path = {}".format(next_root_path))
		# print("last_root_path = {}".format(last_root_path))
		if os.path.isdir(next_root_path):
			last_root_path = root_path
			root_path = next_root_path
			# print("root_path = {}".format(root_path))
			# print("last_root_path = {}".format(last_root_path))
			write_nii_addr(root_path, save_file, last_root_path)

			# select the matching path and save it
			selected_path = select_slice_path(root_path)
			if (selected_path != "NONE"):
				save_file.writelines(selected_path + "\n")

			root_path = last_root_path	# restore root_path after the recursive call
			# break
		# else:
			# print("break")
			# print("root_path = {}".format(root_path))
			# print("last_root_path = {}".format(last_root_path))
			# print("next_root_path = {}".format(next_root_path))
			# root_path = last_root_path	# feedback
			# print("---")
			# break
		# else:
			# print("root_path = {}".format(root_path))
			# postfix = file_name.split('.')[1]
			# if (postfix == "nii"):
			# 	pre_fix = file_name.split('.')[0]
			# 	# gray_matter
			# 	if (re.match('mwp1', pre_fix)):
			# 		_name = lable + "_gray_matter.txt"
			# 		# print("[xx] = {}".format(root_path))
			# 		with open(os.path.join(root_path, _name),"a") as f:
			# 			f.writelines(_file_path+"\n")
				
def select_slice_path(file_path):
	# satisified_path = ['XSlice', 'YSlice', 'ZSlice']
	satisified_path = ['YSlice']
	target_file = "NONE"
	for item in satisified_path:
		if item in file_path:
			print("file_path = {}".format(file_path))
			target_file = file_path
			break

	return target_file

def execute(root_path, save_file_name):
	save_file_path = os.path.join(root_path, save_file_name)
	# print("save_file_path = {}".format(save_file_path))
	if os.path.exists(save_file_path):
		# zero-padded timestamp (YYYYMMDDHHMMSS) keeps backup names unambiguous
		date = datetime.datetime.now().strftime("%Y%m%d%H%M%S")
		new_name = save_file_path +".bak" + date
		os.rename(save_file_path, new_name)
		print("renamed existing file, new_name = {}".format(new_name))
		# os.remove(save_file_path)

	with open(save_file_path,"a") as save_file:
		write_nii_addr(root_path, save_file, "")
	print("DONE... root_path = {}".format(root_path))

# Recursively walk every folder under the root directory
if __name__=="__main__":
	# root_path = './AD_NC_except_entropy_zero/AD_gray_matter_Slices_except_entropy_zero'
	# save_file_name = 'AD_gray_matter_Slices_path.txt'
	# execute(root_path, save_file_name)

	# root_path = './AD_NC_except_entropy_zero/NC_gray_matter_Slices_except_entropy_zero'
	# save_file_name = 'NC_gray_matter_Slices_path.txt'
	# execute(root_path, save_file_name)

	# root_path = './AD_white_matter_Slices'
	# save_file_name = 'AD_white_matter_Slices_path.txt'
	# execute(root_path, save_file_name)

	# root_path = './NC_white_matter_Slices'
	# save_file_name = 'NC_white_matter_Slices_path.txt'
	# execute(root_path, save_file_name)


	root_path = './AD_NC_except_entropy_zero/AD_gray_matter_Slices_except_entropy_zero'
	save_file_name = 'AD_gray_matter_Slices_path_y.txt'
	execute(root_path, save_file_name)

	root_path = './AD_NC_except_entropy_zero/NC_gray_matter_Slices_except_entropy_zero'
	save_file_name = 'NC_gray_matter_Slices_path_y.txt'
	execute(root_path, save_file_name)

【cal_entropy_slices.m】

 

function [total_num_entropy_cal, slice_list] = cal_entropy_slices(file_list_name)
    % close all; clear all; clc;
    %% example: 
    % file_list_name = AD_gray_matter_Slices
    % slice_path_file = AD_gray_matter_Slices_path.txt
    % entropy_value_file = entropy_value_AD_gray_matter_Slices.txt
    
    %% 
    total_num_entropy_cal = 0;
    slice_path_file = strcat('\', file_list_name, '_path.txt');
    file_path = strcat('.\', file_list_name, slice_path_file);
    fpn = fopen(file_path,'rt');
    num_dir = 0;
    file_list = {};
    while feof(fpn) ~= 1
        num_dir = num_dir + 1;
        tline = fgetl(fpn);
        % disp(tline);
        file_list{num_dir, 1} = tline;
    end
    fclose(fpn);
    
    entropy_value_file = strcat('entropy_value_', file_list_name, '.txt');
    
    %% compute the entropy of each slice
    for i = 1:num_dir
        cur_num_entropy_cal = 0;
        dir_path = file_list{i ,1};
        %% save file
        save_path = strcat(dir_path, '\', entropy_value_file);
        
        
        %% if the entropy file already exists, delete it first
        if exist(save_path, 'file')
            delete(save_path);
            fprintf('Deleted file [%s].\n', save_path);
        end
        save_file = fopen(save_path, 'a');
        % disp(dir_path);
        %% calculate entropy value 
        %% list all files in the folder: http://blog.csdn.net/chenriwei2/article/details/42321851
        slice_list = dir(fullfile(dir_path));
        num_slice = size(slice_list,1);
        %% store the values in an array so that they can be sorted by entropy
        % entropy_val_array = zeros(num_slice - 2, 1);
        % entropy_val_struct = struct('value', zeros(num_slice - 2, 1), 'name', {});
        sorted_entropy_value_arr = zeros(num_slice - 2, 1);
        sorted_name_cell = {};
        for slice_list_index = 3:num_slice
            % slice_list(3) = 'entropy_value_AD_gray_matter_Slices.txt'
            image_name = slice_list(slice_list_index).name;
            slice_path = strcat(dir_path, '\', image_name);
            try
                image = imread(slice_path);
                entropy_value = entropy(image);
                % count the slice only after it has been read successfully,
                % so that non-image files leave no empty entry in the arrays
                cur_num_entropy_cal = cur_num_entropy_cal + 1;
                sorted_entropy_value_arr(cur_num_entropy_cal) = entropy_value;
                sorted_name_cell{cur_num_entropy_cal, 1} = image_name;
            catch
                fprintf('File [%s] is not a readable image; skipped.\n', slice_path);
            end
        end
        
        % sorted_entropy_value: the entropy values sorted in descending order
        % sorted_name_index: the original index of each value, used to look up the image name
        [sorted_entropy_value, sorted_name_index] = sort([sorted_entropy_value_arr], 'descend');
        
        % sorted_name_index
        for index = 1:cur_num_entropy_cal
            try
                value_ = sorted_entropy_value(index);
                name_ = sorted_name_cell{sorted_name_index(index), 1};
                fprintf(save_file,'%s, %d \r\n', name_, value_);
                % fprintf(save_file, '\r\n');
            catch
                disp(fprintf('[error] index = %d', index));
            end
            
        end
        
        fclose(save_file);
        
        % disp(sprintf('num_entropy_cal = %d', cur_num_entropy_cal));
        total_num_entropy_cal = total_num_entropy_cal + cur_num_entropy_cal;
    end
    %%
    disp(sprintf('num_entropy_cal = %d', total_num_entropy_cal));
    
    %% Example calls
    % [total_num_entropy_cal, slice_list] = cal_entropy_slices('AD_gray_matter_Slices')
    % [total_num_entropy_cal, slice_list] = cal_entropy_slices('NC_gray_matter_Slices')
end

【delete_slice_N.m】

function [deleted_slice_num] = delete_slice_N(file_list_name, num_not_delete)
    deleted_slice_num = 0;
    %% example: 
    % file_list_name = AD_gray_matter_Slices
    % slice_path_file = AD_gray_matter_Slices_path.txt
    % entropy_value_file = entropy_value_AD_gray_matter_Slices.txt
    % num_not_delete = 61;   % number of slices to keep
    %% 
    slice_path_file = strcat('\', file_list_name, '_path.txt');
    file_path = strcat('.\', file_list_name, slice_path_file);
    fpn = fopen(file_path,'rt');
    num_dir = 0;
    file_list = {};
    while feof(fpn) ~= 1
        num_dir = num_dir + 1;
        tline = fgetl(fpn);
        % disp(tline);
        file_list{num_dir, 1} = tline;
    end
    fclose(fpn);
    
    entropy_value_file = strcat('entropy_value_', file_list_name, '.txt');
    
    %%
    for i = 1:num_dir
        dir_path = file_list{i ,1};
        %% save file
        Slices_path = strcat(dir_path, '\', entropy_value_file);
        % Slices_path = 
        %% only process folders whose entropy file exists
        if exist(Slices_path)>0
            disp(fprintf('File [%s] exist...', Slices_path));
            Slices_path_file = fopen(Slices_path, 'rt');
            slice_cell = {};
            slice_num = 0;
           %% read the entropy_value file line by line; slices listed after the first num_not_delete entries (the lowest-entropy ones) are deleted by name
            % slice_X106.jpg, 5.665666e-01
            while feof(Slices_path_file) ~= 1
                slice_num = slice_num + 1;
                tline = fgetl(Slices_path_file);
                % disp(tline);
                slice_cell{slice_num, 1} = tline;
            end
            fclose(Slices_path_file);
             
            %%
            if (num_not_delete < slice_num)
                for k = (num_not_delete+1):slice_num  % k avoids shadowing the outer loop variable i
                    % delete the low-entropy slices
                    slice_line = slice_cell{k, 1};
                    slice_line_split = regexp(slice_line, ',', 'split');
                    slice_name = strtrim(char(slice_line_split(1)));
                    slice_entropy_value = strtrim(char(slice_line_split(2)));
                    try
                        % 排除文件名为空
                        if strcmp(slice_name, '')
                            disp(fprintf('slice_name = %s is null. \r\n', slice_name));
                        else
                            delete_slice_path = strcat(dir_path, '\', slice_name);
                            if exist(delete_slice_path)>0
                                delete(delete_slice_path);
                                deleted_slice_num = deleted_slice_num + 1;
                                disp(fprintf('[deleted] %s \r\n', slice_line));
                            else
                                disp(fprintf('[Not exist] slice_line = %s not exist. \r\n', slice_line));
                            end
                        end
                        
                    catch
                        fprintf('[delete error] slice_name = %s\n', slice_name);
                    end
                    % delete end ...
                end
            else
                disp('[error] num_not_delete >= slice_num ......');
            end

        end
        
    end
    
    %% Example calls
    % [deleted_slice_num] = delete_slice_N('AD_gray_matter_Slices', 101)
    % [deleted_slice_num] = delete_slice_N('NC_gray_matter_Slices', 101)

end
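The core of delete_slice_N.m is a parse-and-slice step: read the ranked entropy file and take every name after the first N. A minimal Python sketch of that step is shown below, assuming lines of the form `slice_X106.jpg, 5.665666e-01` already sorted by descending entropy; `names_to_delete` is a hypothetical helper and it does not touch the file system.

```python
def names_to_delete(lines, keep):
    """Given ranked 'name, entropy' lines, return the slice names beyond the first `keep`."""
    names = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines, e.g. a trailing empty line
        name = line.split(",")[0].strip()
        if name:
            names.append(name)
    return names[keep:]

ranked = [
    "slice_X60.jpg, 6.1e+00 ",
    "slice_X61.jpg, 5.9e+00 ",
    "slice_X106.jpg, 5.665666e-01 ",
    "",
]
print(names_to_delete(ranked, 2))  # ['slice_X106.jpg']
```

Separating the decision (which names to drop) from the deletion itself makes it possible to review the list before any file is removed.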

 
