Li's Radiomics Video Study Notes (36): Implementing a Cluster Dendrogram in Python

Author: 北欧森林
Link: https://www.jianshu.com/p/8cf61983d22a
Source: Jianshu (简书), reposted with permission

RadiomicsWorld.com, the "Radiomics World" (影像组学世界) forum

These notes follow the radiomics tutorial video series by 有Li, an uploader on Bilibili.
This installment (36) covers implementing a cluster dendrogram in Python.

Pay attention to the SciPy version: plotting the dendrogram raises an error under scipy 1.5.0, while 1.5.2 and 1.2.1 work without problems.
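
A quick way to see which SciPy release is installed before plotting is sketched below. The helper name `version_tuple` is hypothetical and introduced only for illustration, and the warning merely restates the version observation above; it is not an official compatibility check.

```python
import scipy


def version_tuple(version):
    """Convert a version string such as '1.5.2' into a tuple of ints.

    Hypothetical helper: non-digit suffixes such as 'rc1' are ignored.
    """
    parts = []
    for piece in version.split('.')[:3]:
        digits = ''
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits) if digits else 0)
    return tuple(parts)


installed = version_tuple(scipy.__version__)
print('Installed scipy version:', scipy.__version__)

# Assumption taken from the note above: 1.5.0 is the release reported to fail.
if installed == (1, 5, 0):
    print('scipy 1.5.0 is reported to fail here; consider 1.5.2 instead.')
```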

# modified from https://www.machinelearningplus.com/plots/top-50-matplotlib-visualizations-the-master-plots-python/
 
import matplotlib.pyplot as plt
import pandas as pd
import scipy.cluster.hierarchy as shc
 
# import data
df = pd.read_csv('https://raw.githubusercontent.com/selva86/datasets/master/USArrests.csv')
print(df)
 
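# Plot: Ward linkage on the four numeric columns, one leaf per state
plt.figure(figsize=(16, 10), dpi=80)
plt.title("USArrests Dendrogram", fontsize=22)
Z = shc.linkage(df[['Murder', 'Assault', 'UrbanPop', 'Rape']], method='ward')
# Links that merge below a height of 100 are colored by cluster
dend = shc.dendrogram(Z, labels=df.State.values, color_threshold=100)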
 
plt.xticks(fontsize=12)
#plt.savefig('USArrests_Dendograms.png')
plt.show()

[Image 1: dendrogram of the USArrests data]
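
The `color_threshold` argument only affects how the tree is colored. To obtain the cluster membership itself, the same linkage matrix can be cut at a chosen height with `scipy.cluster.hierarchy.fcluster`. The sketch below uses synthetic data rather than USArrests so that it runs offline; the three blob centers, the noise scale, and the threshold of 10 are all assumptions made for illustration.

```python
import numpy as np
import scipy.cluster.hierarchy as shc

# Synthetic data (assumption): three well-separated 2-D blobs of 10 points each
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
points = np.vstack([c + rng.normal(scale=0.5, size=(10, 2)) for c in centers])

# Same linkage call as in the plot, applied to the synthetic points
Z = shc.linkage(points, method='ward')

# Cut the tree at a fixed height: observations whose merges all happen
# below this height end up in the same flat cluster
threshold = 10.0  # assumed value, chosen between the small and large merge heights
flat = shc.fcluster(Z, t=threshold, criterion='distance')

n_clusters = len(np.unique(flat))
print('number of flat clusters:', n_clusters)
print('cluster label per point:', flat)
```

A reasonable way to pick the height is to read the merge heights off the y-axis of the dendrogram and place the cut in a wide gap between them.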

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import scipy.cluster.hierarchy as shc
from sklearn.utils import shuffle
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LassoCV
 
# Load the two groups (one Excel file per class) and label them 0 and 1
xlsx1_filePath = '/Users/Mac/Documents/JianShuNotes/data/aa.xlsx'
xlsx2_filePath = '/Users/Mac/Documents/JianShuNotes/data/bb.xlsx'
data_1 = pd.read_excel(xlsx1_filePath)
data_2 = pd.read_excel(xlsx2_filePath)
rows_1, __ = data_1.shape
rows_2, __ = data_2.shape
data_1.insert(0, 'label', [0] * rows_1)
data_2.insert(0, 'label', [1] * rows_2)

# Combine the groups, shuffle the rows, and replace missing values with 0
data = pd.concat([data_1, data_2])
data = shuffle(data)
data = data.fillna(0)

# Split into features and label, then standardize each feature
X = data[data.columns[1:]]
y = data['label']
colNames = X.columns
X = X.astype(np.float64)
X = StandardScaler().fit_transform(X)  # returns a NumPy array, so the column names are restored below
X = pd.DataFrame(X)
X.columns = colNames
 
# LASSO: choose alpha by 10-fold cross-validation over a log-spaced grid
alphas = np.logspace(-3, 1, 50)
model_lassoCV = LassoCV(alphas=alphas, cv=10, max_iter=100000).fit(X, y)
print(model_lassoCV.alpha_)

# Index the coefficients by feature name
coef = pd.Series(model_lassoCV.coef_, index=X.columns)
# print(coef)
print("Lasso picked " + str(sum(coef != 0)) + " variables and eliminated the other " + str(sum(coef == 0)) + " variables")

# Keep only the features with nonzero coefficients
print(coef[coef != 0])
X = X[coef[coef != 0].index]

print(X.head())
 
# Plot: transpose X so that the features (columns), not the samples, are clustered
plt.figure(figsize=(5, 5), dpi=80)
plt.title("Radiomic Dendrogram", fontsize=22)
# Adjust color_threshold to suit the data
dend = shc.dendrogram(shc.linkage(X.T, method='ward'), labels=X.columns, color_threshold=20)
plt.xticks(fontsize=12, rotation=60, ha='right')
plt.show()

[Image 2: dendrogram of the LASSO-selected radiomic features]
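
Because the linkage is computed on `X.T`, the leaves of this second dendrogram are features rather than samples, so features that vary together across patients merge early. The sketch below illustrates that on synthetic data, since the Excel files used above are not available; the feature names `a1` to `b3`, the two latent signals, and the choice of two groups are assumptions made for illustration.

```python
import numpy as np
import pandas as pd
import scipy.cluster.hierarchy as shc

# Synthetic data (assumption): six features driven by two independent signals
rng = np.random.default_rng(0)
n_samples = 100
latent_a = rng.normal(size=n_samples)
latent_b = rng.normal(size=n_samples)

X_demo = pd.DataFrame({
    'a1': latent_a + 0.1 * rng.normal(size=n_samples),
    'a2': latent_a + 0.1 * rng.normal(size=n_samples),
    'a3': latent_a + 0.1 * rng.normal(size=n_samples),
    'b1': latent_b + 0.1 * rng.normal(size=n_samples),
    'b2': latent_b + 0.1 * rng.normal(size=n_samples),
    'b3': latent_b + 0.1 * rng.normal(size=n_samples),
})

# Standardize each column (zero mean, unit variance), similar in spirit to StandardScaler
X_demo = (X_demo - X_demo.mean()) / X_demo.std()

# Transpose so that each row passed to linkage is one feature
Z = shc.linkage(X_demo.T.to_numpy(), method='ward')

# Ask for two flat groups of features
groups = shc.fcluster(Z, t=2, criterion='maxclust')
feature_groups = pd.Series(groups, index=X_demo.columns)
print(feature_groups)
```

In this toy setup the `a` features and the `b` features fall into separate groups. With real radiomic features the grouping is rarely this clean, so the dendrogram remains the more informative view.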

Further reading:

  • Agglomerative Clustering and Dendrograms - Explained
  • Cluster Dendrograms: Agglomerative Clustering and Dendrograms, Explained (article in Chinese)
