Permeability Code Analysis

1 Load the labels

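## load the labels
import numpy as np
import pandas as pd

# adr_input: base directory of the dataset, defined elsewhere
ts_label_adr = adr_input + '/permlty/ts_permeability.xlsx'
file_label = pd.ExcelFile(ts_label_adr)
labels = file_label.parse(header=None)  # first sheet, no header row
print(type(labels))
# data_indx = np.load('index.npy', 'r')  # reuse a previously saved index instead
# draw 1000 row positions from 0..8299 (randint samples with replacement)
indx = np.random.randint(0, 8300, size=1000)
np.save('index.npy', indx)  # persist the index so data and labels stay aligned
labels_sub = labels.iloc[indx]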
  • The np.save(...) statement
    It only writes indx to index.npy and returns no value. (print(type(labels)) ==> <class 'pandas.core.frame.DataFrame'>)

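  • DataFrame format
    iloc is a DataFrame indexer. To select row i, use df.iloc[[i]] (by position) or df.loc[[i]] (by index label); the two agree only while the index is the default 0, 1, 2, ...
    A table read with pandas, as with file_label.parse here, comes back as a DataFrame.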

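  • Purpose of this block: randomly draw 1000 of the 8300 samples for the later training, testing and validation.
    Note: the sampled row numbers are kept as an index and saved to index.npy because the data and the permeability values correspond one-to-one, so the same index must be used to pick both.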
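
A side note on the sampling call: np.random.randint draws with replacement, so the 1000 indices can contain repeats. The sketch below is one possible variant, not the original code. It draws distinct indices with Generator.choice, saves them, and reloads them to pick matching rows from two arrays. The arrays data_all and perm_all and the temporary file location are hypothetical stand-ins.

```python
import os
import tempfile
import numpy as np

rng = np.random.default_rng(seed=0)

# Hypothetical stand-ins: 8300 samples and their permeability values,
# where row i of one corresponds to row i of the other.
data_all = np.arange(8300).reshape(-1, 1) * 1.0
perm_all = np.arange(8300) * 1e-12

# Draw 1000 distinct row numbers (replace=False rules out duplicates).
indx = rng.choice(8300, size=1000, replace=False)

# Save the index, then reload it as a later script would.
path = os.path.join(tempfile.mkdtemp(), 'index.npy')
np.save(path, indx)
indx_loaded = np.load(path)

# The same index selects matching rows from both arrays.
data_sub = data_all[indx_loaded]
perm_sub = perm_all[indx_loaded]
```

Because both selections use the same saved index, row k of data_sub still belongs to row k of perm_sub.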

2 Input processing

each sample: 100*100*100

  • Way 1
    each sample = one vector
  • Way 2
    In each sample: each layer = matrix with size [100*100]
    calculate the void ratio for each layer.
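
The two options above can be sketched as follows. This is only an illustration under assumptions: the sample is taken to be a binary voxel grid with 1 = void and 0 = solid, and layers are taken along the first axis; neither detail is stated in the notes. The void ratio is computed here as void voxels over solid voxels, with the void fraction (voids over all voxels) shown alongside it.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sample: binary voxels, 1 = void, 0 = solid (an assumption).
sample = (rng.random((100, 100, 100)) < 0.3).astype(np.uint8)

# Way 1: flatten the whole sample into one vector.
as_vector = sample.reshape(-1)  # shape (1000000,)

# Way 2: treat each layer along axis 0 as a 100*100 matrix
# and reduce it to a single number.
layer_size = sample[0].size                              # 10000 voxels per layer
void_count = sample.sum(axis=(1, 2), dtype=np.int64)     # void voxels per layer
solid_count = layer_size - void_count                    # solid voxels per layer
void_fraction = void_count / layer_size                  # voids / all voxels
void_ratio = void_count / np.maximum(solid_count, 1)     # voids / solids, guarded

print(as_vector.shape, void_ratio.shape)
```

Way 2 shrinks each sample from 1,000,000 inputs to 100, at the cost of discarding the geometry inside each layer.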

4 Split data into train, test and validation sets

import numpy as np

def shuffle_split(data, label):
    '''
    shuffle and split data into train (60%), validation (20%) and test (20%) sets
    '''
    data = np.array(data)
    label = np.array(label)
    label = 10**11 * label  # rescale the very small permeability values
    # data and labels must correspond one-to-one
    if len(data) != len(label):
        raise ValueError('data and label have different lengths')
    print('checked out')
    indx = np.random.permutation(len(data))
    test_size = int(0.2 * len(data))
    test_indx = indx[:test_size]
    val_indx = indx[test_size:2 * test_size]
    train_indx = indx[2 * test_size:]
    train_dat, train_tar = data[train_indx], label[train_indx]
    val_dat, val_tar = data[val_indx], label[val_indx]
    test_dat, test_tar = data[test_indx], label[test_indx]
    return train_dat, train_tar, val_dat, val_tar, test_dat, test_tar

train_dat, train_tar, val_dat, val_tar, test_dat, test_tar = shuffle_split(data, labels_sub)

del data, file_label, labels_sub  # free memory
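  • label = 10**11 * label
    The permeability values are very small, so they are scaled up by a factor of 10**11.
  • train, test and validation sets
    The split follows a fixed rule: after a random permutation, the first 20% of the indices form the test set, the next 20% the validation set, and the remaining 60% the training set.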
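
As a small worked example of the two notes above, the sketch below applies the same scaling and index slicing to 1000 made-up labels. The value range of the labels (1e-12 to 1e-11) is an assumption chosen for illustration; the notes do not state the real magnitudes.

```python
import numpy as np

n = 1000  # number of sampled rows, as drawn earlier in the notes
rng = np.random.default_rng(0)

# Hypothetical permeability values between 1e-12 and 1e-11.
label = rng.uniform(1e-12, 1e-11, size=n)
scaled = 10**11 * label  # now between 0.1 and 1

# Same slicing rule as shuffle_split: 20% test, 20% validation, rest train.
indx = rng.permutation(n)
test_size = int(0.2 * n)
test_indx = indx[:test_size]
val_indx = indx[test_size:2 * test_size]
train_indx = indx[2 * test_size:]

print(len(train_indx), len(val_indx), len(test_indx))  # 600 200 200
```

The three index sets are disjoint slices of one permutation, so every sample lands in exactly one set.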
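from torch.autograd import Variable

for epoch in range(num_epoch):
    print(epoch)
    optimizer.zero_grad()
    for dat, tar in train_loader:
        # reshape each batch to (batch, 100, 100, 100); targets become one column
        structure = Variable(dat.view(-1, 100, 100, 100))
        permeability = Variable(tar.view(-1, 1))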
  • view works like reshape in NumPy
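
The NumPy side of that analogy can be sketched as below. The batch of two flattened samples and the two target values are hypothetical; the shape arguments are the same ones passed to view in the loop above.

```python
import numpy as np

# Hypothetical batch: 2 samples, each stored as a flat vector of 100*100*100 values.
batch = np.zeros((2, 100 * 100 * 100), dtype=np.float32)

# -1 lets NumPy infer the batch dimension, as view(-1, 100, 100, 100) does in PyTorch.
structure = batch.reshape(-1, 100, 100, 100)
print(structure.shape)  # (2, 100, 100, 100)

# Targets as one column: the counterpart of tar.view(-1, 1).
tar = np.array([0.3, 0.7], dtype=np.float32)
permeability = tar.reshape(-1, 1)
print(permeability.shape)  # (2, 1)

# No data is copied here: the reshaped array shares memory with the original.
structure[0, 0, 0, 0] = 1.0
print(batch[0, 0])  # 1.0
```

One difference worth keeping in mind: view in PyTorch never copies and raises an error when the tensor's memory layout is incompatible with the requested shape, whereas NumPy's reshape falls back to making a copy.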

What I have been given so far:
extracted features, the npy index => I only need to import the corresponding permeability data.
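
A minimal sketch of that remaining step, assuming the index was saved as in the first section. The small labels table and the index values below are made up; in the real workflow they would come from ts_permeability.xlsx and index.npy.

```python
import numpy as np
import pandas as pd

# Hypothetical label table: one permeability value per row, no header (column 0).
labels = pd.DataFrame([1e-12, 2e-12, 3e-12, 4e-12, 5e-12])

# Stand-in for np.load('index.npy'): the row numbers of the extracted features.
indx = np.array([3, 0, 4])

# iloc picks rows by position, in the order given by the index.
labels_sub = labels.iloc[indx]
print(labels_sub.index.tolist())  # [3, 0, 4]

# The selected rows keep their original row numbers as index labels,
# so loc[3] is original row 3 while iloc[0] is the first selected row.
print(labels_sub.loc[3, 0] == labels_sub.iloc[0, 0])  # True

# Plain array aligned row by row with the extracted features.
targets = labels_sub.to_numpy().ravel()
```

Since the features were extracted in index order, selecting the labels with the same index keeps feature row k and target k paired.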
