The MNIST example used in this article comes from the official PyTorch website:
Link to the official PyTorch MNIST example
authorName: default
experimentName: mnist_pytorch
trialConcurrency: 1
maxExecDuration: 100h
maxTrialNum: 5
#choice: local, remote, pai
trainingServicePlatform: local
searchSpacePath: search_space.json
#choice: true, false
useAnnotation: false
tuner:
  #choice: TPE, Random, Anneal, Evolution, BatchTuner
  #SMAC (SMAC should be installed through nnictl)
  builtinTunerName: TPE
  classArgs:
    #choice: maximize, minimize
    optimize_mode: maximize
trial:
  command: python main.py
  codeDir: .
  gpuNum: 1
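With config.yml and search_space.json in place, the experiment is started with NNI's command-line tool (this assumes nnictl is installed and both files sit in the current directory):

```shell
# Start the experiment; NNI prints the web UI address once it is up
nnictl create --config config.yml
```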
Only three parameters are set here: batch-size, epochs, and the learning rate. More parameters can be added as needed.
{
    "batch-size": {"_type": "choice", "_value": [32, 64]},
    "epochs": {"_type": "choice", "_value": [10]},
    "lr": {"_type": "choice", "_value": [0.0001, 0.01, 0.1]}
}
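To make the shape of the search space concrete, the sketch below (my own, not part of the original example) draws one candidate configuration at random from the `choice` space above. The real TPE tuner inside NNI picks values far more deliberately, but the dict it hands to each trial has exactly this form.

```python
import random

# The search space above, as a Python dict (same content as search_space.json)
search_space = {
    "batch-size": {"_type": "choice", "_value": [32, 64]},
    "epochs": {"_type": "choice", "_value": [10]},
    "lr": {"_type": "choice", "_value": [0.0001, 0.01, 0.1]},
}

def sample(space):
    """Pick one value per parameter from a 'choice' search space.
    This only mimics the format a tuner returns; the real sampling
    logic lives inside NNI's tuner."""
    return {name: random.choice(spec["_value"])
            for name, spec in space.items()
            if spec["_type"] == "choice"}

params = sample(search_space)
print(params)  # e.g. {'batch-size': 64, 'epochs': 10, 'lr': 0.01}
```

With two batch sizes, one epoch count, and three learning rates, the space holds only 2 × 1 × 3 = 6 distinct configurations, so the five trials allowed by maxTrialNum cover most of it.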
RCV_CONFIG = nni.get_next_parameter()
Code:
RCV_CONFIG = nni.get_next_parameter()
_logger.debug(RCV_CONFIG)

parser = argparse.ArgumentParser(description='PyTorch MNIST Example')
parser.add_argument('--batch-size', type=int, default=RCV_CONFIG['batch-size'], metavar='N',
                    help='input batch size for training (default: 64)')
parser.add_argument('--epochs', type=int, default=RCV_CONFIG['epochs'], metavar='N',
                    help='number of epochs to train (default: 10)')
parser.add_argument('--lr', type=float, default=RCV_CONFIG['lr'], metavar='LR',
                    help='learning rate (default: 0.01)')
</parser>
nni.report_intermediate_result(loss.item())
The value reported here is the training loss.
Code:
def train(args, model, device, train_loader, optimizer, epoch):
    model.train()
    for batch_idx, (data, target) in enumerate(train_loader):
        data, target = data.to(device), target.to(device)
        optimizer.zero_grad()
        output = model(data)
        loss = F.nll_loss(output, target)
        loss.backward()
        optimizer.step()
        if batch_idx % args.log_interval == 0:
            print('Train Epoch: {} [{}/{} ({:.0f}%)]\tLoss: {:.6f}'.format(
                epoch, batch_idx * len(data), len(train_loader.dataset),
                100. * batch_idx / len(train_loader), loss.item()))
            nni.report_intermediate_result(loss.item())
nni.report_final_result(best_acc)
The value reported here is the test accuracy.
Code:
for epoch in range(1, args.epochs + 1):
    train(args, model, device, train_loader, optimizer, epoch)
    test(args, model, device, test_loader)

nni.report_final_result(best_acc)
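To see how the two reporting calls fit together without running NNI at all, here is a self-contained sketch with a fake `nni` object and made-up per-epoch accuracies; it illustrates only the calling pattern (many intermediate results, exactly one final result), not real training.

```python
# Stub that records what the two NNI calls would send to the NNI manager.
class FakeNNI:
    def __init__(self):
        self.intermediate = []
        self.final = None

    def report_intermediate_result(self, metric):
        self.intermediate.append(metric)

    def report_final_result(self, metric):
        self.final = metric

nni = FakeNNI()
accuracies = [91.0, 92.5, 92.1]  # made-up test accuracies, one per epoch
best_acc = 0.0
for acc in accuracies:
    best_acc = max(best_acc, acc)
    nni.report_intermediate_result(acc)  # one point per epoch on the web UI
nni.report_final_result(best_acc)        # the single number the tuner optimizes
print(nni.final)  # 92.5
```

The tuner only ever sees the final result when ranking trials; the intermediate results feed the web UI curves (and early-stopping assessors, if one is configured).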
{"batch-size":64,"epochs":10,"lr":0.0001}
nni.report_intermediate_result(metric)
nni.report_final_result(metric)
Every trial repeats the process above until the maximum number of trials, maxTrialNum (set in the configuration file), is reached.
{"batch-size":64,"epochs":10,"lr":0.0001}
2.Hyper Parameter
As shown in the figure above, these are the top 80% of results ranked by test accuracy. Their parameters are:
{"batch-size":64,"epochs":10,"lr":0.1}
{"batch-size":32,"epochs":10,"lr":0.01}
{"batch-size":64,"epochs":10,"lr":0.1}
3.Trial Duration
This view shows how long each trial took to run.
4.Results of a Single Trial
As shown in the figure above, you can see the parameters used by this trial.
As shown in the figure above, the current training loss is 0.200218066573143.
Click here to view the full code