[PyTorch] A function to help you find a suitable batch_size

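When training a model, which batch_size makes the best use of the GPU?
The function below measures forward-pass throughput (samples per second) at a given batch_size, so you can quickly find a suitable value.
Reference: link to the original article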

Function definition

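import time

import torch


def proc_time(b_sz, model, n_iter=10):
    # Model input: set the input shape here to match your model
    x = torch.rand(b_sz, 16, 11).cuda()

    model(x)  # warm-up pass so one-time CUDA setup is not timed
    torch.cuda.synchronize()  # wait for pending GPU work before starting the timer
    start = time.perf_counter()
    for _ in range(n_iter):
        model(x)  # forward pass on the batch
    torch.cuda.synchronize()  # wait for all forward passes to finish
    elapsed = time.perf_counter() - start
    throughput = b_sz * n_iter / elapsed
    print(f"Batch: {b_sz} \t {throughput} samples/sec")
    return b_sz, throughput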

Calling the function

for b_sz in [1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024, 2048]:
    proc_time(b_sz, model)

Output:

Batch: 1 	 16.793156063735697 samples/sec
Batch: 2 	 38.83115043526805 samples/sec
Batch: 4 	 77.96799714472667 samples/sec
Batch: 8 	 153.83649638382983 samples/sec
Batch: 16 	 304.7619878029563 samples/sec
Batch: 32 	 600.1129780317017 samples/sec
Batch: 64 	 1350.1580643181849 samples/sec
Batch: 128 	 2644.7298943577844 samples/sec
Batch: 256 	 5297.651717512998 samples/sec
Batch: 512 	 9337.831389005929 samples/sec
Batch: 1024 	 14020.95845977864 samples/sec
Batch: 2048 	 16672.3204029026 samples/sec
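
The numbers above roughly double with each doubling of batch_size up to about 512, then the gains shrink. One possible way to turn that observation into a choice is sketched below. The helper name pick_batch_size and the min_gain threshold of 1.5 are assumptions made for illustration; they are not part of the original article, and the right threshold depends on your model and memory budget.

```python
def pick_batch_size(results, min_gain=1.5):
    """Return the largest batch size that still scales well.

    `results` is a list of (batch_size, throughput) tuples sorted by
    batch size, such as the values returned by proc_time.
    `min_gain` is an assumed threshold: the minimum throughput ratio
    between one entry and the previous one that counts as a worthwhile gain.
    """
    best = results[0][0]
    for (_, prev_t), (b, t) in zip(results, results[1:]):
        if t / prev_t < min_gain:
            break  # throughput has started to flatten; stop here
        best = b
    return best


# Throughput figures copied (rounded) from the run above.
results = [
    (1, 16.79), (2, 38.83), (4, 77.97), (8, 153.84),
    (16, 304.76), (32, 600.11), (64, 1350.16), (128, 2644.73),
    (256, 5297.65), (512, 9337.83), (1024, 14020.96), (2048, 16672.32),
]

chosen = pick_batch_size(results)
print(chosen)  # 1024 with the default threshold of 1.5
```

A stricter threshold picks a smaller value: with min_gain=1.9 the same data gives 256. Throughput is only one criterion. GPU memory limits and the effect of batch size on convergence still need to be checked separately.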
