PyTorch Models: Saving and Restoring

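Keras commonly saves models as .h5 files. PyTorch checkpoints are usually saved with a .t7, .pt, or .pkl extension. The extension is only a naming convention: torch.save writes the same format whichever one is used.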

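  • The .t7 extension is carried over from Torch7, where it was used for model weight files
  • The .pt extension is the one used in the official PyTorch examples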

Saving the model and its parameters

torch.save(model, 'model.pt')                      # save the entire network
torch.save(model.state_dict(), 'model_params.pt')  # save only the network's parameters (faster, smaller footprint)
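To make the difference between the two calls concrete, here is a minimal sketch. The toy network and the temporary directory are assumptions made for illustration; only torch.save and state_dict() come from the text above.

```python
import os
import tempfile

import torch

# Hypothetical toy network, used only for illustration.
model = torch.nn.Sequential(
    torch.nn.Linear(1, 10),
    torch.nn.ReLU(),
    torch.nn.Linear(10, 1),
)

# A state_dict maps parameter names to tensors. ReLU has no parameters,
# so index 1 does not appear among the keys.
param_names = list(model.state_dict().keys())
print(param_names)

with tempfile.TemporaryDirectory() as tmp_dir:
    full_path = os.path.join(tmp_dir, "model.pt")
    params_path = os.path.join(tmp_dir, "model_params.pt")

    torch.save(model, full_path)                 # pickles the whole module object
    torch.save(model.state_dict(), params_path)  # stores only the named tensors

    saved_both = os.path.exists(full_path) and os.path.exists(params_path)
```

Saving only the state_dict keeps the file independent of how the model class is pickled, at the cost of having to rebuild the architecture in code before loading.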

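Loading the model

This approach loads the entire network, which can be slow when the network is large. The class that defines the model must be importable at load time. On recent PyTorch releases (2.6 and later), torch.load defaults to weights_only=True, so loading a fully pickled model also requires passing weights_only=False.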

model = torch.load('model.pt')
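A minimal round trip for whole-model loading might look like the following. It assumes a PyTorch release that accepts the weights_only keyword (it is not part of the torch.load signature quoted later in this post), and the toy network and temporary path are illustrative.

```python
import os
import tempfile

import torch

# Hypothetical toy network, used only for illustration.
model = torch.nn.Sequential(
    torch.nn.Linear(1, 10),
    torch.nn.ReLU(),
    torch.nn.Linear(10, 1),
)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "model.pt")
    torch.save(model, path)

    # Assumption: a recent PyTorch where weights_only exists.
    # weights_only=False allows arbitrary objects to be unpickled,
    # so use it only on files you created or fully trust.
    loaded = torch.load(path, weights_only=False)

# The loaded object is a ready-to-use module with the same weights.
x = torch.tensor([[0.5]])
with torch.no_grad():
    same_output = torch.equal(model(x), loaded(x))
```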

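Loading only the model parameters

This approach loads all of the saved parameters and then copies them into a newly constructed network.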

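import torch

def restore_params():
    # Build a new model with the same architecture as the saved one
    model = torch.nn.Sequential(
        torch.nn.Linear(1, 10),
        torch.nn.ReLU(),
        torch.nn.Linear(10, 1),
    )

    # Copy the saved parameters into the model
    model.load_state_dict(torch.load('model_params.pt'))
    return model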
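One way to check that the parameters were really restored is to compare the outputs of the original and the rebuilt network. This is a sketch: build_model is a hypothetical helper, and the file lives in a temporary directory rather than at 'model_params.pt'.

```python
import os
import tempfile

import torch


def build_model():
    # Hypothetical helper: the same architecture must be rebuilt before loading.
    return torch.nn.Sequential(
        torch.nn.Linear(1, 10),
        torch.nn.ReLU(),
        torch.nn.Linear(10, 1),
    )


source = build_model()

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "model_params.pt")
    torch.save(source.state_dict(), path)

    restored = build_model()  # starts with different random weights
    # load_state_dict reports any keys that did not line up.
    result = restored.load_state_dict(torch.load(path))

x = torch.linspace(-1, 1, 5).unsqueeze(1)  # shape (5, 1)
with torch.no_grad():
    outputs_match = torch.equal(source(x), restored(x))
```

If the architecture in code differs from the saved one, load_state_dict raises an error by default instead of silently loading a partial set of weights.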

Official API: https://pytorch.org/docs/stable/torch.html

torch.save(obj, f, pickle_module=pickle, pickle_protocol=2, _use_new_zipfile_serialization=False)

Saves an object to a disk file.

See also: Recommended approach for saving a model

Parameters

  • obj – saved object
  • f – a file-like object (has to implement write and flush) or a string containing a file name
  • pickle_module – module used for pickling metadata and objects
  • pickle_protocol – can be specified to override the default protocol

WARNING

If you are using Python 2, torch.save() does NOT support StringIO.StringIO as a valid file-like object. This is because the write method should return the number of bytes written; StringIO.write() does not do this.

Please use something like io.BytesIO instead.

Example

>>> import io
>>> import torch
>>> # Save to file
>>> x = torch.tensor([0, 1, 2, 3, 4])
>>> torch.save(x, 'tensor.pt')
>>> # Save to io.BytesIO buffer
>>> buffer = io.BytesIO()
>>> torch.save(x, buffer)
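As a small complement to the example above, the following sketch reads the tensor back from the same in-memory buffer. The only step not shown in the quoted documentation is rewinding the buffer before loading.

```python
import io

import torch

x = torch.tensor([0, 1, 2, 3, 4])

buffer = io.BytesIO()
torch.save(x, buffer)

# After writing, the buffer position is at the end; rewind before reading.
buffer.seek(0)
y = torch.load(buffer)

round_trip_ok = torch.equal(x, y)
```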

torch.load(f, map_location=None, pickle_module=pickle, **pickle_load_args)

Loads an object saved with torch.save() from a file.

torch.load() uses Python’s unpickling facilities but treats storages, which underlie tensors, specially. They are first deserialized on the CPU and are then moved to the device they were saved from. If this fails (e.g. because the run time system doesn’t have certain devices), an exception is raised. However, storages can be dynamically remapped to an alternative set of devices using the map_location argument.

If map_location is a callable, it will be called once for each serialized storage with two arguments: storage and location. The storage argument will be the initial deserialization of the storage, residing on the CPU. Each serialized storage has a location tag associated with it which identifies the device it was saved from, and this tag is the second argument passed to map_location. The builtin location tags are ‘cpu’ for CPU tensors and ‘cuda:device_id’ (e.g. ‘cuda:2’) for CUDA tensors. map_location should return either None or a storage. If map_location returns a storage, it will be used as the final deserialized object, already moved to the right device. Otherwise, torch.load() will fall back to the default behavior, as if map_location wasn’t specified.

If map_location is a torch.device object or a string containing a device tag, it indicates the location where all tensors should be loaded.

Otherwise, if map_location is a dict, it will be used to remap location tags appearing in the file (keys), to ones that specify where to put the storages (values).

User extensions can register their own location tags and tagging and deserialization methods using torch.serialization.register_package().
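The forms of map_location described above can be compared side by side. This is a CPU-only sketch: the tensor is saved from the CPU, so its location tag is 'cpu', and the dict mapping is a hypothetical one that would only take effect for a file saved from 'cuda:0'.

```python
import io

import torch

x = torch.arange(5)

# Serialize once into memory so each load starts from the same bytes.
buffer = io.BytesIO()
torch.save(x, buffer)
payload = buffer.getvalue()


def load_with(map_location):
    return torch.load(io.BytesIO(payload), map_location=map_location)


seen_tags = []


def record_and_keep(storage, location):
    # Called once per serialized storage; `location` is the tag of the
    # device the storage was saved from.
    seen_tags.append(location)
    return storage  # returning the CPU storage keeps the tensor on the CPU


results = [
    load_with("cpu"),                # string device tag
    load_with(torch.device("cpu")),  # torch.device object
    load_with(record_and_keep),      # callable
    load_with({"cuda:0": "cpu"}),    # dict: remap 'cuda:0' tags to 'cpu'
]

all_on_cpu = all(t.device.type == "cpu" for t in results)
```

The string and torch.device forms are the simplest when every tensor should land on one device. The callable and dict forms are useful when the file mixes tensors from several devices.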

Parameters

  • f – a file-like object (has to implement read(), readline(), tell(), and seek()), or a string containing a file name
  • map_location – a function, torch.device, string or a dict specifying how to remap storage locations
  • pickle_module – module used for unpickling metadata and objects (has to match the pickle_module used to serialize file)
  • pickle_load_args – (Python 3 only) optional keyword arguments passed over to pickle_module.load() and pickle_module.Unpickler(), e.g., errors=…

WARNING

torch.load() uses pickle module implicitly, which is known to be insecure. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling. Never load data that could have come from an untrusted source, or that could have been tampered with. Only load data you trust.
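The risk described in this warning can be shown with the standard pickle module alone. In this harmless sketch the class Payload is hypothetical, and the "attack" is just an arithmetic expression passed to eval. A real malicious file could call any function instead.

```python
import pickle


class Payload:
    """Stand-in for a malicious object; the payload here is harmless."""

    def __reduce__(self):
        # Tells pickle: "to rebuild this object, call eval('6 * 7')".
        return (eval, ("6 * 7",))


data = pickle.dumps(Payload())

# eval runs here, during unpickling, before the caller ever sees the result.
result = pickle.loads(data)
print(result)  # 42
```

Note that the unpickled value is whatever the called function returned, not a Payload instance, so inspecting the loaded object afterwards does not reveal that code was executed.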

NOTE

When you call torch.load() on a file which contains GPU tensors, those tensors will be loaded to GPU by default. You can call torch.load(…, map_location='cpu') and then load_state_dict() to avoid a GPU RAM surge when loading a model checkpoint.
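The pattern in this note can be sketched as follows. The toy network, the temporary checkpoint file, and the device selection are assumptions for illustration. On a machine without CUDA the model simply stays on the CPU.

```python
import os
import tempfile

import torch


def build_model():
    # Hypothetical toy network, used only for illustration.
    return torch.nn.Sequential(
        torch.nn.Linear(1, 10),
        torch.nn.ReLU(),
        torch.nn.Linear(10, 1),
    )


with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "checkpoint.pt")
    torch.save(build_model().state_dict(), path)

    # Step 1: deserialize every tensor onto the CPU, whatever device saved it.
    checkpoint = torch.load(path, map_location="cpu")

# Step 2: copy the CPU tensors into a freshly built model.
model = build_model()
model.load_state_dict(checkpoint)

# Step 3: move the model to the target device once.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

checkpoint_on_cpu = all(t.device.type == "cpu" for t in checkpoint.values())
```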

NOTE

By default, we decode byte strings as utf-8. This is to avoid a common error case UnicodeDecodeError: 'ascii' codec can't decode byte 0x… when loading files saved by Python 2 in Python 3. If this default is incorrect, you may use an extra encoding keyword argument to specify how these objects should be loaded, e.g., encoding='latin1' decodes them to strings using latin1 encoding, and encoding='bytes' keeps them as byte arrays which can be decoded later with byte_array.decode(…).
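The effect of the encoding argument can be seen with the standard pickle module, to which torch.load passes these keyword arguments. The byte string below is a hand-written protocol 2 pickle of the kind Python 2 produces for the str 'caf\xe9'. It is an illustrative assumption, not output taken from a real file. Note that plain pickle defaults to ASCII, whereas torch.load defaults to utf-8 as described above.

```python
import pickle

# PROTO 2, SHORT_BINSTRING of length 4 ('caf' + byte 0xE9), BINPUT 0, STOP.
py2_pickle = b"\x80\x02U\x04caf\xe9q\x00."

# With pickle's own default (ASCII), the non-ASCII byte cannot be decoded.
try:
    pickle.loads(py2_pickle)
    default_failed = False
except UnicodeDecodeError:
    default_failed = True

# latin1 maps every byte to a character, so decoding always succeeds.
as_text = pickle.loads(py2_pickle, encoding="latin1")

# 'bytes' skips decoding and returns the raw bytes for later handling.
as_bytes = pickle.loads(py2_pickle, encoding="bytes")
```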

Example

>>> torch.load('tensors.pt')
# Load all tensors onto the CPU
>>> torch.load('tensors.pt', map_location=torch.device('cpu'))
# Load all tensors onto the CPU, using a function
>>> torch.load('tensors.pt', map_location=lambda storage, loc: storage)
# Load all tensors onto GPU 1
>>> torch.load('tensors.pt', map_location=lambda storage, loc: storage.cuda(1))
# Map tensors from GPU 1 to GPU 0
>>> torch.load('tensors.pt', map_location={'cuda:1':'cuda:0'})
# Load tensor from io.BytesIO object
>>> with open('tensor.pt', 'rb') as f:
...     buffer = io.BytesIO(f.read())
>>> torch.load(buffer)
# Load a module with 'ascii' encoding for unpickling
>>> torch.load('module.pt', encoding='ascii')
