Common File Reading: NumPy File I/O

NumPy IO

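NumPy can read and write both text data and binary data on disk.

NumPy introduces a simple file format for ndarray objects: npy.

An npy file stores the data, shape, dtype, and other information needed to reconstruct the ndarray.

Commonly used IO functions:

  • load() and save() are the two main functions for reading and writing array data. By default, arrays are saved in an uncompressed raw binary format in files with the .npy extension.
  • savez() writes multiple arrays to one file. By default, the arrays are saved in an uncompressed raw binary format in a file with the .npz extension.
  • loadtxt() and savetxt() handle ordinary text files (.txt, etc.).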

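numpy.save(): saves an array to a file with the .npy extension

numpy.save(file, arr, allow_pickle=True, fix_imports=True)

Parameters:

  • file: the file to save to, with the .npy extension. If the file path does not end in .npy, the extension is appended automatically.
  • arr: the array to save.
  • allow_pickle: optional, boolean. Allows object arrays to be saved using Python pickles. Python's pickle module serializes objects before they are written to disk and deserializes them when they are read back.
  • fix_imports: optional. Makes it easier for Python 2 to read data saved by Python 3.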
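The automatic extension handling described above can be checked with a short round trip. This is a minimal sketch: the file name `demo_array` and the temporary directory are illustrative choices, not part of the original text.

```python
import os
import tempfile

import numpy as np

arr = np.arange(6, dtype=np.int64).reshape(2, 3)

with tempfile.TemporaryDirectory() as tmp_dir:
    # The path deliberately has no extension; np.save appends ".npy".
    base_path = os.path.join(tmp_dir, "demo_array")  # hypothetical file name
    np.save(base_path, arr)

    saved_path = base_path + ".npy"
    extension_added = os.path.exists(saved_path)

    # np.load restores the shape and dtype along with the data.
    restored = np.load(saved_path)

print(extension_added)                 # True
print(restored.shape, restored.dtype)  # (2, 3) int64
```

Because shape and dtype are stored in the file header, no extra arguments are needed when loading.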

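np.savez(): saves multiple arrays to a file with the .npz extension

numpy.savez(file, *args, **kwds)

Parameters:

  • file: the file to save to, with the .npz extension. If the file path does not end in .npz, the extension is appended automatically.
  • args: the arrays to save. Arrays passed positionally are automatically named arr_0, arr_1, …; use keyword arguments to give an array an explicit name.
  • kwds: the arrays to save, each stored under its keyword name.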
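The naming rules for positional and keyword arrays can be sketched as follows. The array contents, the keyword name `ramp`, and the file name `bundle.npz` are made up for illustration.

```python
import os
import tempfile

import numpy as np

a = np.array([1, 2, 3])
b = np.linspace(0.0, 1.0, 5)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "bundle.npz")  # hypothetical file name

    # `a` is positional, so it is stored as "arr_0";
    # `b` is stored under its keyword name "ramp".
    np.savez(path, a, ramp=b)

    # np.load returns an NpzFile for .npz files; the `with` block closes it.
    with np.load(path) as npz_file:
        names = sorted(npz_file.files)
        first = npz_file["arr_0"]
        ramp = npz_file["ramp"]

print(names)  # ['arr_0', 'ramp']
```

Arrays are read from the archive when they are indexed, so they should be pulled out before the NpzFile is closed.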

savetxt()

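savetxt() stores data in a simple text file format; the matching loadtxt() function reads it back.

np.loadtxt(FILENAME, dtype=int, delimiter=' ')
np.savetxt(FILENAME, a, fmt="%d", delimiter=",")

The delimiter parameter sets the column separator. loadtxt() also accepts other parameters, such as converters (converter functions for specific columns) and skiprows (the number of leading lines to skip).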
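A small text round trip makes the roles of these parameters concrete. This is a sketch with made-up data and a hypothetical file name `table.csv`; it uses `delimiter`, `skiprows`, and `usecols`, all of which are existing loadtxt() parameters.

```python
import os
import tempfile

import numpy as np

table = np.array([[1, 2, 3],
                  [4, 5, 6]])

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "table.csv")  # hypothetical file name

    # Write integers separated by commas, one row per line.
    np.savetxt(path, table, fmt="%d", delimiter=",")

    # The delimiter must match the one used when writing.
    full = np.loadtxt(path, dtype=int, delimiter=",")

    # Skip the first line and keep only columns 0 and 2.
    subset = np.loadtxt(path, dtype=int, delimiter=",",
                        skiprows=1, usecols=(0, 2))

print(full.shape)       # (2, 3)
print(subset.tolist())  # [4, 6]
```

Unlike .npy files, text files do not record the dtype, so it has to be passed to loadtxt() explicitly (the default is float).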

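1.numpy.load(file, mmap_mode=None, allow_pickle=False, fix_imports=True, encoding='ASCII')

  Loads arrays or pickled objects from .npy, .npz, or pickled files. Since NumPy 1.16.3, allow_pickle defaults to False; earlier versions defaulted to True.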

Parameters:

file : file-like object, string, or pathlib.Path

The file to read. File-like objects must support the seek() and read() methods. Pickled files require that the file-like object support the readline() method as well.

mmap_mode : {None, ‘r+’, ‘r’, ‘w+’, ‘c’}, optional

If not None, then memory-map the file, using the given mode (see numpy.memmap for a detailed description of the modes). A memory-mapped array is kept on disk. However, it can be accessed and sliced like any ndarray. Memory mapping is especially useful for accessing small fragments of large files without reading the entire file into memory.

allow_pickle : bool, optional

Allow loading pickled object arrays stored in npy files. Reasons for disallowing pickles include security, as loading pickled data can execute arbitrary code. If pickles are disallowed, loading object arrays will fail. Default: False (changed from True in NumPy 1.16.3)

fix_imports : bool, optional

Only useful when loading Python 2 generated pickled files on Python 3, which includes npy/npz files containing object arrays. If fix_imports is True, pickle will try to map the old Python 2 names to the new names used in Python 3.

encoding : str, optional

What encoding to use when reading Python 2 strings. Only useful when loading Python 2 generated pickled files on Python 3, which includes npy/npz files containing object arrays. Values other than ‘latin1’, ‘ASCII’, and ‘bytes’ are not allowed, as they can corrupt numerical data. Default: ‘ASCII’
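As an illustration of the mmap_mode parameter described above, the following sketch saves a moderately large array and then reads only a small slice through a memory map. The array size and the file name `big.npy` are arbitrary choices for the example.

```python
import os
import tempfile

import numpy as np

big = np.arange(1_000_000, dtype=np.float64)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "big.npy")  # hypothetical file name
    np.save(path, big)

    # mmap_mode="r" maps the file read-only instead of loading it all.
    mapped = np.load(path, mmap_mode="r")
    is_memmap = isinstance(mapped, np.memmap)

    # Slicing touches only the requested region; np.array copies it
    # into an ordinary in-memory array.
    chunk = np.array(mapped[10:15])

    # Drop the mapping before the temporary directory is removed.
    del mapped

print(is_memmap)       # True
print(chunk.tolist())  # [10.0, 11.0, 12.0, 13.0, 14.0]
```

Memory mapping applies to .npy files holding plain numeric data; object arrays cannot be memory-mapped because they have to be unpickled.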

Returns:

result : array, tuple, dict, etc.

Data stored in the file. For .npz files, the returned instance of NpzFile class must be closed to avoid leaking file descriptors.

Raises:

IOError

If the input file does not exist or cannot be read.

ValueError

The file contains an object array, but allow_pickle=False given.
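The ValueError case can be reproduced with a small object array. This is a minimal sketch; the array contents and the file name `objects.npy` are made up for illustration.

```python
import os
import tempfile

import numpy as np

# Build an object array explicitly so each slot holds a Python object.
obj_arr = np.empty(2, dtype=object)
obj_arr[0] = {"a": 1}
obj_arr[1] = "text"

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "objects.npy")  # hypothetical file name
    np.save(path, obj_arr, allow_pickle=True)

    # Loading an object array with pickles disallowed raises ValueError.
    try:
        np.load(path, allow_pickle=False)
        rejected = False
    except ValueError:
        rejected = True

    # Opting in explicitly works; only do this for files you trust.
    loaded = np.load(path, allow_pickle=True)

print(rejected)   # True
print(loaded[0])  # {'a': 1}
```

Passing allow_pickle explicitly in both directions keeps the behavior the same across NumPy versions with different defaults.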

2.numpy.save(file, arr, allow_pickle=True, fix_imports=True)

Parameters:

file : file, str, or pathlib.Path

File or filename to which the data is saved. If file is a file-object, then the filename is unchanged. If file is a string or Path, a .npy extension will be appended to the file name if it does not already have one.

allow_pickle : bool, optional

Allow saving object arrays using Python pickles. Reasons for disallowing pickles include security (loading pickled data can execute arbitrary code) and portability (pickled objects may not be loadable on different Python installations, for example if the stored objects require libraries that are not available, and not all pickled data is compatible between Python 2 and Python 3). Default: True

fix_imports : bool, optional

Only useful in forcing objects in object arrays on Python 3 to be pickled in a Python 2 compatible way. If fix_imports is True, pickle will try to map the new Python 3 names to the old module names used in Python 2, so that the pickle data stream is readable with Python 2.

arr : array_like

Array data to be saved.
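Since the file parameter also accepts a file object, an array can be serialized without touching the disk. The sketch below uses an in-memory `io.BytesIO` buffer as the file object, which is a choice made for this example rather than something the original text prescribes.

```python
import io

import numpy as np

arr = np.array([[1.5, 2.5], [3.5, 4.5]])

# Write the array in .npy format into an in-memory buffer.
buffer = io.BytesIO()
np.save(buffer, arr)

# Every .npy stream starts with the magic bytes b"\x93NUMPY".
magic = buffer.getvalue()[:6]

# Rewind before reading; np.load needs seek() and read() on the object.
buffer.seek(0)
restored = np.load(buffer)

print(magic == b"\x93NUMPY")          # True
print(np.array_equal(restored, arr))  # True
```

The same pattern works with a file opened in binary mode, in which case no extension is appended to the file name.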

Example

Storing the parameters from a TensorFlow ckpt file in npy format

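import numpy as np
from tensorflow.python import pywrap_tensorflow  # TensorFlow 1.x

# FILE_PATH (checkpoint path) and OUTPUT_FILE (target .npy path)
# must be defined before this code runs.
reader = pywrap_tensorflow.NewCheckpointReader(FILE_PATH)
var_to_shape_map = reader.get_variable_to_shape_map()
# Print the name of every variable stored in the checkpoint
for key in var_to_shape_map:
    print(key)

layers = ['conv1_1', 'conv1_2', 'conv2_1', 'conv2_2', 'conv3_1', 'conv3_2', 'conv3_3', 'conv4_1', 'conv4_2', 'conv4_3', 'conv5_1', 'conv5_2', 'conv5_3', 'fc6', 'fc7', 'cls_score', 'bbox_pred', 'rpn_cls_score', 'rpn_bbox_pred']
# Note: the last four keys carry a "_na" suffix and do not appear in
# `layers`, so they remain empty lists in the saved dict.
data = {
    'conv1_1': [],
    'conv1_2': [],
    'conv2_1': [],
    'conv2_2': [],
    'conv3_1': [],
    'conv3_2': [],
    'conv3_3': [],
    'conv4_1': [],
    'conv4_2': [],
    'conv4_3': [],
    'conv5_1': [],
    'conv5_2': [],
    'conv5_3': [],
    'fc6': [],
    'fc7': [],
    'cls_score_na': [],
    'bbox_pred_na': [],
    'rpn_cls_score_na': [],
    'rpn_bbox_pred_na': []
}

# Collect the biases and weights of each layer into a nested dict
for op_name in layers:
    biases_variable = reader.get_tensor(op_name + '/biases')
    weights_variable = reader.get_tensor(op_name + '/weights')
    tmp = {'biases': biases_variable, 'weights': weights_variable}

    data[op_name] = tmp

# `data` is a dict, so np.save pickles it inside a 0-d object array
np.save(OUTPUT_FILE, data)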
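Because `data` above is a Python dict rather than an array, np.save wraps it in a zero-dimensional object array and pickles it. The following sketch reproduces that behavior without TensorFlow, using a small made-up parameter dict (the layer name, array shapes, and the file name `params.npy` are all hypothetical).

```python
import os
import tempfile

import numpy as np

# Stand-in for the checkpoint parameters; values are placeholders.
params = {
    "conv1_1": {"weights": np.zeros((3, 3)), "biases": np.zeros(3)},
}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "params.npy")  # hypothetical file name
    np.save(path, params)

    # The dict was pickled, so loading needs allow_pickle=True.
    raw = np.load(path, allow_pickle=True)

# The dict comes back wrapped in a 0-d object array.
print(raw.shape, raw.dtype)  # () object

# .item() unwraps the single element and returns the dict itself.
restored = raw.item()
print(type(restored).__name__)            # dict
print(restored["conv1_1"]["biases"].shape)  # (3,)
```

For new code, np.savez with one keyword per array is often a simpler alternative, since it avoids pickling altogether when every value is a numeric array.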

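Reading the .npy file

import numpy as np
# Files saved by Python 2 need an explicit encoding when loaded.
# The file holds a pickled object array, so allow_pickle=True is required on NumPy >= 1.16.3.
# ndarray.item() copies the single element out of the 0-d array and returns it as a Python object.
npz = np.load('vgg19.npy', encoding='latin1', allow_pickle=True).item()

print(type(npz))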

Output (for a file that stores a dict, as in the example above):

<class 'dict'>
