Elegant tensor dimension manipulation (rearrange) and convenient matrix multiplication (einsum)

Contents

1. rearrange

2. repeat

3. reduce

4. Appendix

4.1 Slicing an image into patches

4.2 Embedding in PyTorch layers

4.3 torch.einsum: expressing multi-dimensional linear operations


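einops is a concise, elegant library for manipulating tensors, and it works with tensors from numpy, pytorch, and tensorflow. Its main strength is that its functions read clearly: each call spells out the input and output axes in a pattern string. The three most commonly used functions are rearrange, repeat, and reduce: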

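  • rearrange: reorders and reshapes the dimensions of a tensor; it can replace reshape, view, transpose, and permute in pytorch
  • repeat: copies a tensor along an existing or a new dimension; it can replace repeat in pytorch
  • reduce: similar to the reduce operations in tensorflow; it computes a mean, maximum, minimum, and so on while collapsing tensor dimensions. The supported reductions include 'max', 'min', 'sum', 'mean', and 'prod'.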

1. rearrange

import torch
from einops import rearrange

images = torch.randn((32,30,40,3))
# (32, 30, 40, 3)
print(rearrange(images, 'b h w c -> b h w c').shape)

# (960, 40, 3)
print(rearrange(images, 'b h w c -> (b h) w c').shape)

# (30, 1280, 3)
print(rearrange(images, 'b h w c -> h (b w) c').shape)

# (32, 3, 30, 40)
print(rearrange(images, 'b h w c -> b c h w').shape)

# (32, 3600)
print(rearrange(images, 'b h w c -> b (c h w)').shape)

# ---------------------------------------------
# Here (h h1) and (w w1) split the height and width axes, so h and w become 1/h1 and 1/w1 of the original sizes

# (128, 15, 20, 3)
print(rearrange(images, 'b (h h1) (w w1) c -> (b h1 w1) h w c', h1=2, w1=2).shape)

# (32, 15, 20, 12)
print(rearrange(images, 'b (h h1) (w w1) c -> b h w (c h1 w1)', h1=2, w1=2).shape)

2. repeat

import torch
from einops import repeat

image = torch.randn((30,40))

# Copy the whole image along a new axis: (30, 40, 3)
print(repeat(image, 'h w -> h w c', c=3).shape)

# Copy along the rows: (60, 40)
print(repeat(image, 'h w -> (repeat h) w', repeat=2).shape)

# Copy along the columns: (30, 120). Note: (repeat w) and (w repeat) give different results
print(repeat(image, 'h w -> h (repeat w)', repeat=3).shape)

# (60, 80)
print(repeat(image, 'h w -> (h h2) (w w2)', h2=2, w2=2).shape)

3. reduce

import torch
from einops import reduce

x = torch.randn(3, 5, 5)
# (5, 5)
print(reduce(x, 'c h w -> h w', 'max').shape)

x = torch.randn(1, 3, 6, 6)
# (1, 3, 3, 3) Note: an error is raised if the axis sizes are not evenly divisible
y1 = reduce(x, 'b c (h h1) (w w1) -> b c h w', 'max', h1=2, w1=2)
print(y1.shape)

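# Adaptive max-pooling to a 3 x 2 grid: (1, 3, 3, 2)
# The kept grid axes h1, w1 must be the outer factors so that each output cell takes the max over its own contiguous (h, w) window
print(reduce(x, 'b c (h1 h) (w1 w) -> b c h1 w1', 'max', h1=3, w1=2).shape)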

# Global average pooling: (1, 3)
print(reduce(x, 'b c h w -> b c', 'mean').shape)

4. Appendix

4.1 Slicing an image into patches

import torch
from einops import rearrange

image = torch.randn(1, 3, 10, 10)
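# General form for a patch size p:
# rearrange(image, 'b c (h h1) (w w1) -> b (h w) (h1 w1 c)', h1=p, w1=p)
# With p=5 this gives 2 x 2 = 4 patches of 5 * 5 * 3 = 75 values each: (1, 4, 75)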
print(rearrange(image, 'b c (h h1) (w w1) -> b (h w) (h1 w1 c)', h1=5, w1=5).shape)

4.2 Embedding in PyTorch layers

import torch.nn as nn
from einops.layers.torch import Rearrange

model = nn.Sequential(
    nn.Conv2d(3, 6, kernel_size=5),
    nn.MaxPool2d(kernel_size=2),
    nn.Conv2d(6, 16, kernel_size=5),
    nn.MaxPool2d(kernel_size=2),
    # flatten the feature maps into one vector per sample
    Rearrange('b c h w -> b (c h w)'),
    nn.Linear(16*5*5, 120),  # 16*5*5 matches inputs of size 3 x 32 x 32
    nn.ReLU(),
    nn.Linear(120, 10),
)

4.3 torch.einsum: expressing multi-dimensional linear operations

import torch

a = torch.randn((1,1,3,2))
b = torch.randn((1,1,1,2))

# Equivalent call: torch.einsum('b h i d, b h j d -> b h i j', [a, b])
# Same result as torch.matmul(a, b.transpose(2, 3))
c = torch.einsum('b h i d, b h j d -> b h i j', a, b)
print(c.shape)
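
The equivalence with matmul can be checked numerically. The sketch below uses larger, hypothetical query and key tensors q and k (names chosen for illustration, in the style of attention scores) and also shows an ordinary matrix product written as an einsum.

```python
import torch

q = torch.randn(2, 4, 5, 8)  # (b, h, i, d)
k = torch.randn(2, 4, 7, 8)  # (b, h, j, d)

# d appears in both inputs but not in the output, so it is summed over
scores = torch.einsum('b h i d, b h j d -> b h i j', q, k)
ref = torch.matmul(q, k.transpose(2, 3))
assert torch.allclose(scores, ref, atol=1e-4)
print(scores.shape)  # torch.Size([2, 4, 5, 7])

# An ordinary matrix product follows the same rule: sum over the shared index k
m1 = torch.randn(3, 6)
m2 = torch.randn(6, 2)
prod = torch.einsum('i k, k j -> i j', m1, m2)
assert torch.allclose(prod, m1 @ m2, atol=1e-4)
```

The rule to remember is that any index missing from the right-hand side of the arrow is contracted, while indices that remain are kept as output axes in the order written.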
