Theano notes

theano

theano.tensor.matrix(name=None, dtype=config.floatX): returns a variable for a 2-D ndarray.
theano.tensor.tensor3(name=None, dtype=config.floatX): returns a variable for a 3-D ndarray.
theano.tensor provides scalar (a single value), vector, matrix, tensor3 (3-D tensor), and tensor4 (4-D tensor).
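1.dimshuffle (a method of tensor variables, called as x.dimshuffle(...)): reorders the dimensions of the input and returns a view of the original variable. The argument is a combination of the indices [0, 1, ..., ndim - 1] and any number of 'x' entries; each 'x' inserts a broadcastable dimension of size 1.
For example:
- .dimshuffle('x'): turns a scalar into a 1-D array
- .dimshuffle(0, 1): identical to the original 2-D array
- .dimshuffle(1, 0): swaps the two dimensions of a 2-D array, shape N * M becomes M * N
- .dimshuffle('x', 0): shape N becomes 1 * N
- .dimshuffle(0, 'x'): shape N becomes N * 1
- .dimshuffle(2, 0, 1): shape A * B * C becomes C * A * B
- .dimshuffle(0, 'x', 1): shape A * B becomes A * 1 * B
- .dimshuffle(1, 'x', 0): shape A * B becomes B * 1 * A
- .dimshuffle(1,): drops dimension 0, which must have size 1. Shape 1 * A becomes A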
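
The shape changes listed above can be checked with a NumPy analogue. This is only a sketch and not Theano code: the helper name dimshuffle_np is hypothetical, and it uses transpose, reshape and np.newaxis to imitate the patterns (unlike Theano, it may return a copy rather than a view).

```python
import numpy as np

def dimshuffle_np(a, *pattern):
    """Hypothetical NumPy stand-in for Theano's dimshuffle (illustration only)."""
    a = np.asarray(a)
    kept = [p for p in pattern if p != 'x']
    dropped = [d for d in range(a.ndim) if d not in kept]
    # A dimension may only be dropped if its size is 1.
    for d in dropped:
        if a.shape[d] != 1:
            raise ValueError("cannot drop dimension %d of size %d" % (d, a.shape[d]))
    new_shape = tuple(a.shape[d] for d in kept)
    if a.ndim > 1:
        # Put the kept axes in the requested order; dropped axes go last.
        a = a.transpose(kept + dropped)
    a = a.reshape(new_shape)  # removes the trailing size-1 axes
    # Insert a length-1 axis wherever the pattern contains 'x'.
    index = tuple(np.newaxis if p == 'x' else slice(None) for p in pattern)
    return a[index]

m = np.zeros((2, 3))                        # A = 2, B = 3
print(dimshuffle_np(m, 1, 0).shape)         # (3, 2)
print(dimshuffle_np(m, 0, 'x', 1).shape)    # (2, 1, 3)
print(dimshuffle_np(m, 1, 'x', 0).shape)    # (3, 1, 2)
print(dimshuffle_np(np.zeros((1, 4)), 1).shape)  # (4,)
```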
2.theano.tensor.concatenate: joins tensors along an existing axis

import theano
import numpy as np
import theano.tensor as T
ones = theano.shared(np.float32([[1, 2, 3], [4, 5, 6],[7, 8, 9]]))
print(ones.get_value())
--->>
[[ 1.  2.  3.]
 [ 4.  5.  6.]
 [ 7.  8.  9.]]
result = T.concatenate([ones,ones], axis=0)
print(result.eval())
--->>
[[ 1.  2.  3.]
 [ 4.  5.  6.]
 [ 7.  8.  9.]
 [ 1.  2.  3.]
 [ 4.  5.  6.]
 [ 7.  8.  9.]]
result = T.concatenate([ones, ones], axis=1)
print(result.eval())
--->>
[[ 1.  2.  3.  1.  2.  3.]
 [ 4.  5.  6.  4.  5.  6.]
 [ 7.  8.  9.  7.  8.  9.]]

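For 2-D operands, axis=0 is the direction of the first dimension (rows are stacked) and axis=1 is the direction of the second dimension (columns are appended).
3.theano.tensor.dot(a, b): matrix multiplication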

import theano
import numpy as np
import theano.tensor as T
ones = theano.shared(np.float32([[1, 2, 3],[4, 5, 6], [7, 8, 9]]))
print(ones.get_value())
--->>
[[ 1.  2.  3.]
 [ 4.  5.  6.]
 [ 7.  8.  9.]]
result = T.dot(ones, ones)
print(result.eval())
--->>
[[  30.   36.   42.]
 [  66.   81.   96.]
 [ 102.  126.  150.]]

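4.theano.gradient.grad_clip(x, lower_bound, upper_bound): gradient clipping
x is the input whose gradient should be clipped; lower_bound and upper_bound are the lower and upper limits of the gradient. The forward value of x is left unchanged.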
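
The behavior of grad_clip can be pictured with a small NumPy sketch. It is not Theano code: the two helper functions are hypothetical and only imitate the idea that the forward pass is the identity while the gradient flowing back through x is clipped to [lower_bound, upper_bound].

```python
import numpy as np

def grad_clip_forward(x):
    # Forward pass: the value passes through unchanged.
    return x

def grad_clip_backward(upstream_grad, lower_bound, upper_bound):
    # Backward pass: the incoming gradient is clipped to the given bounds.
    return np.clip(upstream_grad, lower_bound, upper_bound)

x = 2.0
y = grad_clip_forward(x) ** 2          # y = x**2 = 4.0
unclipped = 2 * x                      # dy/dx = 2x = 4.0
clipped = grad_clip_backward(unclipped, -1.0, 1.0)
print(unclipped, clipped)              # 4.0 1.0
```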
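5.theano.tensor.le(a, b): returns the elementwise result of a <= b as an 'int8' tensor; can also be written a <= b
6.theano.tensor.gt(a, b): returns the elementwise result of a > b as an 'int8' tensor; can also be written a > b
7.theano.tensor.lt(a, b): returns the elementwise result of a < b as an 'int8' tensor; can also be written a < b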
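
As a rough NumPy analogue of the three comparisons (not Theano code; NumPy returns booleans, so the cast to int8 is added here to mimic the 'int8' result described above):

```python
import numpy as np

a = np.array([1, 5, 3])
b = np.array([2, 5, 1])

le = (a <= b).astype(np.int8)   # elementwise a <= b
gt = (a > b).astype(np.int8)    # elementwise a > b
lt = (a < b).astype(np.int8)    # elementwise a < b

print(le)  # [1 1 0]
print(gt)  # [0 0 1]
print(lt)  # [1 0 0]
```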

2018-05-24

1.theano.tensor.switch(cond, ift, iff): elementwise selection; outputs ift where the condition cond holds and iff where it does not.
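
The closest NumPy analogue of switch is np.where. The sketch below is not Theano code; it uses a leaky-ReLU style rule (an arbitrary illustration with a slope of 0.1) to show the elementwise choice between the two branches.

```python
import numpy as np

x = np.array([-2.0, 0.0, 3.0])

# Where x > 0 take x itself (the "ift" branch), otherwise 0.1 * x (the "iff" branch).
out = np.where(x > 0, x, 0.1 * x)
print(out)
```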
2.theano.scan(fn, sequences=None, outputs_info=None, non_sequences=None, n_steps=None, truncate_gradient=-1, go_backwards=False, mode=None, name=None, profile=False, allow_gc=None, strict=False)

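fn: a lambda or def function that describes one step of the scan. Its arguments are passed in the order sequences, outputs_info, non_sequences, and its outputs become the return value of theano.scan.
sequences: a list of Theano variables or dictionaries. Each dictionary has the keys input (the variable) and taps (a list of integers). Any plain Theano variable in the list is wrapped in a dictionary automatically, with taps set to [0]. For example, sequences = [dict(input=Sequence1, taps=[-3, 2, -1]), Sequence2, dict(input=Sequence3, taps=3)] passes Sequence1[t-3], Sequence1[t+2], Sequence1[t-1], Sequence2[t], Sequence3[t+3] to fn, in that order. If the sequences differ in length, scan truncates them to the shortest one, which makes it convenient to pass a very long arange, e.g. sequences=[coefficients, theano.tensor.arange(max_coefficients_supported)].
outputs_info: a list of Theano variables or dictionaries describing the initial state of the outputs, so each entry should have the same shape as the corresponding output. At every step, the value passed to fn is the output of the previous iteration. If the loop is not recursive and is just a map operation, this argument can be omitted.
non_sequences: a list of "constant" arguments. "Constant" here is relative to the values in outputs_info that are updated: these are variables that are not changed during a scan step. Some variables in the graph can be used without being listed here explicitly, but listing them yields a simpler graph and speeds up optimization and execution. A common use is passing shared variables as non_sequences.
n_steps: an int or Theano scalar giving the number of iterations. If an input sequence has fewer elements than n_steps, scan raises an error. If n_steps is not given, scan works out the number of steps from its inputs.
truncate_gradient: the number of steps after which the gradient is truncated when using BPTT (backpropagation through time). Truncation lowers the cost of computing the gradient while keeping the error within an acceptable range. The typical use case is an RNN (recurrent neural network).
strict: a flag for checking shared variables. When set, scan verifies that every shared variable used in fn is listed in non_sequences, and raises an error otherwise.
Return value:
A tuple of the form (outputs, updates). outputs is a Theano variable, or a list of Theano variables, each of which holds the results of all iteration steps. updates is a dictionary mapping each shared variable used in scan to the expression that updates it.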
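
The argument order and the truncation rule described above can be illustrated with a pure-Python loop. This is a heavily simplified sketch under stated assumptions, not how theano.scan is implemented: the name scan_sketch is hypothetical, it handles a single output, every sequence uses taps=[0], and it works on plain Python values instead of symbolic variables.

```python
def scan_sketch(fn, sequences=(), outputs_info=None, non_sequences=(), n_steps=None):
    """Hypothetical, simplified stand-in for theano.scan (one output, taps=[0])."""
    if sequences:
        shortest = min(len(s) for s in sequences)   # sequences are cut to the shortest
        if n_steps is None:
            n_steps = shortest
        elif n_steps > shortest:
            raise ValueError("a sequence has fewer elements than n_steps")
    elif n_steps is None:
        raise ValueError("n_steps is required when there are no sequences")

    state = outputs_info
    outputs = []
    for t in range(n_steps):
        # Argument order: sequences, then the previous output, then non_sequences.
        args = [s[t] for s in sequences]
        if outputs_info is not None:
            args.append(state)
        args.extend(non_sequences)
        state = fn(*args)
        outputs.append(state)          # the result of every step is kept
    return outputs

# Recurrence: running sum over a sequence, starting from 0.
print(scan_sketch(lambda x, acc: acc + x, sequences=[[1, 2, 3, 4]], outputs_info=0))
# Map without recurrence: the long range is truncated to the 3 coefficients.
print(scan_sketch(lambda c, i: c * 2 ** i, sequences=[[1, 0, 2], range(10)]))
```

With outputs_info omitted, fn receives only the sequence elements and the non_sequences, which is the map case mentioned above.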
For example, given k, to compute A**k elementwise, plain Python would be:

result = 1
for i in range(k):
    result = result * A

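With theano.scan():
There are three things to handle: first, the initial value assigned to result; second, the computation that accumulates result; third, the unchanging variable A. Unchanging variables are passed to scan as non_sequences.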

import theano
import theano.tensor as T

k = T.iscalar("k")
A = T.vector("A")

# Symbolic description of the result
result, updates = theano.scan(fn=lambda prior_result, A: prior_result * A,
                              outputs_info=T.ones_like(A),
                              non_sequences=A,
                              n_steps=k)
# we only care about A**k, but scan has provided us with A**1 through A**k,
# Discard the values that we don't care about. Scan is smart enough to
# notice this and not waste memory saving them.
final_result = result[-1]
# compiled function that returns A**k
power = theano.function(inputs=[A, k], outputs=final_result, updates=updates)

print(power(range(3), 2))
#===>[ 0.  1.  4.]

2018-06-11

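1.Common data types in theano.tensor

The first thing to learn in Theano is theano.tensor, its basic data structure, which plays a role similar to NumPy arrays. theano.tensor supports many element types such as double, int, uchar, and float, but int and float are the ones used most. float is common because GPUs generally work in float32, so double is rarely needed when writing programs. The common types are:
Scalars:
iscalar (int scalar), fscalar (float scalar)
1-D vectors: ivector (int vector), fvector (float vector)
2-D matrices: fmatrix (float matrix), imatrix (int matrix)
3-D float tensor: ftensor3
4-D float tensor: ftensor4
For other types, just change the first letter.
2.Theano programming style
In conventional programming, we first assign a value to the independent variable and then pass it to a function to compute the dependent variable. In Theano, we first declare the variable x symbolically (no value needed), then write the expression, and only at the end supply a value for x to compute the output y. For example, to compute 2 cubed, we would write: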
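
The first-letter convention can be written down as a small lookup. This is a pure-Python sketch, not part of Theano: the names PREFIX_TO_DTYPE, BASE_NAMES and dtype_of are hypothetical, and the prefix table follows the letters used by Theano's constructor names (b, w, i, l, f, d, c, z). Constructors without a prefix, such as matrix, use config.floatX, which the sketch reports as None.

```python
# Prefix letter -> element dtype, as used in names like ivector or ftensor3.
PREFIX_TO_DTYPE = {
    'b': 'int8', 'w': 'int16', 'i': 'int32', 'l': 'int64',
    'f': 'float32', 'd': 'float64',
    'c': 'complex64', 'z': 'complex128',
}
BASE_NAMES = {'scalar', 'vector', 'matrix', 'row', 'col', 'tensor3', 'tensor4'}

def dtype_of(constructor_name):
    """Return the dtype implied by a constructor name, or None for config.floatX."""
    if constructor_name in BASE_NAMES:
        return None  # no prefix: the dtype comes from config.floatX
    prefix, base = constructor_name[0], constructor_name[1:]
    if prefix in PREFIX_TO_DTYPE and base in BASE_NAMES:
        return PREFIX_TO_DTYPE[prefix]
    raise ValueError("unknown constructor name: %r" % constructor_name)

print(dtype_of('ivector'))   # int32
print(dtype_of('ftensor4'))  # float32
print(dtype_of('dmatrix'))   # float64
print(dtype_of('matrix'))    # None
```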

import theano
x = theano.tensor.iscalar('x')  # declare an int scalar variable x
y = theano.tensor.pow(x, 3)  # define y = x^3
f = theano.function([x], y)  # compile a function with input x and output y
print(f(2))  # evaluate f(x) at x = 2

example:

import theano
import theano.tensor as T

a = T.matrix()
b = T.matrix()
e = T.fscalar()
c = a * b
d = T.dot(a, b)
g = T.ivector()
f = g * e
f1 = theano.function([a, b], c)
f2 = theano.function([a, b], d)
f3 = theano.function([g, e], f)
A = [[1, 2], [3, 4]]  # 2x2 matrix
B = [[2, 4], [6, 8]]  # 2x2 matrix
C = [[1, 2], [3, 4], [5, 6]]  # 3x2 matrix
G = [1, 2, 3, 4]
print(f1(A, B))
print(f2(A, B))
# print(f1(C, B))
print(f2(C, B))
print(f3(G, 0.5))

Output:

[[  2.   8.]
 [ 18.  32.]]
[[ 14.  20.]
 [ 30.  44.]]
[[ 14.  20.]
 [ 30.  44.]
 [ 46.  68.]]
[ 0.5  1.   1.5  2. ]

Note: the inputs of theano.function must be given as a list, written in [ ].
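
The numbers in the output above can be cross-checked with plain NumPy. This is a sketch for verification only, not Theano code; it also suggests why f1(C, B) is commented out in the example: an elementwise product of a 3x2 and a 2x2 array has incompatible shapes, which NumPy reports as a ValueError.

```python
import numpy as np

A = np.array([[1, 2], [3, 4]], dtype=float)          # 2x2
B = np.array([[2, 4], [6, 8]], dtype=float)          # 2x2
C = np.array([[1, 2], [3, 4], [5, 6]], dtype=float)  # 3x2
G = np.array([1, 2, 3, 4])

print(A * B)         # elementwise product, same as f1(A, B)
print(np.dot(A, B))  # matrix product, same as f2(A, B)
print(np.dot(C, B))  # (3x2) . (2x2) -> 3x2, same as f2(C, B)
print(G * 0.5)       # int vector times float scalar, same as f3(G, 0.5)

try:
    C * B            # shapes (3, 2) and (2, 2) cannot be broadcast together
    elementwise_ok = True
except ValueError:
    elementwise_ok = False
print(elementwise_ok)
```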
