# Theano

## Tutorial

### Adding two Scalars

```
import numpy
import theano.tensor as T
from theano import function
x = T.dscalar('x')
y = T.dscalar('y')
z = x + y
f = function([x, y], z)
f(1, 2)                                  # returns array(3.0)
numpy.allclose(z.eval({x: 1, y: 2}), 3)  # True
```

- `numpy.allclose`: returns True if two arrays are element-wise equal within a tolerance.

QUESTION: What exactly is a tensor? What does `numpy.allclose` do?
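
On the second question: `numpy.allclose(a, b)` compares arrays element-wise within a tolerance (`rtol=1e-05`, `atol=1e-08` by default). A minimal sketch with made-up values:

```
import numpy

a = numpy.array([1.0, 2.0])
b = numpy.array([1.0, 2.0 + 1e-9])  # within the default tolerance
c = numpy.array([1.0, 2.1])         # outside it

print(numpy.allclose(a, b))  # True
print(numpy.allclose(a, c))  # False
```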

### Adding two Matrices

```
x = T.dmatrix('x')
y = T.dmatrix('y')
z = x + y
f = function([x, y], z)
f([[1, 2], [3, 4]], [[10, 20], [30, 40]])  # array([[11., 22.], [33., 44.]])
f(numpy.array([[1, 2], [3, 4]]), numpy.array([[10, 20], [30, 40]]))  # NumPy arrays work too
```

### Logistic Function

```
x = T.dmatrix('x')
s = 1 / (1 + T.exp(-x))
s2 = (1 + T.tanh(x / 2)) / 2
```

The logistic function is applied elementwise because all of its component operations (division, addition, exponentiation, and negation) are themselves elementwise.

`s` is equivalent to `s2`: this follows from the identity tanh(x/2) = 2·logistic(x) − 1, so (1 + tanh(x/2))/2 = logistic(x).
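
A quick numerical check of the equivalence, reusing `x`, `s`, and `s2` from the block above (the compiled function name and the sample input are my own):

```
import numpy
from theano import function

logistic = function([x], [s, s2])     # compile both formulations at once
out_s, out_s2 = logistic([[0, 1], [-1, -2]])
print(numpy.allclose(out_s, out_s2))  # True
```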

QUESTION: What is `e` in the exponential? QUESTION: Is a `T.dmatrix` necessarily two-dimensional?

### Setting a Default Value for an Argument

This makes use of the `In` class, which allows you to specify properties of your function's parameters in greater detail.

```
from theano import In
x, y = T.dscalars('x', 'y')
z = x + y
f = function([x, In(y, value=1)], z)
f(33)     # y falls back to its default of 1, so this returns array(34.0)
f(33, 2)  # an explicit second argument overrides the default: array(35.0)
```
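
`In` also accepts a `name`, which lets a default-valued argument be passed by keyword. A short sketch in the same spirit (the extra variable `w` and the values are illustrative):

```
w = T.dscalar('w')
z2 = (x + y) * w
g = function([x, In(y, value=1), In(w, value=2, name='w_by_name')], z2)
g(33)               # (33 + 1) * 2 = array(68.0)
g(33, w_by_name=5)  # (33 + 1) * 5 = array(170.0)
```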

### Copy functions

We can use copy() to create a similar accumulator but with its own internal state using the swap parameter, which is a dictionary of shared variables to exchange:

```
import theano
import theano.tensor as T
state = theano.shared(0)
inc = T.iscalar('inc')
accumulator = theano.function([inc], state, updates=[(state, state+inc)], on_unused_input='warn')
new_state = theano.shared(0)
new_accumulator = accumulator.copy(swap={state:new_state})
null_accumulator = accumulator.copy(delete_updates=True)
```

Because the last copy passes `delete_updates=True`, the parameter `inc` becomes unused, so this copy only succeeds because the function was compiled with `on_unused_input='warn'`.
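
A quick sketch of how the three functions behave (this usage is my own illustration; the values follow from the update rules above):

```
accumulator(10)               # state: 0 -> 10
print(state.get_value())      # 10
new_accumulator(100)          # updates new_state, not state
print(new_state.get_value())  # 100
print(state.get_value())      # still 10: the two accumulators are independent
null_accumulator(9000)        # updates were deleted, so nothing changes
print(state.get_value())      # still 10
```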

### Using Random Numbers

Theano will allocate a NumPy RandomState object (a random number generator) for each such variable, and draw from it as necessary. We will call this sort of sequence of random numbers a random stream.

#### Brief Example

```
from theano.tensor.shared_randomstreams import RandomStreams
from theano import function
srng = RandomStreams(seed=234)
rv_u = srng.uniform((2,2))
rv_n = srng.normal((2,2))
f = function([], rv_u)
g = function([], rv_n, no_default_updates=True)
nearly_zeros = function([], rv_u + rv_u - 2 * rv_u)
```

`rv_u` draws from a uniform distribution, and `rv_n` from a normal distribution.

When we add the extra argument no_default_updates=True to function (as in g), then the random number generator state is not affected by calling the returned function. So, for example, calling g multiple times will return the same numbers.

An important remark is that a random variable is drawn at most once during any single function execution. So the nearly_zeros function is guaranteed to return approximately 0 (except for rounding error) even though the rv_u random variable appears three times in the output expression.
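
Concretely, using the functions compiled above (a sketch; the actual numbers depend on the seed):

```
f_val0 = f()
f_val1 = f()           # different numbers from f_val0: f advances the stream
g_val0 = g()
g_val1 = g()           # same numbers as g_val0: no_default_updates froze the state
print(nearly_zeros())  # all (nearly) zeros: rv_u is drawn only once per call
```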

#### Sharing Streams Between Functions

```
state_after_v0 = rv_u.rng.get_value().get_state()
nearly_zeros()  # this affects rv_u's generator
v1 = f()
rng = rv_u.rng.get_value(borrow=True)
rng.set_state(state_after_v0)
rv_u.rng.set_value(rng, borrow=True)
v2 = f()  # v2 != v1
v3 = f()  # v3 == v1
```

So a random stream's state can be saved and later replayed.

### A Real Example: Logistic Regression

```
import numpy
import theano
import theano.tensor as T

rng = numpy.random

N = 400       # training sample size
feats = 784   # number of input variables

# generate a dataset: D = (input_values, target_class)
D = (rng.randn(N, feats), rng.randint(size=N, low=0, high=2))
training_steps = 10000

# Declare Theano symbolic variables
x = T.dmatrix("x")
y = T.dvector("y")

# initialize the weight vector w randomly
#
# this and the following bias variable b
# are shared so they keep their values
# between training iterations (updates)
w = theano.shared(rng.randn(feats), name="w")

# initialize the bias term
b = theano.shared(0., name="b")

print("Initial model:")
print(w.get_value())
print(b.get_value())

# Construct Theano expression graph
p_1 = 1 / (1 + T.exp(-T.dot(x, w) - b))            # Probability that target = 1
prediction = p_1 > 0.5                             # The prediction thresholded
xent = -y * T.log(p_1) - (1 - y) * T.log(1 - p_1)  # Cross-entropy loss function
cost = xent.mean() + 0.01 * (w ** 2).sum()         # The cost to minimize
gw, gb = T.grad(cost, [w, b])                      # Compute the gradient of the cost
                                                   # w.r.t. weight vector w and
                                                   # bias term b
                                                   # (we shall return to this in a
                                                   # following section of this tutorial)

# Compile
train = theano.function(
    inputs=[x, y],
    outputs=[prediction, xent],
    updates=((w, w - 0.1 * gw), (b, b - 0.1 * gb)))
predict = theano.function(inputs=[x], outputs=prediction)

# Train
for i in range(training_steps):
    pred, err = train(D[0], D[1])

print("Final model:")
print(w.get_value())
print(b.get_value())
print("target values for D:")
print(D[1])
print("prediction on D:")
print(predict(D[0]))
```

QUESTION: I can't follow the steps yet. It's rare to get a complete piece of code like this; study it carefully. QUESTION: I don't quite understand the bias term `b`.
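
On the second question, a plain-NumPy sketch of what a single prediction computes (the values here are hypothetical, purely to make the roles of `w` and `b` concrete): `b` shifts the decision boundary away from the origin, just like the intercept in a linear model.

```
import numpy

x_row = numpy.random.randn(784)  # one input example
w_val = numpy.random.randn(784)  # learned weight vector
b_val = 0.5                      # learned bias: shifts the decision boundary

p_1 = 1 / (1 + numpy.exp(-numpy.dot(x_row, w_val) - b_val))  # P(target = 1)
print(p_1 > 0.5)                 # thresholded prediction, as in predict() above
```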