User:Esraiak/sandbox
Simple example
A convolutional neural net is a feed-forward neural net that has convolutional layers.
A convolutional layer takes as input an M x N matrix (for example, the redness of each pixel of an input image) and produces as output another M' x N' matrix. Often convolutional layers are arranged in R parallel channels, and such stacks of convolutional layers are sometimes also called (M x N x R) convolutional layers. For clarity, let's continue with the R = 1 case.
Suppose we want to train a network to recognize features from a 13x13 pixel grayscale image (so the image is a real 13x13 matrix). It is reasonable to create a first layer with neurons that connect to small connected patches, since we expect these neurons to learn to recognize "local features" (like lines or blots).
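As a concrete sketch of such a layer, the following code slides a small kernel over every local patch of a 13x13 image, producing an 11x11 feature map (M' = M - k + 1 in each dimension). The kernel values here are illustrative, not learned weights:

```python
import numpy as np

def conv_layer(image, kernel):
    """Valid 2-D convolution: apply the kernel to every k x k patch.

    Each output entry is the dot product of the kernel with one local
    patch, so each "neuron" connects only to a small connected patch.
    """
    m, n = image.shape
    k, _ = kernel.shape
    out = np.zeros((m - k + 1, n - k + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            patch = image[i:i + k, j:j + k]
            out[i, j] = np.sum(patch * kernel)
    return out

image = np.random.rand(13, 13)             # grayscale image as a real 13x13 matrix
kernel = np.array([[-1.0, 0.0, 1.0]] * 3)  # a simple vertical-edge detector (illustrative)
features = conv_layer(image, kernel)
print(features.shape)  # (11, 11)
```

In a trained network the kernel entries would be the learned weights, shared across all patch positions; that weight sharing is what lets the layer detect the same local feature anywhere in the image.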
Viterbi
Given a sequence of observations, emission probabilities p(x, y) of observing y when the hidden state is x, and transition probabilities q(x, x') between hidden states, find the most likely path of hidden states.
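This problem is solved by dynamic programming over the best path ending in each state at each time step. A minimal sketch, where `emit_p[x][y]` plays the role of p(x, y) and `trans_p[x][x2]` the role of q(x, x'); the two-state example probabilities below are illustrative:

```python
def viterbi(obs, states, start_p, trans_p, emit_p):
    """Most likely hidden-state path for an observation sequence."""
    # V[t][x] = probability of the best path ending in state x at time t
    V = [{x: start_p[x] * emit_p[x][obs[0]] for x in states}]
    back = [{}]
    for t in range(1, len(obs)):
        V.append({})
        back.append({})
        for x in states:
            prob, prev = max(
                (V[t - 1][x0] * trans_p[x0][x] * emit_p[x][obs[t]], x0)
                for x0 in states
            )
            V[t][x] = prob
            back[t][x] = prev
    # Trace the best path backwards from the most probable final state
    last = max(V[-1], key=V[-1].get)
    path = [last]
    for t in range(len(obs) - 1, 0, -1):
        last = back[t][last]
        path.insert(0, last)
    return path

states = ('Healthy', 'Fever')
start_p = {'Healthy': 0.6, 'Fever': 0.4}
trans_p = {'Healthy': {'Healthy': 0.7, 'Fever': 0.3},
           'Fever': {'Healthy': 0.4, 'Fever': 0.6}}
emit_p = {'Healthy': {'normal': 0.5, 'cold': 0.4, 'dizzy': 0.1},
          'Fever': {'normal': 0.1, 'cold': 0.3, 'dizzy': 0.6}}
print(viterbi(['normal', 'cold', 'dizzy'], states, start_p, trans_p, emit_p))
# ['Healthy', 'Healthy', 'Fever']
```

The running time is O(T·S²) for T observations and S states, versus O(S^T) for enumerating all paths.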
The algorithm
Let N be a neural network with e edges. Below, x will denote vectors in R^m, y vectors in R^n, and w vectors in R^e. These are called inputs, outputs and weights respectively. The neural network gives a function f_N(w, x) which, given a weight w, maps an input x to an output y. We select an error function E(y, y') measuring the difference between two outputs. The standard choice is E(y, y') = |y - y'|^2, the square of the Euclidean distance between the vectors y and y'. The backpropagation algorithm takes as input a sequence of training examples (x_1, y_1), ..., (x_p, y_p) and produces a sequence of weights w_0, w_1, ..., w_p starting from some initial weight w_0, usually chosen to be random. These weights are computed in turn: we compute w_i using only (x_i, y_i, w_{i-1}) for i = 1, ..., p. The output of the backpropagation algorithm is then w_p, giving us a new function x -> f_N(w_p, x). The computation is the same in each step, so we describe only the case i = 1. Now we describe how to find w_1 from (x_1, y_1, w_0). This is done by considering a variable weight w and applying gradient descent to the function w -> E(f_N(w, x_1), y_1) to find a local minimum, starting at w = w_0. We then let w_1 be the minimizing weight found by gradient descent.
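The gradient-descent step above can be sketched numerically. Here `f_N` is a stand-in single linear layer (not the author's network), and the gradient of w -> E(f_N(w, x_1), y_1) is approximated by finite differences; backpropagation computes the same gradient analytically and far more efficiently:

```python
import numpy as np

def f_N(w, x):
    # Illustrative stand-in for the network function: one linear layer.
    return w @ x

def E(y, y_prime):
    # Square of the Euclidean distance between two outputs.
    return np.sum((y - y_prime) ** 2)

def descend(w0, x1, y1, lr=0.1, steps=200, h=1e-6):
    """Gradient descent on w -> E(f_N(w, x1), y1), starting at w0.

    The gradient is estimated component-wise by forward differences.
    """
    w = w0.astype(float).copy()
    for _ in range(steps):
        base = E(f_N(w, x1), y1)
        grad = np.zeros_like(w)
        for idx in np.ndindex(w.shape):
            wp = w.copy()
            wp[idx] += h
            grad[idx] = (E(f_N(wp, x1), y1) - base) / h
        w -= lr * grad
    return w

x1 = np.array([1.0, 2.0])
y1 = np.array([3.0])
w1 = descend(np.zeros((1, 2)), x1, y1)
print(E(f_N(w1, x1), y1))  # close to 0
```

In the full algorithm this descent would be repeated with (x_2, y_2) starting from w_1, and so on through all p training examples.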
The algorithm in coordinates
How does Aspirin relieve pain?
Aspirin contains acetylsalicylic acid (ASA). ASA enters the blood stream.