User:Jimmy Novik/Nonlinear Mapping Function
To extend linear models to represent nonlinear functions of $x$, we can apply the linear model not to $x$ itself but to a nonlinearly transformed input $\phi(x)$.
Options
1. Choose a very generic φ, such as the infinite-dimensional one implicitly used by kernel machines based on the RBF kernel. (Not good for advanced problems; a sketch of this option follows the list.)
2. Manually engineer φ.
3. Use deep learning to learn φ: $y = f(x; \theta, w) = \phi(x; \theta)^\top w$, where the parameters $\theta$ are used to learn $\phi$ and the weights $w$ map $\phi(x)$ to the output.
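As a rough illustration of option 1, the sketch below replaces the implicit infinite-dimensional φ with a finite Gaussian RBF feature map centered on the four XOR training points (the same data as the example below) and fits a linear model on top. The choice of centers, the bandwidth `sigma`, and the least-squares fit are assumptions made for this sketch, not part of the text above.

```python
import numpy as np

# The four XOR inputs and targets (also the training set used below)
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([0., 1., 1., 0.])

# Finite Gaussian RBF feature map centered on the training points;
# sigma is an arbitrary choice for this illustration
sigma = 0.5
sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
Phi = np.exp(-sq_dists / (2 * sigma ** 2))

# Fit a linear model on the transformed inputs phi(x)
w, *_ = np.linalg.lstsq(Phi, y, rcond=None)
print(Phi @ w)  # ~[0, 1, 1, 0]: reproduces XOR on the training points
```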
Example: Learning (approximating) the XOR function
Goal Function
If we choose a linear model with parameters $w$ and $b$, then the model is defined as $f(x; w, b) = x^\top w + b$, which yields $w = 0$ and $b = \frac{1}{2}$ when solved by minimizing mean squared error over the four XOR points. The resulting model outputs $\frac{1}{2}$ for every input; a function that outputs a constant value over all of the inputs is not very useful.
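To verify this concretely, here is a minimal sketch that fits the linear model to the four XOR examples with ordinary least squares. The use of NumPy's `lstsq` and the appended constant feature for the bias are illustration choices, not part of the derivation above.

```python
import numpy as np

X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([0., 1., 1., 0.])

# Append a constant 1 to each input so the bias b is fit alongside w
A = np.hstack([X, np.ones((4, 1))])
theta, *_ = np.linalg.lstsq(A, y, rcond=None)

print(theta)      # -> [0, 0, 0.5]: w = 0 and b = 1/2
print(A @ theta)  # -> [0.5, 0.5, 0.5, 0.5]: constant over all inputs
```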
If we add a hidden layer of units $h$, such that $h = f^{(1)}(x; W, c)$ and $y = f^{(2)}(h; w, b)$, the complete model becomes $f(x; W, c, w, b) = f^{(2)}(f^{(1)}(x))$.
For this model to be anything other than linear, $f^{(1)}$ must include a nonlinear function. This is usually done with an affine transformation followed by a fixed nonlinear activation function:
$h = g(W^\top x + c)$, where $W$ provides the weights of the linear transformation and $c$ the biases.
The default recommended activation function is the rectified linear unit, or ReLU, defined as $g(z) = \max\{0, z\}$.
The complete network can now be specified as $f(x; W, c, w, b) = w^\top \max\{0, W^\top x + c\} + b$.
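As a minimal check, the sketch below evaluates this network on the four XOR inputs with one hand-picked exact solution. These particular values of $W$, $c$, $w$, $b$ are just one valid choice; a network trained by gradient descent could arrive at different parameters.

```python
import numpy as np

X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])

# One exact solution, chosen by hand so the network reproduces XOR
W = np.array([[1., 1.],
              [1., 1.]])
c = np.array([0., -1.])
w = np.array([1., -2.])
b = 0.

h = np.maximum(0, X @ W + c)  # hidden layer: ReLU(W^T x + c) per input
print(h @ w + b)              # -> [0, 1, 1, 0], the XOR function
```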