Infomax

Infomax, or the principle of maximum information preservation, is an optimization principle for artificial neural networks and other information processing systems. It prescribes that a function mapping a set of input values $x$ to a set of output values $z(x)$ should be chosen or learned so as to maximize the average Shannon mutual information between $x$ and $z(x)$, subject to a set of specified constraints and/or noise processes. Infomax algorithms are learning algorithms that perform this optimization. The principle was described by Linsker in 1988.^[1] The objective function is called the InfoMax objective.
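
As a rough illustration of the principle, the sketch below scores a few candidate deterministic maps $z(x)$ by the mutual information they preserve and picks the best one. The input distribution `p_x`, the constraint that the output is binary, and the helper names `mutual_information` and `joint_for_map` are assumptions made for this toy example; they do not come from Linsker's paper.

```python
import numpy as np

def mutual_information(joint):
    """Shannon mutual information I(X;Z) in bits from a joint table p(x, z)."""
    joint = np.asarray(joint, dtype=float)
    joint = joint / joint.sum()
    px = joint.sum(axis=1, keepdims=True)   # marginal p(x), shape (n_x, 1)
    pz = joint.sum(axis=0, keepdims=True)   # marginal p(z), shape (1, n_z)
    indep = px * pz                          # product of marginals
    mask = joint > 0                         # skip zero cells (0 * log 0 = 0)
    return float(np.sum(joint[mask] * np.log2(joint[mask] / indep[mask])))

def joint_for_map(p_x, mapping, n_out):
    """Joint table p(x, z) for a noiseless deterministic map x -> mapping[x]."""
    joint = np.zeros((len(p_x), n_out))
    for x, z in enumerate(mapping):
        joint[x, z] = p_x[x]
    return joint

# Hypothetical input distribution over four symbols.
p_x = np.array([0.5, 0.25, 0.125, 0.125])

# Candidate maps, all constrained to a binary output alphabet.
candidates = {
    "balanced": [0, 1, 1, 1],     # splits the probability mass 0.5 / 0.5
    "unbalanced": [0, 0, 1, 1],   # splits the probability mass 0.75 / 0.25
    "constant": [0, 0, 0, 0],     # discards all information about x
}

scores = {name: mutual_information(joint_for_map(p_x, m, 2))
          for name, m in candidates.items()}
best = max(scores, key=scores.get)
print(scores, best)
```

For a noiseless map the mutual information equals the output entropy, so under the binary-output constraint the map that splits the input probability mass evenly is the one selected here.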

As the InfoMax objective is difficult to compute exactly, a related approach uses two models that produce two outputs, $z_{1}(x)$ and $z_{2}(x)$, and maximizes the mutual information between them. This contrastive InfoMax objective is a lower bound on the InfoMax objective.^[2]
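
The two-output objective can be sketched numerically under a simplifying assumption that is not stated above: if the two scalar outputs are jointly Gaussian with correlation $\rho$, their mutual information is $-\tfrac{1}{2}\log(1-\rho^{2})$. The toy data, the weight vectors, and the function name `gaussian_mi` below are hypothetical and serve only to show that the objective rewards outputs that track structure shared by both views of the input.

```python
import numpy as np

def gaussian_mi(z1, z2):
    """Mutual information in nats between two scalar outputs,
    assuming (for this sketch) that they are jointly Gaussian."""
    rho = np.corrcoef(z1, z2)[0, 1]
    return -0.5 * np.log(1.0 - rho ** 2)

rng = np.random.default_rng(0)
n = 20000
s = rng.normal(size=n)  # hidden signal shared by both views

# Each view has one signal-bearing channel and one pure-noise channel.
view1 = np.stack([s + 0.3 * rng.normal(size=n), rng.normal(size=n)], axis=1)
view2 = np.stack([s + 0.3 * rng.normal(size=n), rng.normal(size=n)], axis=1)

w_signal = np.array([1.0, 0.0])  # linear model that reads the signal channel
w_noise = np.array([0.0, 1.0])   # linear model that reads the noise channel

mi_shared = gaussian_mi(view1 @ w_signal, view2 @ w_signal)
mi_unrelated = gaussian_mi(view1 @ w_noise, view2 @ w_signal)
print(mi_shared, mi_unrelated)
```

In this setup the pair of models that both read the signal-bearing channel scores far higher than a pair in which one model reads only noise, which is the behaviour a learning rule maximizing this objective would favour.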

Infomax, in its zero-noise limit, is related to the principle of redundancy reduction proposed for biological sensory processing by Horace Barlow in 1961,^[3] and applied quantitatively to retinal processing by Atick and Redlich.^[4]

Applications

Becker and Hinton (1992)^[2] showed that the contrastive InfoMax objective allows a neural network to learn to identify surfaces in one-dimensional random-dot stereograms.

Infomax has also been applied to independent component analysis (ICA), yielding an algorithm that finds independent signals by maximizing entropy. Infomax-based ICA was described by Bell and Sejnowski (1995)^[5] and by Nadal and Parga (1999).^[6]
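
A minimal sketch of infomax-style ICA follows. It passes the unmixed signals through a logistic nonlinearity and ascends the output entropy, which suits heavy-tailed sources. Note the assumptions: the update uses the natural-gradient form, a later refinement rather than the rule as originally published, and the function name `infomax_ica`, the learning rate, the iteration count, and the synthetic Laplace sources are all choices made for this example.

```python
import numpy as np

def infomax_ica(x, lr=0.1, n_iter=500):
    """Estimate an unmixing matrix for x of shape (n_signals, n_samples).

    Batch natural-gradient ascent on the entropy of y = logistic(W x).
    """
    n_signals, n_samples = x.shape
    w = np.eye(n_signals)
    for _ in range(n_iter):
        u = w @ x                           # current source estimates
        y = 1.0 / (1.0 + np.exp(-u))        # logistic outputs in (0, 1)
        # Natural-gradient direction: (I + E[(1 - 2y) u^T]) W
        grad = np.eye(n_signals) + (1.0 - 2.0 * y) @ u.T / n_samples
        w += lr * grad @ w
    return w

rng = np.random.default_rng(0)
sources = rng.laplace(size=(2, 5000))        # independent heavy-tailed sources
mixing = np.array([[1.0, 0.6], [0.4, 1.0]])  # hypothetical mixing matrix
observed = mixing @ sources

w = infomax_ica(observed)
gain = np.abs(w @ mixing)  # near a scaled permutation matrix if separation worked
print(np.round(gain, 3))
```

If separation succeeds, each row of `gain` has one dominant entry, meaning each recovered signal corresponds to a single source up to scale and ordering.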

References

[1] Linsker R (1988). "Self-organization in a perceptual network". IEEE Computer. 21 (3): 105–17. doi:10.1109/2.36. S2CID 1527671.
[2] Becker, Suzanna; Hinton, Geoffrey E. (January 1992). "Self-organizing neural network that discovers surfaces in random-dot stereograms". Nature. 355 (6356): 161–163. doi:10.1038/355161a0. ISSN 1476-4687. PMID 1729650.
[3] Barlow, H. (1961). "Possible principles underlying the transformations of sensory messages". In Rosenblith, W. (ed.). Sensory Communication. Cambridge MA: MIT Press. pp. 217–234.
[4] Atick JJ, Redlich AN (1992). "What does the retina know about natural scenes?". Neural Computation. 4 (2): 196–210. doi:10.1162/neco.1992.4.2.196. S2CID 17515861.
[5] Bell AJ, Sejnowski TJ (November 1995). "An information-maximization approach to blind separation and blind deconvolution". Neural Computation. 7 (6): 1129–59. CiteSeerX 10.1.1.36.6605. doi:10.1162/neco.1995.7.6.1129. PMID 7584893. S2CID 1701422.
[6] Nadal J.P., Parga N. (1999). "Sensory coding: information maximization and redundancy reduction". In Burdet, G.; Combe, P.; Parodi, O. (eds.). Neural Information Processing. World Scientific Series in Mathematical Biology and Medicine. Vol. 7. Singapore: World Scientific. pp. 164–171.

Bell AJ, Sejnowski TJ (December 1997). "The "Independent Components" of Natural Scenes are Edge Filters". Vision Res. 37 (23): 3327–38. doi:10.1016/S0042-6989(97)00121-1. PMC 2882863. PMID 9425547.
Linsker R (1997). "A local learning rule that enables information maximization for arbitrary input distributions". Neural Computation. 9 (8): 1661–65. doi:10.1162/neco.1997.9.8.1661. S2CID 42857188.
Stone, J. V. (2004). Independent Component Analysis: A tutorial introduction. Cambridge MA: MIT Press. ISBN 978-0-262-69315-8.


