Relative Fisher information and natural gradient for learning large modular models

Ke Sun, Frank Nielsen

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

5 Scopus citations

Abstract

Fisher information and natural gradient have provided deep insights and powerful tools for artificial neural networks. However, the related analysis becomes increasingly difficult as the learner's structure grows large and complex. This paper takes a preliminary step in a new direction. We extract a local component from a large neural system and define its relative Fisher information metric, which accurately describes this small component and is invariant to the other parts of the system. This concept is important because the geometric structure is much simplified and it can be easily applied to guide the learning of neural networks. We provide an analysis of a list of commonly used components and demonstrate how to use this concept to further improve optimization.
Original language: English (US)
Title of host publication: 34th International Conference on Machine Learning, ICML 2017
Publisher: International Machine Learning Society (IMLS)
Pages: 5058-5079
Number of pages: 22
ISBN (Print): 9781510855144
State: Published - Jan 1 2017

Bibliographical note

KAUST Repository Item: Exported on 2020-12-30
Acknowledgements: The authors would like to thank the anonymous reviewers and Yann Ollivier for the helpful comments. This work was mainly conducted when the first author was a postdoctoral researcher at Ecole Polytechnique.
