Representation Compression in Deep Neural Network

Name: Representation Compression in Deep Neural Network
Start: 2018-08-29T00:00:00Z
Location: Google PhD Fellowship Summit, Mountain View, CA, USA

Image credit: Unsplash

Abstract

Understanding the groundbreaking performance of Deep Neural Networks is one of the greatest challenges to the scientific community today. In this work, we introduce an information theoretic viewpoint on the behavior of deep networks optimization processes and their generalization abilities. By studying the Information Plane, the plane of the mutual information between the input variable and the desired label, for each hidden layer. Specifically, we show that the training of the network is characterized by a rapid increase in the mutual information (MI) between the layers and the target label, followed by a longer decrease in the MI between the layers and the input variable. Further, we explicitly show that these two fundamental information-theoretic quantities correspond to the generalization error of the network, as a result of introducing a new generalization bound that is exponential in the representation compression. The analysis focuses on typical patterns of large-scale problems. For this purpose, we introduce a novel analytic bound on the mutual information between consecutive layers in the network. An important consequence of our analysis is a super-linear boost in training time with the number of non-degenerate hidden layers, demonstrating the computational benefit of the hidden layers.

Date

Aug 29, 2018 12:00 AM

Event

Google PhD Fellowship Summit

Location

Google PhD Fellowship Summit, Mountain View, CA, USA

Mountain View, CA,

Representation Compression in Deep Neural Network

Abstract

Ravid Shwartz-Ziv