Source Themes

What Do We Maximize in Self-Supervised Learning?

We examine self-supervised learning methods to provide an information-theoretic understanding of their construction. As a first step, we demonstrate how information-theoretic quantities can be obtained for a deterministic network. This enables us to show how SSL methods can be (re)discovered from first principles and from their assumptions about the data distribution. Furthermore, we empirically validate these assumptions, confirming our novel understanding.
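
One common way to obtain information-theoretic quantities for a deterministic network is to treat the learned representation as approximately Gaussian and estimate its differential entropy from the log-determinant of its covariance. The sketch below illustrates that idea only; it is an assumption about the estimator, not necessarily the one used in the paper.

```python
# Minimal sketch: Gaussian (log-determinant) entropy estimate of a
# deterministic representation. An illustrative assumption, not the
# paper's estimator.
import numpy as np

def gaussian_entropy(z: np.ndarray, eps: float = 1e-5) -> float:
    """Differential entropy of representations z (n_samples x dim),
    under a multivariate-Gaussian assumption."""
    n, d = z.shape
    cov = np.cov(z, rowvar=False) + eps * np.eye(d)  # regularize for stability
    sign, logdet = np.linalg.slogdet(cov)
    return 0.5 * (d * np.log(2 * np.pi * np.e) + logdet)

# Example: entropy of random features standing in for an encoder output.
z = np.random.randn(1024, 64)
print(gaussian_entropy(z))
```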

Pre-Train Your Loss: Easy Bayesian Transfer Learning with Informative Priors

We show that we can learn highly informative posteriors from the source task, through supervised or self-supervised approaches, which then serve as the basis for priors that modify the whole loss surface on the downstream task. This approach enables significant performance gains and more data-efficient learning on a variety of downstream classification and segmentation tasks.
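
As a hedged sketch of the idea, one simple way to turn a learned source posterior into a downstream prior is to penalize deviation of the downstream weights from the pre-trained posterior mean under a diagonal Gaussian. The helper names below are illustrative, not the paper's released code.

```python
# Minimal PyTorch sketch: a Gaussian prior centered at pre-trained weights,
# added to the downstream loss. Names (prior_mean, prior_var) are
# illustrative assumptions, not the paper's API.
import torch
import torch.nn.functional as F

def loss_with_informative_prior(model, prior_mean, prior_var,
                                logits, targets, scale=1.0):
    """prior_mean / prior_var: dicts mapping parameter names to tensors
    learned on the source task."""
    nll = F.cross_entropy(logits, targets)
    prior_term = 0.0
    for name, p in model.named_parameters():
        mu, var = prior_mean[name], prior_var[name]
        prior_term = prior_term + ((p - mu) ** 2 / (2.0 * var)).sum()
    return nll + scale * prior_term / len(targets)
```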

Tabular Data: Deep Learning is Not All You Need

We explore whether deep models should be the recommended option for tabular data by rigorously comparing recently proposed deep models to XGBoost on a variety of datasets. Our study shows that XGBoost outperforms these deep models across the datasets, including the datasets used in the papers that proposed them. We also show that an ensemble of deep models and XGBoost performs better on these datasets than XGBoost alone.
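
A minimal sketch of the kind of ensemble described above: average the predicted class probabilities of XGBoost and a deep model. The 50/50 weighting is illustrative; the paper's ensembling and weighting may differ.

```python
# Minimal sketch: probability-level ensemble of XGBoost and a deep model.
# The equal weighting is an illustrative assumption.
import numpy as np
from xgboost import XGBClassifier

def ensemble_proba(xgb_model: XGBClassifier, deep_proba: np.ndarray,
                   X: np.ndarray, w: float = 0.5) -> np.ndarray:
    """deep_proba: class probabilities already produced by the deep model on X."""
    xgb_proba = xgb_model.predict_proba(X)
    return w * xgb_proba + (1.0 - w) * deep_proba
```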

The Dual Information Bottleneck

We introduce a new framework that resolves some of the known drawbacks of the Information Bottleneck. We provide a theoretical analysis of the framework, characterize the structure of its solutions, and present a novel variational formulation for DNNs.
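
For context, the standard Information Bottleneck trades compression against prediction via a KL distortion, and the dual framework modifies that distortion term. The sketch below assumes the dual problem swaps the arguments of the KL; see the paper for the precise formulation.

```latex
% Sketch, assuming the dual problem swaps the arguments of the KL distortion;
% consult the paper for the exact objective.
\mathcal{L}_{\mathrm{IB}}
  = I(X;\hat{X}) + \beta\, \mathbb{E}\!\left[ D_{\mathrm{KL}}\!\left[ p(y\mid x)\,\|\,p(y\mid \hat{x}) \right] \right],
\qquad
\mathcal{L}_{\mathrm{dualIB}}
  = I(X;\hat{X}) + \beta\, \mathbb{E}\!\left[ D_{\mathrm{KL}}\!\left[ p(y\mid \hat{x})\,\|\,p(y\mid x) \right] \right].
```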

Information in Infinite Ensembles of Infinitely-Wide Neural Networks

We study the generalization properties of infinite ensembles of infinitely-wide neural networks and report analytical and empirical investigations in search of signals that correlate with generalization.
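
The infinite-width, infinite-ensemble regime can be probed through the NNGP/NTK correspondence. Below is a minimal sketch using the neural_tangents library to compute the exact infinite-width kernels of a small fully-connected architecture; the tooling choice is an assumption, not the paper's released code.

```python
# Minimal sketch: exact NNGP and NTK kernels of a small fully-connected
# architecture in the infinite-width limit, via neural_tangents
# (a tooling assumption, not the paper's code).
import jax.numpy as jnp
from neural_tangents import stax

init_fn, apply_fn, kernel_fn = stax.serial(
    stax.Dense(512), stax.Relu(),
    stax.Dense(512), stax.Relu(),
    stax.Dense(1),
)

x1 = jnp.ones((4, 16))  # toy inputs
x2 = jnp.ones((3, 16))
kernels = kernel_fn(x1, x2, ('nngp', 'ntk'))  # infinite-width covariances
print(kernels.nngp.shape, kernels.ntk.shape)
```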

Neural Correlates of Learning Pure Tones or Natural Sounds in the Auditory Cortex

We analyze perceptual learning of pure tones in the auditory cortex. Using a novel computational model, we show that overrepresentation of the learned tones does not improve over the course of training.

Attentioned Convolutional LSTM Inpainting Network for Anomaly Detection in Videos

A semi-supervised model for detecting anomalies in videos, inspired by the Video Pixel Network (VPN). We extend the Convolutional LSTM video encoder of the VPN with a novel convolution-based attention mechanism. This approach could serve as a component in applications requiring visual common sense.
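
A minimal sketch of a convolution-based attention gate over ConvLSTM feature maps, as an illustration of the idea rather than the paper's architecture:

```python
# Minimal PyTorch sketch: a convolutional spatial-attention gate applied to
# ConvLSTM hidden states (illustrative, not the paper's exact design).
import torch
import torch.nn as nn

class ConvAttention(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        self.score = nn.Conv2d(channels, 1, kernel_size=3, padding=1)

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        # h: (batch, channels, H, W) hidden state from a ConvLSTM encoder
        attn = torch.sigmoid(self.score(h))  # spatial attention map in [0, 1]
        return h * attn                      # re-weight features spatially

x = torch.randn(2, 64, 32, 32)
print(ConvAttention(64)(x).shape)
```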

Sequence Modeling Using a Memory Controller Extension for LSTM

We extend the standard LSTM architecture by augmenting it with an additional gate that produces a memory-control vector. This vector is fed back to the LSTM instead of the original output prediction. By decoupling the LSTM prediction from its role as a memory controller, we allow each output to specialize in its own task.
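
A minimal sketch of the idea described above: an LSTM cell with an extra gate that produces a memory-control vector, which is fed back as the recurrent state in place of the output. Gate naming and wiring details are illustrative assumptions.

```python
# Minimal PyTorch sketch: LSTM cell with an additional memory-control gate
# whose output (m) is fed back instead of the prediction (h). Illustrative,
# not the paper's exact parameterization.
import torch
import torch.nn as nn

class MemoryControllerLSTMCell(nn.Module):
    def __init__(self, input_size: int, hidden_size: int):
        super().__init__()
        # 5 gates: input, forget, cell candidate, output, memory-control
        self.linear = nn.Linear(input_size + hidden_size, 5 * hidden_size)

    def forward(self, x, state):
        m_prev, c_prev = state  # m_prev replaces h_prev as recurrent feedback
        gates = self.linear(torch.cat([x, m_prev], dim=-1))
        i, f, g, o, r = gates.chunk(5, dim=-1)
        i, f = torch.sigmoid(i), torch.sigmoid(f)
        o, r = torch.sigmoid(o), torch.sigmoid(r)
        c = f * c_prev + i * torch.tanh(g)
        h = o * torch.tanh(c)   # output/prediction path
        m = r * torch.tanh(c)   # memory-control feedback path
        return h, (m, c)
```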

Opening the Black Box of Deep Neural Networks via Information

We demonstrate the effectiveness of the Information-Plane visualization of DNNs: (i) most of the training epochs are spent on compression of the input to an efficient representation; (ii) the representation-compression phase begins when the SGD steps change from a fast drift to stochastic relaxation; (iii) the converged layers lie very close to the information-bottleneck theoretical bound, and the maps to the hidden layers satisfy the IB self-consistent equations; (iv) the training time is dramatically reduced when adding more hidden layers.
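
The Information-Plane quantities I(X;T) and I(T;Y) are typically estimated by discretizing layer activations. Below is a minimal binning-based sketch of such an estimator; it is one common approach, not necessarily the paper's exact procedure.

```python
# Minimal sketch: binning-based mutual information between discretized layer
# activations T and labels Y, as used to draw Information-Plane trajectories
# (a common estimator, not necessarily the paper's exact one).
import numpy as np

def discrete_mi(t_bins, y) -> float:
    """t_bins: per-sample discretized activation patterns, y: labels."""
    n = len(y)
    joint, pt, py = {}, {}, {}
    for ti, yi in zip(t_bins, y):
        joint[(ti, yi)] = joint.get((ti, yi), 0) + 1
        pt[ti] = pt.get(ti, 0) + 1
        py[yi] = py.get(yi, 0) + 1
    mi = 0.0
    for (ti, yi), c in joint.items():
        mi += (c / n) * np.log2(c * n / (pt[ti] * py[yi]))
    return mi

def bin_activations(T: np.ndarray, n_bins: int = 30):
    """Map each sample's activation vector to a hashable bin pattern."""
    edges = np.linspace(T.min(), T.max(), n_bins)
    return [tuple(row) for row in np.digitize(T, edges)]
```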
