I’m a CDS Faculty Fellow at the NYU Center for Data Science, where I mostly work with Andrew Gordon Wilson and Yann LeCun on Bayesian deep networks, information theory, and self-supervised learning.
I completed my Ph.D. under the supervision of Prof. Naftali Tishby and Prof. Haim Sompolinsky at the Hebrew University of Jerusalem. My Ph.D. focused on the connection between deep neural networks (DNNs) and information theory: I worked to develop a deeper understanding of DNNs based on information theory and to apply it to large-scale problems. I received the Google PhD Fellowship.
In parallel, I work as a researcher on the AI & data science research team of Intel’s Advanced Analytics group, where I am involved in several projects, mainly the development of deep learning, computer vision, and sensory-data solutions for healthcare, manufacturing, and marketing, for both internal and external use.
In 2019–2020, I had the opportunity to work as a research student at Google Brain (CA, USA), where I explored the generalization ability of DNNs using information-theoretic tools.
In the past, I was also involved in developing several projects for Wikipedia.
In my free time, I volunteer as a developer at The Public Knowledge Workshop.
And I love basketball :)
PhD in Computer Science and Neuroscience, 2021
The Hebrew University of Jerusalem
MSc in Computer Science and Neuroscience, 2016
The Hebrew University of Jerusalem
BSc in Computer Science and Bioinformatics, 2014
The Hebrew University of Jerusalem
Empirical and theoretical study of DNNs based on information-theoretic principles.
Developing novel deep learning, computer vision, and sensory-data solutions for healthcare, manufacturing, sales, and marketing, for both internal and external use.
Selected Projects
We explored whether deep models should be a recommended option for tabular data by rigorously comparing the new deep models to XGBoost on various datasets. Our study shows that XGBoost outperforms these deep models across the datasets, including the datasets used in the papers that proposed the deep models. We also show that an ensemble of deep models and XGBoost performs better on these datasets than XGBoost alone.
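The ensemble itself can be as simple as averaging predicted class probabilities. A minimal sketch, assuming scikit-learn and XGBoost, with an MLP standing in for the deep models and a uniform 50/50 weighting (the dataset, models, and weights here are illustrative assumptions, not the paper’s exact setup):

```python
# Sketch: average the predicted probabilities of a gradient-boosted tree
# model and a simple neural network. Dataset, model choices, and the 50/50
# weighting are illustrative assumptions.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from xgboost import XGBClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

xgb = XGBClassifier(n_estimators=200).fit(X_train, y_train)
mlp = MLPClassifier(hidden_layer_sizes=(64, 64), max_iter=500,
                    random_state=0).fit(X_train, y_train)

# Simple unweighted average of the two models' probability estimates.
proba = 0.5 * xgb.predict_proba(X_test) + 0.5 * mlp.predict_proba(X_test)
pred = proba.argmax(axis=1)
print("ensemble accuracy:", (pred == y_test).mean())
```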
A new framework that resolves some of the known drawbacks of the Information Bottleneck. We provide a theoretical analysis of the framework, characterize the structure of its solutions, and present a novel variational formulation for DNNs.
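For reference, the classical Information Bottleneck objective whose drawbacks the framework addresses (this is the standard formulation, not the new framework’s own objective):

```latex
% Classical Information Bottleneck: compress X into a representation T while
% preserving information about Y; \beta trades compression against prediction.
\min_{p(t \mid x)} \; I(X;T) - \beta \, I(T;Y)
```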
We study the generalization properties of infinite ensembles of infinitely wide neural networks, reporting analytical and empirical investigations in search of signals that correlate with generalization.
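Predictions of such an infinite ensemble can be computed in closed form with neural-tangent-kernel machinery. A minimal sketch using the neural-tangents library; the toy architecture and data are my placeholders, not the configurations studied:

```python
# Sketch: exact posterior of an infinite ensemble of infinitely wide networks
# trained by gradient descent on MSE, via the NTK (neural-tangents library).
# The toy architecture and random data are illustrative assumptions.
import jax.numpy as jnp
import neural_tangents as nt
from neural_tangents import stax

init_fn, apply_fn, kernel_fn = stax.serial(
    stax.Dense(512), stax.Relu(),
    stax.Dense(512), stax.Relu(),
    stax.Dense(1),
)

x_train = jnp.linspace(-1.0, 1.0, 20).reshape(-1, 1)
y_train = jnp.sin(3.0 * x_train)
x_test = jnp.linspace(-1.5, 1.5, 50).reshape(-1, 1)

# Closed-form mean and covariance of the ensemble after training to
# convergence; 'ntk' selects the neural tangent kernel.
predict_fn = nt.predict.gradient_descent_mse_ensemble(
    kernel_fn, x_train, y_train, diag_reg=1e-4)
mean, cov = predict_fn(x_test=x_test, get='ntk', compute_cov=True)
print(mean.shape, cov.shape)
```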
A semi-supervised model for detecting anomalies in videos, inspired by the Video Pixel Network (VPN). We extend the Convolutional LSTM video encoder of the VPN with a novel convolution-based attention mechanism. This approach could be a component in applications requiring visual common sense.
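The building block being extended is the ConvLSTM cell, a standard LSTM with convolutions in place of matrix multiplications. A minimal PyTorch sketch of that base cell (the attention extension itself is omitted):

```python
# Sketch of a standard ConvLSTM cell in PyTorch: the usual LSTM gates,
# with convolutions replacing matrix multiplications. This is the base
# cell the encoder extends; the attention mechanism is not shown.
import torch
import torch.nn as nn

class ConvLSTMCell(nn.Module):
    def __init__(self, in_ch, hid_ch, kernel_size=3):
        super().__init__()
        pad = kernel_size // 2
        # One convolution computes all four gates at once.
        self.gates = nn.Conv2d(in_ch + hid_ch, 4 * hid_ch,
                               kernel_size, padding=pad)

    def forward(self, x, state):
        h, c = state
        i, f, o, g = self.gates(torch.cat([x, h], dim=1)).chunk(4, dim=1)
        c = torch.sigmoid(f) * c + torch.sigmoid(i) * torch.tanh(g)
        h = torch.sigmoid(o) * torch.tanh(c)
        return h, (h, c)

# Usage: one step on a batch of 8 frames of size 32x32 with 3 channels.
cell = ConvLSTMCell(3, 16)
x = torch.randn(8, 3, 32, 32)
h = c = torch.zeros(8, 16, 32, 32)
out, (h, c) = cell(x, (h, c))
print(out.shape)  # torch.Size([8, 16, 32, 32])
```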
We extend the standard LSTM architecture by augmenting it with an additional gate that produces a memory-control vector. This vector is fed back to the LSTM in place of the original output prediction. By decoupling the LSTM’s prediction from its role as a memory controller, we allow each output to specialize in its own task.
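One plausible reading of this modification, sketched below: an LSTM cell with a fifth gate that computes the recurrent (memory-control) vector separately from the emitted output. The names and exact wiring are my assumptions, not the paper’s equations:

```python
# Sketch of an LSTM cell with an extra "memory control" gate: the vector fed
# back as the recurrent state (m) has its own gate, decoupled from the
# emitted output (h). Wiring is an illustrative assumption.
import torch
import torch.nn as nn

class ControlGateLSTMCell(nn.Module):
    def __init__(self, in_dim, hid_dim):
        super().__init__()
        # Five gates instead of four: i, f, o, g, plus the memory-control gate m.
        self.gates = nn.Linear(in_dim + hid_dim, 5 * hid_dim)

    def forward(self, x, state):
        m_prev, c_prev = state
        i, f, o, g, m = self.gates(torch.cat([x, m_prev], dim=-1)).chunk(5, dim=-1)
        c = torch.sigmoid(f) * c_prev + torch.sigmoid(i) * torch.tanh(g)
        h = torch.sigmoid(o) * torch.tanh(c)  # emitted output prediction
        m = torch.sigmoid(m) * torch.tanh(c)  # memory-control vector, fed back
        return h, (m, c)

cell = ControlGateLSTMCell(10, 20)
x = torch.randn(4, 10)
state = (torch.zeros(4, 20), torch.zeros(4, 20))
h, state = cell(x, state)
print(h.shape)  # torch.Size([4, 20])
```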
We demonstrate the effectiveness of the Information-Plane visualization of DNNs: (i) most of the training epochs are spent on compressing the input into an efficient representation; (ii) the representation-compression phase begins when the SGD steps change from a fast drift to stochastic relaxation; (iii) the converged layers lie very close to the information bottleneck theoretical bound, and the maps to the hidden layers satisfy the IB self-consistent equations; and (iv) training time is dramatically reduced when adding more hidden layers.
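The Information-Plane coordinates of a layer are the mutual-information estimates I(X;T) and I(T;Y). For small networks these are commonly estimated by discretizing the activations, as in the sketch below (the 30-bin grid and the toy data are arbitrary choices of mine):

```python
# Sketch: binning-based mutual-information estimate used to place a layer's
# activations T on the Information Plane. Discretize T, then compute I(X;T)
# and I(T;Y) from empirical frequencies. The 30-bin grid is arbitrary.
import numpy as np

def discrete_mi(a, b):
    """I(A;B) in bits for two 1-D arrays of discrete labels."""
    n = len(a)
    joint = {}
    for pair in zip(a, b):
        joint[pair] = joint.get(pair, 0) + 1
    pa = {k: np.mean(a == k) for k in set(a)}
    pb = {k: np.mean(b == k) for k in set(b)}
    return sum((c / n) * np.log2((c / n) / (pa[x] * pb[y]))
               for (x, y), c in joint.items())

def layer_plane_coords(x_ids, T, y, n_bins=30):
    """Map activations T (n_samples, n_units) to (I(X;T), I(T;Y))."""
    bins = np.linspace(T.min(), T.max(), n_bins)
    # Each distinct binned activation pattern becomes one discrete symbol.
    t_ids = np.unique(np.digitize(T, bins), axis=0, return_inverse=True)[1]
    return discrete_mi(x_ids, t_ids), discrete_mi(t_ids, y)

# Usage: with each input treated as its own symbol, I(X;T) measures how many
# distinct inputs the layer's (binned) representation still separates.
rng = np.random.default_rng(0)
T = np.tanh(rng.normal(size=(256, 8)))  # fake layer activations
y = rng.integers(0, 2, size=256)        # labels
x_ids = np.arange(256)                  # each input is a unique symbol
print(layer_plane_coords(x_ids, T, y))
```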