Importance Sampling

Importance sampling focuses the computation on informative/important samples.

Posted by Karan Dwivedi on August 7, 2017

This post is about the idea of importance sampling. I will review the paper: Biased Importance Sampling for Deep Neural Network Training (Katharopoulos and Fleuret, 2017).

What is Importance Sampling?

When training a model, it is clear that not all samples are equally important; many of them are handled properly after a few epochs of training, and most could be ignored at that point without impacting the final model.
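As a concrete illustration (my own minimal sketch, not code from the paper), the snippet below samples a minibatch with probability proportional to an arbitrary per-sample importance score and reweights each drawn sample by 1 / (N * p_i), which keeps the expected gradient equal to that of uniform sampling. The function and variable names are mine.

```python
import numpy as np

def sample_important_batch(scores, batch_size, rng=None):
    """Sample indices proportionally to `scores` and return unbiasing weights.

    scores: nonnegative per-sample importance values, shape (N,).
    """
    rng = rng or np.random.default_rng()
    scores = np.asarray(scores, dtype=float)
    p = scores / scores.sum()                  # sampling distribution over the dataset
    idx = rng.choice(len(scores), size=batch_size, replace=True, p=p)
    weights = 1.0 / (len(scores) * p[idx])     # importance-sampling correction 1 / (N * p_i)
    return idx, weights

# Usage: weight each drawn sample's loss before averaging, e.g.
# idx, w = sample_important_batch(per_sample_loss, 32)
# batch_loss = np.mean(w * per_sample_loss[idx])
```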

Biased Importance Sampling for Deep Neural Network Training

In summary, the contributions of this work are:

  • The use of the loss instead of the gradient norm to estimate the importance of a sample.
  • The creation of a model able to approximate the loss with low computational overhead (a rough sketch of the idea follows this list).
  • The development of an online algorithm that minimizes a soft max-loss on the training set through importance sampling.
  • A demonstration that this method leads to better generalization.
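For intuition only, here is a rough sketch of loss-based importance sampling under my own simplifications: a running record of each sample's last observed loss stands in for the paper's separate loss-approximating model, and a small smoothing term keeps low-loss samples from being starved entirely. None of the names below come from the authors' released code.

```python
import numpy as np

class LossHistorySampler:
    """Sketch: sample minibatches proportionally to an estimate of each sample's loss."""

    def __init__(self, n_samples, smoothing=1e-2, init_loss=1.0):
        # Start with a uniform (optimistic) loss estimate so every sample is seen early on.
        self.loss_estimate = np.full(n_samples, init_loss)
        self.smoothing = smoothing
        self.rng = np.random.default_rng()

    def sample(self, batch_size):
        scores = self.loss_estimate + self.smoothing      # smoothing avoids zero probabilities
        p = scores / scores.sum()
        idx = self.rng.choice(len(scores), size=batch_size, replace=True, p=p)
        weights = 1.0 / (len(scores) * p[idx])            # unbiasing weights, as above
        return idx, weights

    def update(self, idx, fresh_losses):
        # Refresh the estimates for the samples just trained on.
        self.loss_estimate[idx] = fresh_losses
```

In a training loop, one would call sample() to build the minibatch, apply the weights to the per-sample losses, and feed the freshly computed losses back through update().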

Questions

  1. What about catastrophic forgetting? Unimportant samples might become important later.

  2. How can this be used for self-supervision with very large amounts of data, where we may not have access to the complete training set at the same time?

  3. Keeping a separate model and using its loss seems somewhat redundant. Why not obtain the signal from the same model being trained?

They have also open-sourced their code on GitHub (see the related Reddit thread).