This post is about the PredNet and CortexNet papers. I worked with them briefly while beginning to learn PyTorch and am now trying to use them in my Mini-P (video frame interpolation).

CortexNet

Motivation

We learn a lot in unsupervised settings. Babies learn to track faces. We learn to predict and visualise. Our visual system is robust to perturbations. Inspired by the human visual cortex, this fairly generic architecture models bottom-up (feedforward), top-down (feedback) and lateral (temporal) connections.

Interesting claims: the model learns to (1) compensate for camera egomotion, (2) predict object trajectories and (3) attend to one object at a time. (However, I don’t find explicit experiments demonstrating these claims; at least point 3 should have one.)

Model architecture and losses

[TODO] For now, look at Figure 1 and read its description first. Then read Section 3. Then look at Figure 2 and read its description, followed by the text below it.

Experiments

Predicting future frames
Video Classification

Results

I don’t see a proper comparison against other methods in the results. But a friend said that this worked better for him than all the other approaches in a related project on video stylization.

Code

Amazing code is available at their GitHub repo.

Definitely go through it if you are learning PyTorch.

PredNet

Relation to CortexNet

Ladder sort of network
Predictive Coding
Predecessor to CortexNet
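
To make the predictive coding point concrete: each layer predicts its own input and passes only the prediction error onward. Here is a minimal PyTorch sketch of the error-unit form described in the PredNet paper, where the rectified positive and negative differences are stacked along the channel axis. The function name `prediction_error` and the (batch, channels, height, width) layout are my own choices for illustration, not taken from the authors' code.

```python
import torch
import torch.nn.functional as F


def prediction_error(target, prediction):
    """Split the prediction error into rectified positive and negative parts.

    Both inputs are assumed to have shape (batch, channels, height, width).
    The output has twice as many channels as the inputs.
    """
    positive = F.relu(target - prediction)   # where the prediction was too low
    negative = F.relu(prediction - target)   # where the prediction was too high
    return torch.cat([positive, negative], dim=1)


# Hypothetical toy input: one image, one channel, two pixels.
target = torch.tensor([[[[1.0, 0.0]]]])
prediction = torch.tensor([[[[0.0, 1.0]]]])
error = prediction_error(target, prediction)
```

Summing this error tensor gives the L1 distance between target and prediction, which is why a mean over the error units can be used directly as a training signal.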

Differences:

Uses ConvLSTM
Much better experiments and results.
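
For readers who have not met ConvLSTM before, it is an LSTM whose gates are computed with convolutions, so the hidden state keeps its spatial layout. Below is a minimal PyTorch sketch of one cell. It is a simplified version (no peephole connections, standard sigmoid and tanh activations) and is not the PredNet authors' implementation; the class name `ConvLSTMCell` and all parameter names are assumptions made for illustration.

```python
import torch
import torch.nn as nn


class ConvLSTMCell(nn.Module):
    """Simplified ConvLSTM cell: LSTM gates computed by a single convolution."""

    def __init__(self, in_channels, hidden_channels, kernel_size=3):
        super().__init__()
        self.hidden_channels = hidden_channels
        # One convolution over [input, previous hidden state] yields all four gates.
        self.gates = nn.Conv2d(
            in_channels + hidden_channels,
            4 * hidden_channels,
            kernel_size,
            padding=kernel_size // 2,  # keeps height and width for odd kernels
        )

    def forward(self, x, state=None):
        if state is None:
            # Start from an all-zero hidden state and cell state.
            b, _, h, w = x.shape
            zeros = x.new_zeros(b, self.hidden_channels, h, w)
            state = (zeros, zeros)
        h_prev, c_prev = state
        stacked = torch.cat([x, h_prev], dim=1)
        i, f, o, g = torch.chunk(self.gates(stacked), 4, dim=1)
        c = torch.sigmoid(f) * c_prev + torch.sigmoid(i) * torch.tanh(g)
        h = torch.sigmoid(o) * torch.tanh(c)
        return h, c


# Hypothetical usage: run the cell over two time steps of a random clip.
cell = ConvLSTMCell(in_channels=3, hidden_channels=8)
frame = torch.randn(2, 3, 16, 16)
h1, c1 = cell(frame)
h2, c2 = cell(frame, (h1, c1))
```

The hidden state has the same height and width as the input, which is what lets a stack of these cells carry spatial information across frames.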

Predictive Coding

Review of PredNet and CortexNet papers


Karan Dwivedi

Computer Science Student, CV/DL Enthusiast
