Lovish Chum


I am an MS student in Computer Science, part of the Fu Foundation School of Engineering at Columbia University, where I work on self-supervised learning of video representations. I am advised by Carl Vondrick.

From 2017 to 2019, I was at IIIT Hyderabad as part of CVIT, where I worked with CV Jawahar and Vineeth N Balasubramanian.

Previously, I was an undergrad at IIT Kanpur working with Aditya Nigam and Phalguni Gupta.

GitHub  /  LinkedIn

profile photo


Research Projects

I'm interested in incorporating causal structure into computer vision tasks.

project image

Spatio-temporal Scene Graph Generation (Ongoing)

Lovish Chum, with Xudong Lin and Carl Vondrick
Part of MS thesis, 2020

project image

Encoding Uncertainty in Video Representation Learning

Lovish Chum, with Ruoshi Liu and Carl Vondrick
Independent Study, 2020

Can the stochasticity of the future in videos help learn better representations? To answer this question, we relax the point-based contrastive loss to an n-sphere-based contrastive loss. We build our experiments on top of Dense Predictive Coding (DPC), a self-supervised representation learning technique. Training on Kinetics-400 and fine-tuning on UCF-101 and HMDB-51 show that the latter two datasets do not contain videos with diverse futures. Hence, we build a block toy dataset to control the stochasticity and evaluate the technique.
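The point-based contrastive objective that DPC-style methods start from can be sketched as a standard InfoNCE loss: a predicted future embedding is pulled toward its true future and pushed away from the other candidates. This is a simplified single-vector illustration (DPC's actual loss operates over spatio-temporal feature blocks, and the n-sphere relaxation is not shown); the function name and temperature value are illustrative.

```python
import numpy as np

def info_nce(pred, targets, pos_idx, temperature=0.07):
    """Point-based InfoNCE loss for one predicted future embedding.

    pred:    (D,) predicted future embedding.
    targets: (N, D) candidate future embeddings; row pos_idx is the true one.
    """
    # Cosine similarity between the prediction and every candidate future.
    pred = pred / np.linalg.norm(pred)
    targets = targets / np.linalg.norm(targets, axis=1, keepdims=True)
    logits = targets @ pred / temperature
    # Softmax cross-entropy with the true future as the positive class.
    logits -= logits.max()  # numerical stability
    probs = np.exp(logits) / np.exp(logits).sum()
    return -np.log(probs[pos_idx])
```

The loss is near zero when the prediction aligns with the positive candidate and large when it aligns with a negative, which is exactly the "point-based" behavior the project relaxes.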

project image

Causal Physical Reasoning

Lovish Chum, with Chengzhi Mao
Causal Inference Course, 2020
[Report] [Code]

In this project, we take a step towards learning causal visual features. For this, we create a simulation environment called Causal-PHYRE, in which we control the outcome, the cause, and the confounder for each task. We start by observing whether a vanilla neural network architecture can detect the causal structure of tasks from simulated videos. Further, we simulate interventions in the environment to explicitly aid learning the causal structure of the underlying tasks. Experiments show that the intervention-based learning strategy improves the detection of the real cause rather than the confounder. This concurs with the Pearl Causal Hierarchy (PCH), which asserts that causal structure cannot be learned from observational data alone.
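The reason interventions help separate cause from confounder can be illustrated with a toy structural model (this is a standalone illustration, not Causal-PHYRE itself): a hidden confounder drives both a candidate cause and the outcome, so they correlate observationally; intervening on the candidate cause severs its link to the confounder and the correlation vanishes.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Toy structural model: confounder U drives both the candidate cause C
# and the outcome Y; C itself has no causal effect on Y.
u = rng.normal(size=n)
c = u + rng.normal(scale=0.1, size=n)
y = u + rng.normal(scale=0.1, size=n)

# Observationally, C and Y look strongly related (spurious correlation).
obs_corr = np.corrcoef(c, y)[0, 1]

# Intervening, do(C = c), cuts the arrow from U to C: we set C by fiat,
# ignoring U, and the dependence between C and Y disappears.
c_do = rng.normal(size=n)
y_do = u + rng.normal(scale=0.1, size=n)
int_corr = np.corrcoef(c_do, y_do)[0, 1]
```

A model trained only on the observational regime would happily use C as a predictor of Y; only the interventional regime reveals that the association is spurious.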

project image

Optional Depth Pathway for Mask R-CNN

Lovish Chum, with Jianjin Xu and Zhaoyang Wang
Robot Learning Course, 2019
[Report] [Code]

Instance segmentation is one of the most important perception tasks in computer vision. We present an approach that optionally uses the depth map accompanying an image to improve performance on this task. We observe that depth information incorporated through Spatially-Adaptive (DE)normalization (SPADE) yields a significant improvement on the NYUv2 dataset. Additionally, we observe that the Optional Depth Module (ODM) prevents performance from degrading even when depth data is unavailable to the network.
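The core of SPADE-style conditioning is to normalize a feature map and then re-modulate it with spatially varying scale and shift parameters predicted from the conditioning signal (here, depth). The sketch below is a minimal, hedged version: the real SPADE block uses learned convolutions to predict the modulation maps, whereas here hypothetical per-channel weights stand in for those convolutions.

```python
import numpy as np

def spade_modulate(features, depth, w_gamma, w_beta, eps=1e-5):
    """Minimal SPADE-style modulation with per-channel 1x1 'convolutions'.

    features: (C, H, W) backbone activations.
    depth:    (H, W) depth map resized to the feature resolution.
    w_gamma, w_beta: (C,) weights of hypothetical 1x1 convs mapping the
        depth map to spatially varying scale and shift parameters.
    """
    # Parameter-free normalization over each channel's spatial extent.
    mu = features.mean(axis=(1, 2), keepdims=True)
    sigma = features.std(axis=(1, 2), keepdims=True)
    normalized = (features - mu) / (sigma + eps)
    # Depth-conditioned, spatially adaptive scale and shift.
    gamma = w_gamma[:, None, None] * depth[None]
    beta = w_beta[:, None, None] * depth[None]
    return normalized * (1.0 + gamma) + beta
```

With zero weights the block reduces to plain normalization, which hints at why an optional pathway can degrade gracefully when depth is missing.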

project image

Beyond Supervised Learning: A Computer Vision Perspective

Lovish Chum, Anbumani Subramanian, Vineeth N. Balasubramanian & C. V. Jawahar
Journal of Indian Institute of Science, 2019

Fully supervised deep learning-based methods have created a profound impact in various fields of computer science. Compared to classical methods, supervised deep learning-based techniques face scalability issues as they require huge amounts of labeled data and, more significantly, are unable to generalize to multiple domains and tasks. In recent years, a lot of research has been targeted towards addressing these issues within the deep learning community. Although there have been extensive surveys on learning paradigms such as semi-supervised and unsupervised learning, there are few timely reviews after the emergence of deep learning. In this paper, we provide an overview of the contemporary literature surrounding alternatives to fully supervised learning in the deep learning context. First, we summarize the relevant techniques that fall between the paradigm of supervised and unsupervised learning. Second, we take autonomous navigation as a running example to explain and compare different models. Finally, we highlight some shortcomings of current methods and suggest future directions.

Earlier Projects

project image

Contact Lens Detection for Iris Recognition

Computer Analysis of Image and Patterns, 2015
[Publication] [Report] [Slides]

Collected a dataset in which users wear soft and cosmetic contact lenses. Empirically showed that wearing a lens degrades EER by an average of 3.10% for soft lenses and 17.34% for cosmetic lenses. Further, we propose a cosmetic lens detection approach based on Local Phase Quantization (LPQ) and Binary Gabor Pattern (BGP). Experiments conducted on the publicly available IIITD Vista, IIITD Cogent, ND 2010, and our self-collected dataset indicate that our method outperforms previous lens detection techniques in terms of Correct Classification Rate and False Acceptance Rate. The results suggest that a comprehensive texture descriptor combining the blur tolerance of LPQ and the robustness of BGP is well suited for cosmetic lens detection.


I have a keen interest in community learning. While at IIIT-H, I actively took the initiative to share what I learned. I firmly believe in the following adage:

If you want to learn something,
read about it.
If you want to understand something,
write about it.
If you want to master something,
teach it.

project image

Bayesian Machine Learning

CVIT, IIIT Hyderabad

A short course on Bayesian Machine Learning covering EM, Variational Inference and Latent Variable Models in a succinct manner.

project image

Linear Algebra

CVIT, IIIT Hyderabad

A short course on Linear Algebra covering matrix transformations and commonly used matrix decompositions.

Thanks to Jon Barron for the website source.