
Nirat Saini

I am a PhD student in the Department of Computer Science at the University of Maryland (UMD), College Park. I work on problems in computer vision with Prof. Abhinav Shrivastava. Previously, I pursued an integrated Bachelor's-Master's degree at Dayalbagh Educational Institute, Agra, India.

In the past, I've been fortunate to have worked with Mainak Chatterjee, Huzur Saran, Hongchang Wang, Taniya Misra, Pallabi Ghosh, Varun Manjunatha, Navaneeth Bodla, Xiao Zhang, Bharat Singh, and many awesome student collaborators!

Email  /  CV  /  Google Scholar  /  Github  /  LinkedIn

I'm actively seeking job opportunities. Please get in touch if you believe I can be a good fit for your team.

Research Interests

I work at the intersection of computer vision, natural language processing, and cognition. This primarily consists of developing algorithms based on how humans understand concepts and perceive the world. Humans have the ability to be creative: they understand and apply concepts from different domains. To make machines similarly generalizable, I study Compositional Zero-Shot Learning (CZSL), so that, with limited data, computers learn to compose different concepts into combinations they have not seen. I'm interested in Recognition (Video Understanding) and Generation of fine-grained unseen compositions, for editing and generating new images and videos (Gen AI).

InVi: Object Insertion in Videos Using Off-the-Shelf Diffusion Models
Nirat Saini, Navaneeth Bodla, Ashish Shrivastava, Avinash Ravichandran, Xiao Zhang,
Abhinav Shrivastava, Bharat Singh
Under Submission.

TL;DR Uses pre-trained Text-to-Image Diffusion models to achieve video in-painting (inserting a new object into a video), eliminating the need to train a video generation model.
Paper coming soon.

Beyond Seen Primitive Concepts for Attribute-Object Compositional Learning
Nirat Saini, Khoi Pham, Abhinav Shrivastava
Conference on Computer Vision and Pattern Recognition (CVPR), 2024.

TL;DR Open-Vocabulary Compositional Zero-shot Learning (OV-CZSL) enables learning of entirely new attributes, objects, and their compositions. Models can expand beyond their seen vocabulary.
Project / Paper

WayEx: Waypoint Exploration using a Single demonstration
Mara Levy, Nirat Saini, Abhinav Shrivastava
IEEE International Conference on Robotics and Automation (ICRA), 2024.

TL;DR WayEx: a novel method for learning complex goal-conditioned robotics tasks from a single demonstration, using a unique reward function and knowledge expansion.
Project / Paper

Chop & Learn: Recognizing and Generating Object-State Compositions
Nirat Saini*, Hanyu Wang*, Archana Swaminathan, Vinoj Jayasundara, Bo He, Kamal Gupta,
Abhinav Shrivastava
International Conference on Computer Vision (ICCV), 2023.

TL;DR Benchmark suite for fruits, vegetables and various cutting styles from multiple views. Compositional Image Generation supports generating unseen cutting styles of different objects.
Project / Paper / Media Blogs: (UMD News, TechXplore, MARKTECHPost)

Disentangling Visual Embeddings for Attributes and Objects
Nirat Saini, Khoi Pham, Abhinav Shrivastava
Conference on Computer Vision and Pattern Recognition (CVPR), 2022. (Oral Presentation)

TL;DR Compositional Zero-shot Learning (CZSL) for attributes and objects is tackled by disentangling concepts in the visual feature space and using the disentangled features to hallucinate novel complex concepts.
Project / Paper / Code

Recognizing Actions using Object States
Nirat Saini, Bo He, Gaurav Shrivastava, Saketh Rambhatla, Abhinav Shrivastava
Workshop on the Elements of Reasoning: Objects, Structure and Causality at ICLR, 2022.

TL;DR Using the initial and final states of objects (two frames), learns to classify the action being performed in the video that causes the object state change.
Project / Paper

Learning Graphs for Knowledge Transfer with Limited Labels
Pallabi Ghosh, Nirat Saini, Larry Davis, Abhinav Shrivastava
Conference on Computer Vision and Pattern Recognition (CVPR), 2021.

TL;DR Dynamically updates the adjacency matrices of graphs used in Graph Convolutional Networks (GCNs); applicable to semi-supervised learning and zero/few-shot action recognition tasks.
Project / Paper

All about knowledge graphs for actions
Pallabi Ghosh, Nirat Saini, Larry Davis, Abhinav Shrivastava
arXiv, 2020.

TL;DR Analyzes different types of Knowledge Graphs (KGs), namely action embeddings, action-object embeddings, and visual embeddings, for the zero/few-shot action recognition task.
Paper

Explicit Bias Discovery in Visual Question Answering Models
Varun Manjunatha, Nirat Saini, Larry Davis
Conference on Computer Vision and Pattern Recognition (CVPR), 2019.

TL;DR Analyzes the tendency of Visual Question Answering (VQA) models to rely on statistical biases in the dataset to answer questions, rather than paying attention to the visual content.
Paper

Multi-bid spectrum auctions in dynamic spectrum access networks with spatial reuse
Nirat Saini, Mainak Chatterjee
International Conference on Communication Systems and Networks (COMSNETS), 2017.

TL;DR Designs an auction-based spectrum allocation framework for transactions between primary users and secondary users, while maximizing the spatial reuse capacity.
Paper


© 2022 Nirat Saini. Template credits Jon Barron.