Nirat Saini
I am a PhD student in the Department of Computer Science at the University of Maryland (UMD), College Park. I work on problems in computer vision with Prof. Abhinav Shrivastava. Previously, I pursued an integrated Bachelor's-Master's degree at Dayalbagh Educational Institute, Agra, India.
In the past, I've been fortunate to have worked with Mainak Chatterjee,
Huzur Saran,
Hongchang Wang,
Taniya Misra,
Pallabi Ghosh,
Varun Manjunatha,
Navaneeth Bodla,
Xiao Zhang,
Bharat Singh,
and many awesome student collaborators!
Email / CV / Google Scholar / GitHub
I'm actively seeking job opportunities. Please get in touch if you believe I can be a good fit for your team.
Research Interests
I work at the intersection of computer vision, natural language processing and cognition.
This primarily consists of developing algorithms based on how humans understand concepts and
perceive the world. Humans have the ability to be creative: to understand and apply concepts from
different domains. For machines to be generalizable, I study Compositional Zero-Shot Learning
(CZSL), so that, with limited data, computers learn to compose different concepts that
are unseen. I'm interested in recognition (video understanding) and generation of fine-grained unseen compositions, for editing
and generating new images and videos (Gen AI).
InVi: Object Insertion in Videos Using Off-the-Shelf Diffusion Models
Nirat Saini,
Navaneeth Bodla,
Ashish Shrivastava,
Avinash Ravichandran,
Xiao Zhang,
Abhinav Shrivastava,
Bharat Singh
Under Submission.
Utilize pre-trained text-to-image diffusion models to achieve video in-painting
(inserting a new object into a video),
eliminating the need to train a video generation model.
Paper coming soon.
Beyond Seen Primitive Concepts for Attribute-Object Compositional Learning
Nirat Saini,
Khoi Pham,
Abhinav Shrivastava
Conference on Computer Vision and Pattern Recognition (CVPR), 2024.
Open-Vocabulary Compositional Zero-shot Learning (OV-CZSL) enables learning of entirely new attributes, objects, and their compositions.
Models can expand beyond their seen vocabulary.
WayEx: Waypoint Exploration using a Single demonstration
Mara Levy,
Nirat Saini,
Abhinav Shrivastava
IEEE International Conference on Robotics and Automation (ICRA), 2024.
WayEx: a novel method for learning complex goal-conditioned robotics tasks from a
single demonstration, using a unique reward function and knowledge expansion.
Chop & Learn: Recognizing and Generating Object-State Compositions
Nirat Saini*,
Hanyu Wang*,
Archana Swaminathan,
Vinoj Jayasundara,
Bo He,
Kamal Gupta,
Abhinav Shrivastava
International Conference on Computer Vision (ICCV), 2023.
Benchmark suite for fruits, vegetables and various cutting styles from multiple views.
Compositional Image Generation supports generating
unseen cutting styles of different objects.
Media Blogs: UMD News
Disentangling Visual Embeddings for Attributes and Objects
Nirat Saini,
Khoi Pham,
Abhinav Shrivastava
Conference on Computer Vision and Pattern Recognition (CVPR), 2022. (Oral Presentation)
Compositional Zero-shot Learning (CZSL) for attributes and objects is solved by
disentangling concepts in the visual feature space,
and using those for hallucinating novel complex concepts.
Recognizing Actions using Object States
Nirat Saini,
Bo He,
Gaurav Shrivastava,
Saketh Rambhatla,
Abhinav Shrivastava
Workshop on the Elements of Reasoning: Objects, Structure and Causality at ICLR, 2022.
Using the initial and final states of objects (two frames), learns to classify the action being
performed in the video that causes the object state change.
Learning Graphs for Knowledge Transfer with Limited Labels
Pallabi Ghosh,
Nirat Saini,
Larry Davis,
Abhinav Shrivastava
Conference on Computer Vision and Pattern Recognition (CVPR), 2021.
Dynamically updates the adjacency matrices of graphs used in Graph Convolutional Networks (GCNs); can be
used for semi-supervised learning and
zero/few-shot action recognition tasks.
All about knowledge graphs for actions
Pallabi Ghosh,
Nirat Saini,
Larry Davis,
Abhinav Shrivastava
arXiv, 2020.
Analyze different types of Knowledge Graphs (KGs): action embeddings,
action-object embeddings, and visual embeddings, for the zero/few-shot action recognition task.
Explicit Bias Discovery in Visual Question Answering Models
Varun Manjunatha,
Nirat Saini,
Larry Davis
Conference on Computer Vision and Pattern Recognition (CVPR), 2019.
Analyze the tendency of Visual Question Answering (VQA) models to rely on statistical biases
in the dataset to answer questions, rather than paying attention to the visual content.
Multi-bid spectrum auctions in dynamic spectrum access networks with spatial reuse
Nirat Saini,
Mainak Chatterjee
International Conference on Communication Systems and Networks (COMSNETS), 2017.
Design an auction-based spectrum allocation framework for transactions between primary users and
secondary users, while maximizing the spatial reuse capacity.