I am a graduate student in Computer Science at the University of Maryland, College Park, advised by Prof. Tom Goldstein. My broader research focus lies at the intersection of Machine Learning (ML) and Computer Vision, with the aim of building robust generalist models. I've been called "that memorization girl" because I wrote multiple papers studying memorization in diffusion models. Check out this video to get a peek into that part of my research. Since last summer, I have been working on video LLMs and ended up creating a long-video understanding benchmark, CinePile. I also worked on contextual captioning in Video-LLMs, and the pre-print will be out very soon! I am a recipient of the Kulkarni Fellowship, Amazon Research Fellowship, CVPR Doctoral Consortium Award, and Ann-Wylie Dissertation Fellowship.
Before starting my Ph.D., I received my bachelor's from IIT Madras and worked in industry for 8 years in various product and engineering roles. My pivot to machine learning happened during my start-up years, when I was building ML tools for fashion assistance. I’ve since been working towards building interpretable machine learning models with real-world applications.
I am always up for new collaborations; drop me an email if you want to chat! If you are looking for mentorship, send a short email introducing yourself and the topic of your interest.
Video captions created by current Video LLMs are quite descriptive. In this paper, we discuss the problem of contextual captioning and propose some ways to train such models. We are working hard to release the model too.
|
|
CinePile: A Long Video Question Answering Dataset and Benchmark
CVPR 2024 Synth4CV Best Paper Award
A long video understanding dataset with 300,000 training QAs and 5,000 evaluation QAs. Built on top of real human annotations using powerful LLMs with humans in the loop.
|
|
Can we measure the style similarity between images? We propose a way to extract style from images, which we call Contrastive Style Descriptors (CSD). Using this model, we study style replication in image generation models.
Gowthami Somepalli, Anubhav Gupta, Kamal Gupta, Shramay Palta, Micah Goldblum, Jonas Geiping, Abhinav Shrivastava, Tom Goldstein.
|
|
We study why diffusion models copy their training data and how to mitigate it. We found that text conditioning plays a major role, along with training data duplication. One way to mitigate copying is to use multiple captions per data point during training.
|
|
Understanding training data replication in diffusion models. We examine DDPM models on Celeb-A and Oxford Flowers, and Stable Diffusion v1.4 on LAION-Aesthetics.
|
|
Can Neural Nets Learn the Same Model Twice? Investigating Reproducibility and Double Descent from the Decision Boundary Perspective
CVPR 2022, Oral
Understanding the reproducibility of various architectures from the decision boundary perspective. We also examine how decision boundaries change as model capacity increases in the double descent regime.
Gowthami Somepalli, Liam Fowl, Arpit Bansal, Ping-yeh Chiang, Yehuda Dar, Richard Baraniuk, Micah Goldblum, Tom Goldstein.
|
|
SAINT: Improved Neural Networks for Tabular Data via Row Attention and Contrastive Pre-Training
NeurIPS 2022, TRLW
Improving predictions on structured tabular data using intersample attention and contrastive learning.
|
|
Emergent communication via mid-level patches in a referential game played on a large-scale image dataset.
Kamal Gupta, Gowthami Somepalli, Anubhav Gupta, Vinoj Jayasundara, Matthias Zwicker, Abhinav Shrivastava
|
|
Unsupervised Anomaly Detection with Adversarial Mirrored AutoEncoders
UAI 2021, Long Presentation
We propose a mirrored Wasserstein loss along with latent space regularization to recognize anomalies when no anomalies, or only a few, are present at training time.
Gowthami Somepalli, Yexin Wu, Yogesh Balaji, Bhanukiran Vinzamuri, Soheil Feizi.
|
|
Prioritizing and characterizing functionally relevant genes across human tissues
PLOS Computational Biology 2021
In this work, we try to understand and predict which genes are important in any given tissue. We show that gene expression is not the only important factor; the location of the gene in the protein-protein interaction (PPI) network also plays a role.
Gowthami Somepalli, Sarthak Sahoo, Arashdeep Singh, Sridhar Hannenhalli.
|
|
What Doesn't Kill You Makes You Robust(er): Adversarial Training against Poisons and Backdoors
Preprint 2021
In this paper, we desensitize networks to the effects of poisoning by creating poisons during training and injecting them into training batches.
|