Multimodal AI researcher obsessed with how machines perceive, remember, and generate the world. Based in Mountain View, CA. (Friends call me the "Evals Shill" for a reason.) Currently post-training Adobe's image gen models to push creative boundaries.
PhD from UMD focused on diffusion model memorization. Built evals that actually test video understanding – like CinePile (long-video QA benchmark, Best Paper at the CVPR 2024 SynData4CV workshop) and ARGUS (hallucination/omission eval for dense captions).
Before academia: IIT Madras alum, did SGD in industry in India for a while, and founded a fashion AI startup that was way too early to the party.
Open to collabs on generative modeling (evals + post-training). Hit me up: gowthami [dot] somepalli [at] gmail.com
// featured writing
// papers
// news
- Jul '24 Two papers at NeurIPS'24
- Jul '24 CSD accepted to ECCV'24
- Jun '24 Best paper award at SynData4CV workshop, CVPR'24
- Mar '24 Ann G. Wylie Dissertation Fellowship
- Dec '23 Talk at NeurIPS Diffusion Workshop
- Dec '22 Work covered in TechCrunch