I’m Sean Kirmani, an AI researcher.
I am currently a research scientist at Google DeepMind working on problems in vision, language, and action.
I was previously an early ML engineer at Everyday Robots, an Alphabet company that graduated from Google[x]. I worked on vision-language models, few-shot learning, camera detectors, lidar detectors, and 3D object trackers.
I also did a stint in augmented reality where I worked on image-based lighting with the Project Tango team at Google VR/AR.
I graduated from The University of Texas at Austin with degrees in Electrical Engineering and Computer Science. I spent four years as an AI research assistant with various professors in the Robotics Lab at UT Austin.
I enjoy working on all things AI, with a bent towards perception and language. I’m most interested in generative models for vision and text. Most of all, I’m interested in using AI to improve society in practical ways. A summary of my technical interests is below:
I was born in Los Angeles and currently live in San Francisco. When I'm not in front of a computer, I'm often outdoors, exploring, or behind a lens.
My résumé is also available.