Sean Kirmani

Work Experience

Google DeepMind — Senior Research Scientist [Apr 2023 - Present]

Artificial intelligence research in vision, language, and robotics. Mountain View, California.

Everyday Robots — Research Lead, Semantic Perception [Nov 2021 - Apr 2023]

Everyday Robots spun out of Google[x] in November 2021. Mountain View, California.

  • Vision-language model lead at Everyday Robots. Introduced and deployed the first vision-language model (CLIP) in production for robot visual question answering (VQA). Scaled diffusion models to create synthetic data for CLIP. Landed an on-robot open-vocabulary object detector to detect novel objects.
  • Designed and built a multi-sensor (camera and lidar), open-vocabulary panoptic segmentation model.
  • Full-stack ML engineer with end-to-end ownership of the entire ML flywheel, from data collection to inference. Built the model automation pipeline covering data collection, training, evaluation, and on-robot deployment for all perception models.

Google[x] — Senior Research Engineer, The Everyday Robot Project [Jul 2018 – Nov 2021]

Early computer vision engineer at The Everyday Robot Project. Mountain View, California.

  • Early engineer on the perception team. Expert in bringing research to production in real-world systems.
  • Created the lidar panoptic segmentation model and RGB-D camera panoptic segmentation model (with associated automation flywheel) and deployed to robot fleet.
  • Trained multimodal vision and action models, resulting in publication at ICRA.
  • Filed 5 patents and published 1 paper.
  • Built the first 3D object tracker.

Google[x] — Research Engineering Intern, The Everyday Robot Project [May 2017 – Aug 2017]

Worked on perception for human-robot interaction. Mountain View, California.

Google — Software Engineering Intern, Project Tango [May 2016 – Aug 2016]

Worked on experimental augmented reality. Created an environmental lighting system that enables more photorealistic lighting and reflections in augmented reality for the Tango SDK. Published a Google Developer Blog post with a usage tutorial. Also experimented with video stabilization. Experience in computer vision, computer graphics, and computational photography. Worked with C++, Unity, and Java. Mountain View, California.

Google — Software Engineering Intern, Chrome for Android [May 2015 – Aug 2015]

Worked on tools and infrastructure for Chrome for Android. Wrote test infrastructure for sign-in authentication tests and created a parameterizable testing framework. All code is open source as part of Chromium. Worked with Java, Python, and C++. Mountain View, California.

Accordion Health — Software Engineer [Aug 2014 – Jan 2015]

Used machine learning for health care data analytics. Clustered co-morbidity for several sets of patients. Experience in data visualization. Worked with R, Python, and D3.js. Austin, Texas.

Internet Marketing Inc. — Web Developer Intern [Jun 2013 – Aug 2013]

Set up Unix servers and configured SQL databases. Developed over 20 websites over the summer. Managed and maintained cloud servers. Worked with HTML, CSS, PHP, JavaScript, and jQuery. Las Vegas, Nevada.

Education

The University of Texas at Austin
Bachelor of Science, Computer Science
Bachelor of Science, Electrical Engineering

Honors Thesis: Deep Reinforcement Learning for Aerial Obstacle Avoidance using Monocular RGB Images

Research Interests

  • Computer Vision
  • Natural Language Processing
  • Robot Learning
  • Reinforcement Learning
  • Deep Learning
  • Artificial Intelligence

Workshops

Building Physically Plausible World Models

Co-organizer. ICML 2025

Volunteering

Gatorbotics [Nov 2018 - Jan 2021]

Mentor for FIRST Robotics Competition Team 1700. Palo Alto, California.
