Sean Kirmani


Work Experience

Google DeepMind — Senior Researcher [Apr 2023 – Present]

Machine learning research in vision, language, and robotics. Mountain View, California.

Everyday Robots — Technical Lead, Semantic Perception [Nov 2021 – Apr 2023]

Everyday Robots spun out of X in November 2021. Mountain View, California.

  • Vision-language model lead at Everyday Robots. Introduced and deployed the first vision-language model (CLIP) in production for robot visual question answering (VQA). Scaled diffusion models to create synthetic data for CLIP. Landed an on-robot open-vocabulary object detector to detect novel objects.
  • Designed and built a multi-sensor (camera and lidar), open-vocabulary panoptic segmentation model.
  • Full-stack ML engineer: end-to-end ownership of entire ML flywheel from data collection to inference. Built model automation pipeline for data collection, training, evaluation, and on-robot deployment for all perception models.

X — Senior Machine Learning Engineer, The Everyday Robot Project [Jul 2018 – Nov 2021]

Early ML engineer at The Everyday Robot Project. Mountain View, California.

  • Early engineer on the perception team. Expert in bringing research to production in real-world systems.
  • Created the lidar panoptic segmentation model and RGB-D camera panoptic segmentation model (with associated automation flywheel) and deployed to robot fleet.
  • Trained multimodal vision and action models, resulting in publication at ICRA.
  • Filed 5 patents and published 1 paper.
  • Built the first 3D object tracker.

X — Software Engineering Intern, The Everyday Robot Project [May 2017 – Aug 2017]

Worked on perception for human-robot interaction. Mountain View, California.

Google — Software Engineering Intern, Project Tango [May 2016 – Aug 2016]

Worked on experimental augmented reality. Created environmental lighting system allowing more photorealistic lighting and reflections in augmented reality for Tango SDK. Published Google Developer Blog post with tutorial for usage. Also experimented with video stabilization. Experience in computer vision, computer graphics, and computational photography. Worked with C++, Unity, and Java. Mountain View, California.

Google — Software Engineering Intern, Chrome for Android [May 2015 – Aug 2015]

Served as an intern on tools and infrastructure for Chrome for Android. Wrote test infrastructure for a sign-in authentication test and created a parametrizable testing framework. All my code is open source as part of Chromium. Worked with Java, Python, and C++. Mountain View, California.

Accordion Health — Software Engineer [Aug 2014 – Jan 2015]

Used machine learning for health care data analytics. Clustered co-morbidity for several sets of patients. Experience in data visualization. Worked with R, Python, and D3.js. Austin, Texas.

Internet Marketing Inc. — Web Developer Intern [Jun 2013 – Aug 2013]

Set up Unix servers and configured SQL databases. Developed over 20 websites over the summer. Managed and maintained cloud servers. Worked with HTML, CSS, PHP, JavaScript, and jQuery. Las Vegas, Nevada.


Publications

Language to Rewards for Robotic Skill Synthesis
Wenhao Yu, Nimrod Gileadi, Chuyuan Fu, Sean Kirmani, Kuang-Huei Lee, Montse Gonzalez Arenas, Hao-Tien Lewis Chiang, Tom Erez, Leonard Hasenclever, Jan Humplik, Brian Ichter, Ted Xiao, Peng Xu, Andy Zeng, Tingnan Zhang, Nicolas Heess, Dorsa Sadigh, Jie Tan, Yuval Tassa, Fei Xia

Open-World Object Manipulation using Pre-Trained Vision-Language Models
Austin Stone, Ted Xiao, Yao Lu, Keerthana Gopalakrishnan, Kuang-Huei Lee, Quan Vuong, Paul Wohlhart, Sean Kirmani, Brianna Zitkovich, Fei Xia, Chelsea Finn, Karol Hausman

Deep RL at Scale: Sorting Waste in Office Buildings with a Fleet of Mobile Manipulators
Alexander Herzog, Kanishka Rao, Karol Hausman, Yao Lu, Paul Wohlhart, Mengyuan Yan, Jessica Lin, Montserrat Gonzalez Arenas, Ted Xiao, Daniel Kappler, Daniel Ho, Jarek Rettinghouse, Yevgen Chebotar, Kuang-Huei Lee, Keerthana Gopalakrishnan, Ryan Julian, Adrian Li, Chuyuan Kelly Fu, Bob Wei, Sangeetha Ramesh, Khem Holden, Kim Kleiven, David Rendleman, Sean Kirmani, Jeff Bingham, Jon Weisz, Ying Xu, Wenlong Lu, Matthew Bennice, Cody Fong, David Do, Jessica Lam, Yunfei Bai, Benjie Holson, Michael Quinlan, Noah Brown, Mrinal Kalakrishnan, Julian Ibarz, Peter Pastor, Sergey Levine

In the proceedings of Robotics: Science and Systems (RSS), 2023.

Practical Imitation Learning in the Real World via Task Consistency Loss
Mohi Khansari, Daniel Ho, Yuqing Du, Armando Fuentes, Matthew Bennice, Nicolas Sievers, Sean Kirmani, Yunfei Bai, Eric Jang

In the proceedings of International Conference on Robotics and Automation (ICRA), 2023.

PRISM: Pose Registration for Integrated Semantic Mapping
Justin Hart, Rishi Shah, Sean Kirmani, Nick Walker, Kathryn Baldauf, Nathan John, Peter Stone

In the proceedings of IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2018.

Passive Demonstrations of Light-Based Robot Signals for Improved Human Interpretability
Rolando Fernandez, Nathan John, Sean Kirmani, Justin Hart, Jivko Sinapov, Peter Stone

In the proceedings of IEEE International Conference on Robot and Human Interactive Communication (RO-MAN), 2018.


Education

The University of Texas at Austin
Bachelor of Science, Computer Science
Bachelor of Science, Electrical Engineering

Honors Thesis: Deep Reinforcement Learning for Aerial Obstacle Avoidance using Monocular RGB Images

Selected Coursework:
  • Robot Learning (CS 395T)
  • Human Robot Interaction (EE 382V)
  • Artificial Intelligence (CS 343)
  • Neural Networks (CS 342)
  • Computer Vision (CS 378H)
  • Computer Graphics (CS 354)
  • Physical Simulation (CS 395T)
  • Operating Systems (CS 439)
  • Signal Processing (EE 313)
  • Computer Architecture (EE 460N)
  • Embedded Systems (EE 445L)

University Research

Robotics Lab at UT Austin, Dr. Peter Stone/Dr. Justin Hart [Dec 2017 – Apr 2018]

Worked on semantic mapping and social navigation for non-anthropomorphic robots with the Building-Wide Intelligence (BWI) project. Austin, Texas.

Robotics Lab at UT Austin, Dr. Andrea Thomaz/Dr. Scott Niekum [Jan 2016 – May 2017]

Research in human-robot interaction in the Personal Autonomous Robotics Lab (PeARL) and Socially Intelligent Machines (SiM) Lab. Experience in behavior architectures, perception, manipulation, and machine learning. Austin, Texas.

Wireless Networking & Communication Group, Dr. Joydeep Ghosh [Aug 2014 – Jan 2016]

Selected by Professor Joydeep Ghosh to join the Intelligent Data Exploration and Analysis Laboratory (IDEAL) in the University of Texas Electrical and Computer Engineering department. The lab focuses on machine learning and data mining. Researched making self-driving cars a safe reality using distributed machine learning over wireless mmWave communication, in collaboration with Dr. Robert Heath. Austin, Texas.

Research Interests

  • Computer Vision
  • Natural Language Processing
  • Robot Learning
  • Reinforcement Learning
  • Deep Learning
  • Artificial Intelligence


Gatorbotics [Nov 2018 – Jan 2021]

Mentor for FIRST Robotics Competition team 1700. Palo Alto, California.
