Jinbin Bai

I graduated from the Department of Computer Science at the National University of Singapore (NUS) and am working on image and video synthesis and unified foundation models. More generally, I am interested in interactive content creation and in multimedia processing technologies for computational art and design. My goal is to design algorithms and build tools that make it easier for artists and designers to create cool things.

We are looking for motivated collaborators and support from industry partners. Regardless, feel free to reach out if you have extra H100s😁 (or to collab!)

If you would like to chat about life, travel, career plans, or research ideas, I dedicate at least 30 minutes every week to such informal meetings. I encourage people from underrepresented groups to reach out.

Email  /  Google Scholar  /  GitHub  /  Hugging Face  /  Twitter  /  Discord

Research

An overview of Jinbin's Research

News

  • 2025-04   One paper accepted by CVPR 2025 AI for Content Creation Workshop.
  • 2025-04   One paper accepted by IJCAI 2025.
  • 2025-04   Invited talk at Riot Video Games.
  • 2025-03   Awarded the Frontier Top Ten Young Scholars Award (1st) by Century Frontier Asset Management.
  • 2025-03   Invited talk at the University of Illinois Urbana-Champaign (UIUC).
  • 2025-02   One paper accepted by CVPR 2025.
  • 2025-01   One paper accepted by ICLR 2025, see you in Singapore!
  • 2024-12   One paper accepted by AAAI 2025.
  • 2024-11   Invited talk at the Safe SuperIntelligence (SSI) Club.
  • 2024-04   One paper accepted by IJCAI 2024, see you in Jeju!
  • 2023-08   One paper accepted by BMVC 2023.
  • 2023-07   Two papers accepted by ACM MM 2023.
  • 2023-07   Two papers accepted by ICCV 2023.
  • 2023-06   Taming Diffusion Models for Music-driven Conducting Motion Generation accepted by AAAI 2023 Summer Symposium, with Best Paper Award.
  • 2023-05   One paper accepted by ICIP 2023, see you in Kuala Lumpur!
  • 2023-02   Translating natural language to planning goals with large-language models now on arXiv.
  • 2022-11   One paper accepted by ACCV 2022.
  • 2022-06   LaT: Latent Translation with Cycle-Consistency for Video-Text Retrieval now on arXiv.
  • 2021-03   Awarded as Outstanding Graduate by Nanjing University.
  • 2019-03   Awarded as Outstanding Student by Nanjing University.

Selected Publications

Meissonic: Revitalizing Masked Generative Transformers for Efficient High-Resolution Text-to-Image Synthesis
Jinbin Bai*, Tian Ye*, Wei Chow, Enxin Song, Xiangtai Li, Zhen Dong, Lei Zhu, Shuicheng Yan
ICLR 2025
[Paper] [Model] [Code] [Demo] [Discord_Discussion] [Tutorial_EN] [Tutorial_JA] [Media_Report_CN]
Meissonic is a text-to-image discrete diffusion model that can generate high-resolution images. It is designed to run on consumer graphics cards. The left figure is generated by Meissonic.

Miscellaneous

  • I am a huge fan of Cities: Skylines and I love designing and simulating cities. I can't wait for the release of Cities: Skylines II on Oct 24th, 2023! I have also attended the World Cities Summit (WCS) 2024 conference!
  • My favorite movie in recent years is Free Guy, and I dream of designing a game like it.
  • In my leisure moments, I delight in playing the piano surrounded by abundant greenery, finding peace and emotional comfort in the solitude.
  • I enjoy traveling and have visited 12 countries. Guess where I have been!
  • I like swimming, diving, surfing, and the beach under the sunshine.

Last updated on Nov. 2024