What I'm Up To
News
- 2025-01 One paper accepted by ICLR 2025, see you in Singapore!
- 2024-12 One paper accepted by AAAI 2025.
- 2024-11 Invited talk at the Safe SuperIntelligence (SSI) Club.
- 2024-04 One paper accepted by IJCAI 2024, see you in Jeju!
- 2023-07 Two papers accepted by ICCV 2023.
- 2023-06 Taming Diffusion Models for Music-driven Conducting Motion
Generation accepted by AAAI 2023 Summer Symposium, with Best Paper
Award.
- 2023-02 Translating natural language to planning goals with
large-language models now on arXiv.
- 2022-06 LaT: Latent Translation with Cycle-Consistency for Video-Text
Retrieval now on arXiv.
Professional Services
- Reviewer: ECCV 2022, ACCV 2022, CVPR 2023, ICCV 2023, ACM MM 2023, EMNLP 2023, ICASSP 2024, CVPR 2024, ICPR 2024, Artificial Intelligence Review, ECCV 2024, ACM MM 2024, COLM 2024, NeurIPS 2024, ICASSP 2025, ICLR 2025, AISTATS 2025, CVPR 2025, ICML 2025, IJCNN 2025, COLM 2025
- Program Committee Member: AAAI 2023, AAAI 2024, AAAI 2025
Selected Publications
HumanEdit: A High-Quality Human-Rewarded Dataset for Instruction-based Image Editing
Jinbin Bai*, Wei Chow*, Ling Yang, Xiangtai Li, Juncheng Li, Hanwang Zhang, Shuicheng Yan
Technical Report, 2024
[Paper]
[Page]
[Dataset]
[Code]
HumanEdit is a high-quality, human-rewarded dataset designed specifically for benchmarking instruction-guided image editing.
🔥Meissonic: Revitalizing Masked Generative Transformers for Efficient High-Resolution Text-to-Image Synthesis
Jinbin Bai*, Tian Ye*, Wei Chow, Enxin Song, Xiangtai Li, Zhen Dong, Lei Zhu, Shuicheng Yan
ICLR 2025
[Paper]
[Model]
[Code]
[Demo]
[Discord_Discussion]
[Tutorial_EN]
[Tutorial_JA]
[Media_Report_CN]
Meissonic is a non-autoregressive text-to-image synthesis model based on masked image modeling that can generate high-resolution images. It is designed to run on consumer graphics cards. The figure on the left was generated by Meissonic.
ViewControl: Integrating View Conditions for Image
Synthesis
Jinbin Bai, Zhen Dong, Aosong Feng, Xiao Zhang, Tian Ye, Kaicheng Zhou
IJCAI 2024
[Paper]
[Code]
This paper presents a novel framework that enhances existing
models with awareness of viewpoint information, thereby
enabling improved control over text-to-image diffusion
models.
Taming diffusion models for music-driven conducting motion generation
Zhuoran Zhao*, Jinbin Bai*, Delong Chen, Debang Wang, Yubo Pan
AAAI 2023 Summer Symposium, Best Paper Award
[Paper]
[Code]
Adverse Weather Removal with Codebook Priors
Tian Ye*, Sixiang Chen*, Jinbin Bai*, Jun Shi, Chenghao Xue, Jingxia Jiang, Junjie Yin, Erkang Chen, Yun Liu
ICCV 2023
[Paper]
LaT: Latent Translation with Cycle-Consistency for
Video-Text Retrieval
Jinbin Bai, Chunhui Liu, Feiyue Ni, Haofan Wang,
Mengying Hu, Xiaofeng Guo, Lele Cheng
Technical Report, 2022
[Paper]
A novel latent translation framework for solving the
modality gap problem. With this framework, we can align two
modalities with only a translation network, without
fine-tuning the encoder.
Semantic-aware Cartoon Style Transfer
Jinbin Bai
Technical Report, 2021
[Paper]
A new semantic-aware framework for cartoon style transfer that
matches corresponding semantic regions.
Miscellaneous
- I am a huge fan of Cities: Skylines and I love designing and simulating cities. I can't
wait for the release of Cities: Skylines II on Oct 24th, 2023! I also attended the World Cities Summit (WCS) 2024 conference!
- My favorite movie in recent years is Free Guy, and I dream of designing a game like
it.
- In my leisure moments, I delight in playing the piano surrounded by abundant greenery, finding peace and emotional comfort in the solitude.