Junke Wang 「王君可」
I'm a fourth-year Ph.D. student in the School of Computer Science at Fudan University, supervised by Prof. Zuxuan Wu and Prof. Yu-Gang Jiang. I was very fortunate to be mentored by Dongdong Chen and Yi Jiang.
My research interests lie in computer vision and deep learning, with an emphasis on multimodal understanding and generation. I developed the Omni series of models, including OmniTokenizer (one codebook for joint image-video tokenization), OmniVid (a generative framework for general video understanding), OmniTracker (a unified tracking model), and OmniVL (an image-video-language foundation model).
I'm now working on training world models, which can simulate real-world environments and interact with embodied agents.
Email: wangjk21 [at] m.fudan.edu.cn
Google Scholar / GitHub
|
Publications
|
(* denotes equal contribution)
OmniTokenizer: A Joint Image-Video Tokenizer for Visual Generation.
[Code]
Junke Wang, Yi Jiang, Zehuan Yuan, Binyue Peng, Zuxuan Wu, Yu-Gang Jiang.
NeurIPS, 2024.
|
OmniTracker: Unifying Object Tracking by Tracking-with-Detection.
Junke Wang, Dongdong Chen, Zuxuan Wu, Chong Luo, Xiyang Dai, Lu Yuan, Yu-Gang Jiang.
TPAMI, 2025.
|
OmniVid: A Generative Framework for Universal Video Understanding.
[Code]
Junke Wang, Dongdong Chen, Chong Luo, Bo He, Lu Yuan, Zuxuan Wu, Yu-Gang Jiang.
CVPR, 2024.
|
Look Before You Match: Instance Understanding Matters in Video Object Segmentation.
Junke Wang, Dongdong Chen, Zuxuan Wu, Chong Luo, Chuanxin Tang, Xiyang Dai, Yucheng Zhao, Yujia Xie, Lu Yuan, Yu-Gang Jiang.
CVPR, 2023.
|
OmniVL: One Foundation Model for Image-Language and Video-Language Tasks.
Junke Wang, Dongdong Chen, Zuxuan Wu, Chong Luo, Luowei Zhou, Yucheng Zhao, Yujia Xie, Ce Liu, Yu-Gang Jiang, Lu Yuan.
NeurIPS, 2022.
|
Efficient Video Transformers with Spatial-Temporal Token Selection.
[Code]
Junke Wang*, Xitong Yang*, Hengduo Li, Zuxuan Wu, Yu-Gang Jiang.
ECCV, 2022.
|
M2TR: Multi-modal Multi-scale Transformer for Deepfake Detection.
[Code]
Junke Wang, Zuxuan Wu, Wenhao Ouyang, Xintong Han, Jingjing Chen, Ser-Nam Lim, Yu-Gang Jiang.
ICMR, 2022.
|
ObjectFormer for Image Manipulation Detection and Localization.
[Code]
Junke Wang, Zuxuan Wu, Jingjing Chen, Xintong Han, Abhinav Shrivastava, Yu-Gang Jiang, Ser-Nam Lim.
CVPR, 2022.
|
FT-TDR: Frequency-guided Transformer and Top-Down Refinement Network for Blind Face Inpainting.
Junke Wang, Shaoxiang Chen, Zuxuan Wu, Yu-Gang Jiang.
TMM, 2022.
|
Depth Guided Adaptive Meta-Fusion Network for Few-shot Video Recognition.
Yuqian Fu, Li Zhang, Junke Wang, Yanwei Fu, Yu-Gang Jiang.
ACM MM, 2020.
|
Projects
|
To See is to Believe: Prompting GPT-4V for Better Visual Instruction Tuning.
[Dataset]
[Project page]
Junke Wang*, Lingchen Meng*, Zejia Weng, Bo He, Zuxuan Wu, Yu-Gang Jiang.
We introduce a fine-grained visual instruction dataset, LVIS-INSTRUCT4V, which contains 220K visually aligned and context-aware instructions produced by prompting the powerful GPT-4V with images from LVIS.
|
ChatVideo: A Tracklet-centric Multimodal and Versatile Video Understanding System.
[Code]
Junke Wang, Dongdong Chen, Chong Luo, Xiyang Dai, Lu Yuan, Zuxuan Wu, Yu-Gang Jiang.
We present our vision for multimodal and versatile video understanding and propose a prototype system, ChatVideo.
|
Preprints
|
SimpleAR: Pushing the Frontier of Autoregressive Visual Generation through Pretraining, SFT, and RL.
[Code]
Junke Wang, Zhi Tian, Xun Wang, Xinyu Zhang, Weilin Huang, Zuxuan Wu, Yu-Gang Jiang.
arXiv, 2025.
|
Perception Encoder: The best visual embeddings are not at the output of the network.
[Code]
PE Team from FAIR, Meta.
arXiv, 2025.
|
Pix2Cap-COCO: Advancing Visual Comprehension via Pixel-Level Captioning.
Zuyao You, Junke Wang, Lingyu Kong, Bo He, Zuxuan Wu.
arXiv, 2025.
|
Fighting Malicious Media Data: A Survey on Tampering Detection and Deepfake Detection.
Junke Wang, Zhenxin Li, Chao Zhang, Jingjing Chen, Zuxuan Wu, Larry S. Davis, Yu-Gang Jiang.
arXiv, 2022.
|
Academic Services
Conference Reviewer for CVPR, ICCV, ICML, NeurIPS, ICLR, ECCV, etc.
Journal Reviewer for TPAMI, TIP, IJCV, etc.
|
Awards
Fundamental Research Program for PhD students, sponsored by NSFC. 2024.
Young Elite Scientists Sponsorship Program for PhD students, sponsored by CAAI. 2024.
Intel Fellowship. 2023.
National Scholarship (Top 1%). 2022.
Outstanding Graduate of Shanghai (undergraduate). 2021.
First-class Scholarship (Top 5%). 2019, 2021.
Uniqlo Scholarship (awarded to 33 undergraduates in China). 2019.