Hello! I’m Chen Ju (鞠陈).

I’m a final-year PhD candidate in the MediaBrain Group at Shanghai Jiao Tong University, advised by Prof. Yanfeng Wang (Assistant Director of Shanghai AI Laboratory) and Prof. Ya Zhang (National Ten Thousand Talents Program). I also collaborate with Prof. Weidi Xie, Prof. Siheng Chen, Prof. Yu Wang and Prof. Jiangchao Yao. Before that, I obtained a Bachelor’s degree in Engineering from the University of Electronic Science and Technology of China, where I studied under Prof. Yong Liu (National Distinguished Young Scholar & Changjiang Scholar) and was named an Outstanding Graduate.

Currently, I collaborate closely with outstanding researchers from TAO Technology (Pailitao, 拍立淘), Alibaba: Dr. Weilin Huang, Dr. Shuai Xiao, Dr. Xu Chen, and Dr. Zhonghua Zhai. Our goal is to develop large-scale visual search systems for various e-commerce applications, such as super-large-scale multi-modal learning (10-billion-scale image-text product data) and AIGC (GPT & Diffusion).

Previously, I worked with outstanding researchers from WeChat Technology (WeChat Technical Architecture), Tencent: Dr. Fengyun Rao, Dr. Yizhou Zhou, Dr. Guangting Wang and Dr. Yukun Su, developing Chinese pre-trained models for image-text-video-music, namely WeMM, WeCLIP, and WeMU.

Earlier, I collaborated with outstanding researchers from PanGu Large Model (盘古大模型), Huawei: Prof. Qi Tian, Dr. Lingxi Xie, Dr. Xiaopeng Zhang, Dr. Jianlong Chang, Dr. Jiemin Fang, and Dr. Peisen Zhao, to explore MLLMs for B-side (business-facing) industrial scenarios.

I now lead a small group that mainly works on Efficient Data Governance (Cleaner; Organizer; Compressor; Distiller; Synthesizer; Evolver). We are actively recruiting research interns and engineering interns, so please feel free to contact me!

Email: ju_chen[at]sjtu[dot]edu[dot]cn / cju[dot]void[at]gmail[dot]com | Google Scholar: Citations 620+, H-index 10, i10-index 9

🔥 News

💻 Research

My primary research interests lie in

  • Vision-Language-Music Learning: Multi-Modal Pre-training, Efficient Adaptation, Accelerated Deployment for Downstream Tasks.

  • Data Governance & Mining: Data Cleaning & Compression & Synthesis, Cross-Modal Retrieval & Recommendation & Advertising.

  • AIGC: Generation or Editing for Image & Video & Music, Conversation-Driven Multi-Modal Understanding and Composition.

  • Video Understanding: Retrieval & Captioning & Summarization for Video Clips, Detection & Classification for Untrimmed Long Videos.

I am a young researcher, so your interest and kind citations mean a lot to me and my collaborators. Please also feel free to email me with any suggestions or potential collaborations.

📝 Publications

📒 Topic: Efficiently Adapt Multi-modal Foundation Models to Unify/Generalize Downstream Tasks

  1. Prompting Visual-Language Models for Efficient Video Understanding | [Project] | [Code & Data] | [Report] | [Bibtex]
    Chen Ju, Tengda Han, Kunhao Zheng, Ya Zhang and Weidi Xie
    ECCV 2022

  2. Collaborating Vision-Language Pre-training with Weakly-Supervised Video Understanding | [Project & Code] | [Bibtex]
    Chen Ju, Kunhao Zheng, Jinxiang Liu, Peisen Zhao, Ya Zhang, Jianlong Chang, Qi Tian and Yanfeng Wang
    CVPR 2023

  3. Turbo: Informativity-Driven Acceleration Plugin for Vision-Language Foundation Models | [Bibtex]
    Chen Ju, Haicheng Wang, Zeqian Li, Xu Chen, Zhonghua Zhai, Weilin Huang and Shuai Xiao
    arXiv preprint 2023

📒 Topic: Vision-Language-Audio Pre-training & Inference with Strong Generalization but Low Costs

  1. Transformation Invariance and Equivariance for Self-supervised Sound Localization | [Project & Demo] | [Code] | [Bibtex]
    Jinxiang Liu, Chen Ju, Weidi Xie and Ya Zhang
    ACM Multimedia 2022

  2. Audio-Visual Segmentation via Unlabeled Frames Exploitation | [Bibtex]
    Jinxiang Liu, Yikun Liu, Fei Zhang, Chen Ju, Yanfeng Wang and Ya Zhang
    CVPR 2024

  3. Contrast and Unity for Partially-Supervised Temporal Sentence Grounding | [Project & Code] | [Bibtex]
    Chen Ju, Haicheng Wang, Jinxiang Liu, Chaofan Ma, Ya Zhang
    arXiv preprint 2023

  4. SAM Guided Annotation-free Audio-Visual Cross-modal Segmentation | [Project & Code] | [Bibtex]
    Jinxiang Liu, Yu Wang, Chen Ju, Chaofan Ma, Ya Zhang, Weidi Xie
    WACV 2024

📒 Topic: Understanding the World through Open-Vocabulary Learning, and Rethinking Its Limitations

  1. Multi-modal GPT Prompts for Open-Vocabulary Video Understanding | [Project & Code] | [Bibtex]
    Chen Ju, Zeqian Li, Peisen Zhao, Ya Zhang, Xiaopeng Zhang, Qi Tian, Yanfeng Wang and Weidi Xie
    Springer IJCV

  2. Open-Vocabulary Semantic Segmentation via Attribute Decomposition-Aggregation | [Bibtex]
    Chaofan Ma, Yuhuan Yang, Chen Ju, Fei Zhang, Ya Zhang and Yanfeng Wang
    NeurIPS 2023

  3. Multi-Modal Prototypes for Open-Set Semantic Segmentation | [Bibtex]
    Yuhuan Yang, Chaofan Ma, Chen Ju, Ya Zhang and Yanfeng Wang
    arXiv preprint 2023

  4. DENOISER: Rethinking the Robustness for Open-Vocabulary Action Recognition | [Bibtex]
    Haozhe Cheng, Chen Ju, Haicheng Wang, Jinxiang Liu, Mengting Chen, Qiang Hu, Xiaoyun Zhang and Yanfeng Wang
    arXiv preprint 2024

📒 Topic: Innovative AIGC Creativity, Free Vision-Text-Audio Editing and Composition

  1. DiffusionSeg: Adapting Diffusion Towards Unsupervised Object Discovery | [Bibtex]
    Chaofan Ma, Yuhuan Yang, Chen Ju, Fei Zhang, Jinxiang Liu, Yu Wang, Ya Zhang and Yanfeng Wang
    ICCV 2023

  2. Wear-Any-Way: Manipulable Virtual Try-on via Sparse Correspondence Alignment | [Project] | [Bibtex]
    Mengting Chen, Xi Chen, Zhonghua Zhai, Chen Ju, Xuewen Hong, Jinsong Lan and Shuai Xiao
    arXiv preprint 2024

📒 Topic: Frozen Pre-trained Models for Downstream Video Understanding with Limited Annotation & Supervision

  1. Divide and Conquer for Single-frame Temporal Action Localization | [Project & Demo] | [Bibtex]
    Chen Ju, Peisen Zhao, Siheng Chen, Ya Zhang, Yanfeng Wang and Qi Tian
    ICCV 2021

  2. Bottom-Up Temporal Action Localization with Mutual Regularization | [Demo] | [Code] | [Bibtex]
    Peisen Zhao, Lingxi Xie, Chen Ju, Ya Zhang, Yanfeng Wang and Qi Tian
    ECCV 2020

  3. Adaptive Mutual Supervision for Weakly-Supervised Temporal Action Localization | [Project & Demo] | [Bibtex]
    Chen Ju, Peisen Zhao, Siheng Chen, Ya Zhang, Xiaoyun Zhang and Qi Tian
    IEEE Transactions on Multimedia

  4. Audio-Aware Query-Enhanced Transformer for Audio-Visual Segmentation | [Project & Code] | [Bibtex]
    Jinxiang Liu, Chen Ju, Chaofan Ma, Yanfeng Wang, Yu Wang, Ya Zhang
    arXiv preprint 2023

📒 Topic: MLLM-Guided Multi-Modal Information Retrieval & Sorting & Recall & Representation

  1. Enhancing Cross-domain Click-Through Rate Prediction via Explicit Feature Augmentation | [Bibtex]
    Xu Chen, Zida Cheng, Jiangchao Yao, Chen Ju, Weilin Huang, Xiaoyi Zeng and Shuai Xiao
    WWW 2024

  2. Image to Multi-Modal Retrieval Learning for Industrial Scenarios | [Bibtex]
    Zida Cheng, Chen Ju, Xu Chen, Zhonghua Zhai, Shuai Xiao and Junchi Yan
    arXiv preprint 2023

  3. Cell Variational Information Bottleneck Network
    Zhonghua Zhai, Chen Ju, Shuai Xiao, Jinsong Lan and Xiaoyi Zeng
    arXiv preprint 2023

🗞️ Academics and Communications

  • PC Member & Conference Reviewer: ECCV 2024/2022, CVPR 2024/2023, AAAI 2024/2023, ICCV 2023, ACM MM 2024/2023, WACV 2024
  • Journal Reviewer: IEEE T-PAMI, Springer IJCV, IEEE T-MM, IEEE TCSVT, NPL

  • I am fortunate to have met many interesting people and teams:
  1. University System.   UESTC: Yong Liu, Yadong Jiang.   PKU: Hong Liu, Jin Luo, Donglin Liu, Yong Peng.   THU: Shousheng Han, Zhengsong Wang, Zongren Dai.   SJTU: Haicheng Wang, Jinxiang Liu, Yue Hu, Chenxin Xu, Chaoqin Huang, Xiaoman Zhang, Xuehui Wang, Jiazhong Ceng, Chen Yang.   USTC: Jiaqing Gao, Yumin Xia, Qi Meng.   Oxford: Tengda Han, Charig Yang.   KU Leuven: Haien Tang, Chunzhuo Wang, Liting Yang.   NUS: Jialin Gao.   OpenGVLab: Xue Yang.   Ruijin: Qinwei Xu.

  2. Alibaba.   TAO Technology: Zida Cheng, Mengting Chen, Xuewen Hong, Yixuan Huang, Lianyu Du.   DAMO Academy: Chang Zhou, Xi Chen, Mosha Chen.   Alimama: Jiajie Wang, Hao Wu, Yuanzhe Gu.   T-head: Yu Fu, He Guo.   AntGroup: Tong Zhan, Qingpei Guo, Yifei Hu, Ming Yang, Jingdong Chen.

  3. Huawei.   Cloud BU: Yucheng Liu, Yaoming Wang, Shuangrui Ding, Haohang Xu.   Car BU: Maosen Li.   Consumer BG: Yongli Jia, Feilong Chen, Chenliang Hu.   ICT: Liang Zhao, Tongda Li.   2012: Yu Zhou, Guohao Gong.

  4. Baidu.   Big Search: Zhengyang Li, Suqi Chen.   Ernie Bot: Tian Wu, Jiachen Liu.   Phoenix Nest: Chenyang Li.

  5. Tencent.   WXG: Xiaoyi Jia, Honghui Lin, Yongsheng Luo, Tianyi Wang, Zhenghua Liu, Dr. Hongwei Xue, Dr. Dacheng Yin.   CDG: Tianyue Cao.   TEG: Hongfa Wang, Wei Liu.

  6. Software Companies.   Meta: Kunhao Zheng.   DiDi: Zhe Xu.   ByteDance: Yichao Xiong, Zhikang Li, Kunyuan Du, Xuan Liao, Yuxuan Jiang, Shiqi Peng, Hangtian Zhao, Jian Li.   Bilibili: Luochen Lv.   ZTE: Xiao Hu.   KuaiShou: Liwei Chen, Kun Xu.   MeiTuan: Yujie Zhong, Yexun Zhang.

  7. Hardware Companies.   NVIDIA: Jie Chang, Yangheng Zhao.   Intel: Yujie Pan, Yingying Xue.   Hikvision: Tengfei Hou, Wanshun Gao.   OPPO: Bo Wang, Chen Chen, Haonan Lu.   Honor: Yuanchao Du.

📄 Patents

📖 Education

  • 2018 - 2024, PhD candidate, Shanghai Jiao Tong University, Shanghai, China
  • 2018, Exchange Student, University of Amsterdam, Netherlands
  • 2018, Exchange Student, KU Leuven, Belgium
  • 2014 - 2018, Undergraduate, University of Electronic Science and Technology of China, Chengdu, China

🎖 Honors and Awards

  • [2024] Top Talent Program by Technology Companies
  • [2023] First Prize of Shanghai Technology Invention Award
  • [2022] CMIC Outstanding Scholarship at SJTU (Top 1%)
  • [2021] CMIC Outstanding Scholarship at SJTU (Top 1%)
  • [2020] CMIC Outstanding Scholarship at SJTU (Top 1%)
  • [2018] Outstanding Graduate of Sichuan Province (Top 1%)
  • [2018] Outstanding Graduate of UESTC (Top 1%)
  • [2017] First Prize in the National Undergraduate Mathematical Modeling Contest
  • [2017] Undergraduate National Scholarship at UESTC (Top 1%)
  • [2016] Undergraduate National Scholarship at UESTC (Top 1%)
  • [2015] Undergraduate National Scholarship at UESTC (Top 1%)