Jianjie Luo

News

NEW! (Always) I am looking for self-motivated undergraduate and graduate students. Feel free to contact me for research guidance or supervision.

NEW! (Mar. 2026) One paper accepted to ICME 2026.

NEW! (Nov. 2025) One paper accepted to MMM 2026.

NEW! (Aug. 2025) One paper accepted to ACM Multimedia 2025.

NEW! (Jul. 2025) One paper accepted to ECAI 2025.

Bio

I am currently a Lecturer in the School of Computer Science and Technology at Guangdong University of Technology (GDUT). Before that, I received my Ph.D. in computer science in 2024 from the joint doctoral program between Sun Yat-sen University and JD.com, supervised by Prof. Hongyang Chao, Prof. Jianlin Feng, and Dr. Tao Mei. During my doctoral studies, I had the privilege of being mentored by Dr. Ting Yao, and I collaborated closely with Dr. Yingwei Pan, Dr. Yehao Li, and Dr. Jingwen Chen on various research projects.

Research Interests

Computer Vision
Multimodal Learning
Multimedia Analysis

Education

08/2019 - 12/2024, Sun Yat-sen University (SYSU)

Ph.D. in Computer Science and Technology
Joint Ph.D. Program with JD.com
Thesis topic: Describing Multimedia with Semantic Alignment

08/2015 - 07/2019, Sun Yat-sen University (SYSU)

B.Eng. in Software Engineering
Recipient of the National Scholarship Award and the Outstanding Undergraduate Award

Experience

01/2025 - Present, Guangdong University of Technology (GDUT), Guangzhou (Canton)

Lecturer @ School of Computer Science and Technology

03/2024 - 09/2024, HiDream.ai Inc., Beijing

Research Assistant
Mentor: Ting Yao

07/2020 - 05/2023, Computer Vision and Multimedia Lab at JD Explore Academy, Beijing

Research Intern (Star Intern Award)
Mentor: Ting Yao

07/2018 - 08/2019, Computer Vision and Multimedia Lab at JD AI Research, Beijing

Research Intern (Star Intern Award)
Mentor: Ting Yao

03/2018 - 06/2018, Pixtalks Technology, Guangzhou (Canton)

Research Intern
Mentor: Shengyong Ding

Selected Publications [Google Scholar]

(✉ refers to Corresponding Author)

2026

Boosting Knowledge-based Visual Question Answering with Structured Context Reasoning

Qiyou Liu, Yong Zhang, Jianjie Luo✉, Zhenguo Yang, Yi Yu

In ICME, 2026.

PDF

2025

Improving Identity Preservation in Video Generation with Multi-Branch Models

Jiahao Xu, Jianjie Luo✉, Zhenguo Yang

In ACM Multimedia, 2025.

PDF

Multi-Perspective Frequency Domain Learning for Generalizable AI-Generated Image Detection

Zili Xu, Jianjie Luo✉, Fuqiang Yu, Zhenguo Yang

In ECAI, 2025.

PDF

Exploring Vision-Language Foundation Model for Novel Object Captioning

Jianjie Luo, Yehao Li, Yingwei Pan, Ting Yao, Jianlin Feng, Hongyang Chao, Tao Mei

In IEEE Transactions on Circuits and Systems for Video Technology, 2025.

PDF

2024

Unleashing Text-to-Image Diffusion Prior for Zero-Shot Image Captioning

Jianjie Luo, Jingwen Chen, Yehao Li, Yingwei Pan, Jianlin Feng, Hongyang Chao, Ting Yao

In ECCV, 2024.

PDF Website

2023

Semantic-Conditional Diffusion Networks for Image Captioning

Jianjie Luo, Yehao Li, Yingwei Pan, Ting Yao, Jianlin Feng, Hongyang Chao, Tao Mei

In CVPR, 2023.

PDF Code

Boosting Vision-and-Language Navigation with Direction Guiding and Backtracing

Jingwen Chen, Jianjie Luo, Yingwei Pan, Yehao Li, Ting Yao, Hongyang Chao, Tao Mei

In ACM Transactions on Multimedia Computing, Communications, and Applications, 2023.

PDF

2022

Auto-captions on GIF: A Large-scale Video-sentence Dataset for Vision-language Pre-training

Yingwei Pan, Yehao Li, Jianjie Luo, Jun Xu, Ting Yao, Tao Mei

In ACM Multimedia, 2022.