Yongchao Xu | 徐永超

I am currently a Ph.D. student (2025-2028) in Electronic and Information Engineering at the University of Science and Technology of China (USTC), supervised by Prof. Zheng-Jun Zha and Associate Prof. Jiawei Liu. Before that, I received my Master's degree from USTC (2023-2025) and my Bachelor's degree in Automation from Northeastern University (NEU) (2019-2023).

My research interests include Human-Object Interaction (HOI) detection, Multimodal Large Language Models, and Embodied Perception.

Email: yongchaoxu@mail.ustc.edu.cn

Google Scholar  /  GitHub  /  Email

News

  • [2026.02]   🎉🎉  One paper on open-vocabulary HOI detection is accepted by CVPR 2026.
  • [2026.01]   🎉🎉  One paper on active prompt learning is accepted by ICLR 2026.
  • [2025.11]   🎉🎉  One paper on zero-shot HOI detection is accepted by IJCV 2026.
  • [2025.10]   🎉🎉  One paper on active prompt learning is accepted by IJCV 2026.
  • [2025.07]   🎉🎉  One paper is accepted by MM 2025 Workshop (3rd Place).
  • [2025.02]   🎉🎉  One paper on multi-task test-time adaptation is accepted by CVPR 2025.
  • [2024.12]   🎉🎉  One paper on HOI detection is accepted by AAAI 2025.
  • [2024.07]   🎉🎉  One paper is accepted by MM 2024 Workshop (2nd Place).
  • [2024.01]   🎉🎉  One paper on multi-modal fact-checking is accepted by WWW 2024.
Publications

    Learning to Diversify and Focus: A Reinforcement Framework for Open-Vocabulary HOI Detection
    Yongchao Xu, Jiawei Liu, Junfeng Wang, Sen Tao, Na Jiang, Zheng-Jun Zha
    Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR 2026), 2026
    PDF     Code

    We propose a semantic-diversified and interaction-focused framework for open-vocabulary HOI detection, which enhances generalization to unseen interactions while capturing more discriminative and spatially interpretable interaction cues.

    PSP: Prompt-Guided Self-Training Sampling Policy for Active Prompt Learning
    Sen Tao, Kaiduo Feng, Jiawei Liu, Peng Zeng, Yongchao Xu, Yufei Zheng, Zheng-Jun Zha
    International Conference on Learning Representations (ICLR 2026), 2026
    PDF     Code

    We propose a prompt-guided self-training sampling policy for active prompt learning, which improves sample selection by integrating soft actor-critic optimization with real-pseudo hybrid rewards and vectorized critics.

    Mamba-Driven Comprehensive Context Learning for Zero-Shot HOI Detection
    Jiawei Liu, Yongchao Xu (co-first author), Sen Tao, Yuexuan Qi, Zheng-Jun Zha
    International Journal of Computer Vision (IJCV), 2026
    PDF     Code

    We propose a comprehensive context learning framework for zero-shot HOI detection, which improves recognition by collaboratively modeling semantic and spatial interactions.

    Boosting Active Prompt Learning via Discriminative Self-Training Dual-Curriculum Learning
    Sen Tao, Jiawei Liu, Peng Zeng, Yongchao Xu, Bingyu Hu, Zheng-Jun Zha
    International Journal of Computer Vision (IJCV), 2026
    PDF     Code

    We propose a dual-curriculum active prompt learning framework that improves adaptation by progressively selecting informative samples and leveraging reliable pseudo-labeled data.

    HOIMamba: Efficient Mamba-based Disentangled Progressive Learning for HOI Detection
    Yongchao Xu, Jiawei Liu, Sen Tao, Qiang Zhang, Zheng-Jun Zha
    The Thirty-Ninth AAAI Conference on Artificial Intelligence (AAAI 2025), 2025
    PDF     Code

    We introduce a novel HOI detection framework with a carefully designed Mamba-based decoder that combines the benefits of existing methods and improves recognition of difficult HOI samples.

    Hierarchical Knowledge Prompt Tuning for Multi-task Test-Time Adaptation
    Qiang Zhang, Mengsheng Zhao, Jiawei Liu, Fanrui Zhang, Yongchao Xu, Zheng-Jun Zha
    Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR 2025), 2025
    PDF     Code

    We propose the first method to address multi-task test-time adaptation for pre-trained VLMs. It also migrates seamlessly to standard single-task scenarios.

    ESCNet: Entity-enhanced and Stance Checking Network for Multi-modal Fact-Checking
    Fanrui Zhang, Jiawei Liu, Jingyi Xie, Qiang Zhang, Yongchao Xu, Zheng-Jun Zha
    International World Wide Web Conference (WWW 2024), 2024
    PDF     Code

    We establish the first large-scale, multi-domain Chinese multi-modal fact-checking dataset, encompassing all types of misinformation in the multi-modal fact-checking task.

Awards

  • 2025.09: First-Class Academic Scholarship of USTC.
  • 2024.09: First-Class Academic Scholarship of USTC.
  • 2023.09: First-Class Academic Scholarship of USTC.
  • 2023.06: Outstanding Graduate of Liaoning Province (Top 1%).
  • 2023.05: Principal's Medal of NEU (Top 10 among all enrolled students).
  • 2022.09: First-Class Scholarship of NEU (Top 3%).
  • 2022.05: Outstanding Winner & AMS Award (Top 0.02%) of the International Collegiate Mathematical Modeling Competition.
  • 2020.09: National Scholarship of China (Top 0.2%); First-Class Scholarship of NEU (Top 3%).


  • The template of this page is from Jon Barron.