Yongchao Xu | 徐永超

I am currently a Ph.D. student (2025-2028) in Electronic and Information Engineering at the University of Science and Technology of China (USTC), supervised by Prof. Zheng-Jun Zha and Associate Prof. Jiawei Liu. Before that, I received my Master's degree from USTC (2023-2025) and my Bachelor's degree in Automation from Northeastern University (NEU) (2019-2023).

My research interests include Human-Object Interaction (HOI) detection, Multimodal Large Language Models, and Embodied Perception.

Email: yongchaoxu@mail.ustc.edu.cn

Google Scholar  /  GitHub  /  Email

News

  • [2026.02]   🎉🎉  One paper on open-vocabulary HOI detection is accepted by CVPR 2026.
  • [2026.01]   🎉🎉  One paper on active prompt learning is accepted by ICLR 2026.
  • [2025.11]   🎉🎉  One paper on zero-shot HOI detection is accepted by IJCV 2026.
  • [2025.10]   🎉🎉  One paper on active prompt learning is accepted by IJCV 2026.
  • [2025.07]   🎉🎉  One paper is accepted by MM 2025 Workshop (3rd Place).
  • [2025.02]   🎉🎉  One paper on multi-task test-time adaptation is accepted by CVPR 2025.
  • [2024.12]   🎉🎉  One paper on HOI detection is accepted by AAAI 2025.
  • [2024.07]   🎉🎉  One paper is accepted by MM 2024 Workshop (2nd Place).
  • [2024.01]   🎉🎉  One paper on multi-modal fact-checking is accepted by WWW 2024.
Publications

    Learning to Diversify and Focus: A Reinforcement Framework for Open-Vocabulary HOI Detection
    Yongchao Xu, Jiawei Liu, Junfeng Wang, Sen Tao, Na Jiang, Zheng-Jun Zha
    Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR 2026), 2026
    PDF     Code

    We propose a semantic-diversified and interaction-focused framework for open-vocabulary HOI detection, which enhances generalization to unseen interactions while capturing more discriminative and spatially interpretable interaction cues.

    PSP: Prompt-Guided Self-Training Sampling Policy for Active Prompt Learning
    Sen Tao, Kaiduo Feng, Jiawei Liu, Peng Zeng, Yongchao Xu, Yufei Zheng, Zheng-Jun Zha
    International Conference on Learning Representations (ICLR 2026), 2026
    PDF     Code

    We propose a prompt-guided self-training sampling policy for active prompt learning, which improves sample selection by integrating soft actor-critic optimization with real-pseudo hybrid rewards and vectorized critics.

    Mamba-Driven Comprehensive Context Learning for Zero-Shot HOI Detection
    Jiawei Liu, Yongchao Xu (co-first author), Sen Tao, Yuexuan Qi, Zheng-Jun Zha
    International Journal of Computer Vision (IJCV), 2026
    PDF     Code

    We propose a comprehensive context learning framework for zero-shot HOI detection, which improves recognition by collaboratively modeling semantic and spatial interactions.

    Boosting Active Prompt Learning via Discriminative Self-Training Dual-Curriculum Learning
    Sen Tao, Jiawei Liu, Peng Zeng, Yongchao Xu, Bingyu Hu, Zheng-Jun Zha
    International Journal of Computer Vision (IJCV), 2026
    PDF     Code

    We propose a dual-curriculum active prompt learning framework that improves adaptation by progressively selecting informative samples and leveraging reliable pseudo-labeled data.

    HOIMamba: Efficient Mamba-based Disentangled Progressive Learning for HOI Detection
    Yongchao Xu, Jiawei Liu, Sen Tao, Qiang Zhang, Zheng-Jun Zha
    The Thirty-Ninth AAAI Conference on Artificial Intelligence (AAAI 2025), 2025
    PDF     Code

    We introduce a novel HOI detection framework with a carefully designed Mamba-based decoder that combines the benefits of existing methods and improves recognition of difficult HOI samples.

    Hierarchical Knowledge Prompt Tuning for Multi-task Test-Time Adaptation
    Qiang Zhang, Mengsheng Zhao, Jiawei Liu, Fanrui Zhang, Yongchao Xu, Zheng-Jun Zha
    Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR 2025), 2025
    PDF     Code

    We propose the first method to address multi-task test-time adaptation for pre-trained VLMs. It also migrates seamlessly to standard single-task scenarios.

    ESCNet: Entity-enhanced and Stance Checking Network for Multi-modal Fact-Checking
    Fanrui Zhang, Jiawei Liu, Jingyi Xie, Qiang Zhang, Yongchao Xu, Zheng-Jun Zha
    International World Wide Web Conference (WWW 2024), 2024
    PDF     Code

    We establish the first large-scale, multi-domain Chinese multi-modal fact-checking dataset, encompassing all types of misinformation in the multi-modal fact-checking task.

Awards

  • 2025.09: First-Class Academic Scholarship of USTC.
  • 2024.09: First-Class Academic Scholarship of USTC.
  • 2023.09: First-Class Academic Scholarship of USTC.
  • 2023.06: Outstanding Graduate of Liaoning Province (Top 1%).
  • 2023.05: Principal's Medal of NEU (Top 10 among all enrolled students).
  • 2022.09: First-Class Scholarship of NEU (Top 3%).
  • 2022.05: Outstanding Winner & AMS Award (Top 0.02%) of the International Collegiate Mathematical Modeling Competition.
  • 2020.09: National Scholarship of China (Top 0.2%); First-Class Scholarship of NEU (Top 3%).


  • The template of this page is from Jon Barron.