Hao Wang (王豪)

I'm currently a Ph.D. student in the HCP Lab at SYSU, supervised by Prof. Xiaodan Liang and Prof. Xiangyuan Lan. Before that, I received my Master's degree from CASIA, supervised by Prof. Jing Liu, and my Bachelor's degree from BJTU.

My research interests include:

  • Open-ended computer vision
  • Multimodal large language models
Open-source promotes the development of technology.

I'm currently looking for collaborations; feel free to contact me via e-mail or WeChat.

Home  /  E-mail  /  WeChat  /  Scholar  /  Github

✨ News

  • [2025.11] Happy to announce that our paper X-SAM has been accepted to AAAI 2026.

📑 Publications

X-SAM: From Segment Anything to Any Segmentation
Hao Wang, Limeng Qiao, Zequn Jie, Zhijian Huang, Chengjian Feng, Qingfang Zheng, Lin Ma, Xiangyuan Lan, Xiaodan Liang
AAAI, 2026
Project / Paper / GitHub stars

A unified multimodal large language model (MLLM) framework that extends the segmentation paradigm from "segment anything" to "any segmentation", enhancing pixel-level perceptual understanding.

Unified Open-Vocabulary Detection with Language-Aware Selective Fusion
Hao Wang, Pengzhen Ren, Zequn Jie, Xiao Dong, Chengjian Feng, Yinlong Qian, Lin Ma, Dongmei Jiang, Yaowei Wang, Xiangyuan Lan, Xiaodan Liang
arXiv Preprint, 2024
Project / Paper / GitHub stars

A unified open-vocabulary detection method, pre-trained on diverse large-scale datasets with language-aware selective fusion in a single framework.

TMANet: Temporal Memory Attention for Video Semantic Segmentation
Hao Wang, Weining Wang, Jing Liu
ICIP, 2021
Project / Paper / GitHub stars

A method that combines self-attention with a temporal memory to capture long-range temporal relations between frames, avoiding the computational cost of optical flow prediction.

WL-MSR: Watch and Listen for Multimodal Subtitle Recognition
Jiawei Liu, Hao Wang, Weining Wang, Xingjian He, Jing Liu
ICASSP, 2023
Paper

A framework that fuses OCR and ASR information using a Transformer model with mask/crop strategies and multi-level identity embeddings to generate comprehensive video subtitles.

💼 Experience

  • 2021.05 - 2021.08
    Application Research Intern at Tencent AI Platform Group, working with Xinpeng Zhou and Mao Zheng.

  • 2019.09 - 2020.07
    Application Project Intern at Huawei Photo Processing Group.

🎓 Education

  • 2022.09 - Present
    Ph.D. student in the School of Intelligent Systems Engineering, SYSU, and Pengcheng Lab, co-supervised by Prof. Xiaodan Liang and Prof. Xiangyuan Lan.

  • 2019.09 - 2022.06
    Master's student in the School of Artificial Intelligence, UCAS, and the Institute of Automation of CAS, supervised by Prof. Jing Liu.

  • 2015.09 - 2019.06
    Bachelor's student in the School of Electronic and Information Engineering, BJTU.

🌈 Services

  • Reviewer for conferences: AAAI 2026, ICCV 2023, ECCV 2024.

🏆 Awards

Template is stolen from Jon Barron's website and modified by me. Feel free to steal this website's source code.
