Hao Wang (王豪)

Computer Vision · Multimodal AI · Agentic AI

I'm currently a Ph.D. candidate at HCP Lab, SYSU, and Pengcheng Laboratory, supervised by Prof. Xiaodan Liang and Associate Prof. Xiangyuan Lan. Before that, I received my Master's degree from UCAS and CASIA, supervised by Prof. Jing Liu, and my Bachelor's degree from BJTU.

My research focuses on open-ended visual perception, multimodal foundation models, and agentic AI, with first-author work on unified image/video segmentation, open-vocabulary detection, and temporal pixel-level understanding.

Research interests

Open-ended visual perception · Multimodal foundation models · Image/video segmentation · Agentic AI

Leave the world a little different.
I will graduate in December 2026 and am actively seeking research positions in industry. I am also open to collaborations on innovative projects. If you have suitable opportunities or are interested in collaborating, please feel free to contact me via email or WeChat.

CV · CV (Chinese) · E-mail · WeChat · Scholar · GitHub

News

2026.06 Happy to announce that our paper X2SAM has been accepted by ECCV 2026!

2026.04 Excited to share that our latest project X2SAM is officially released!

2025.11 Happy to announce that our paper X-SAM has been accepted by AAAI 2026!

2025.08 Excited to share that our project X-SAM is officially released!

2024.07 Excited to share that our project OV-DINO is officially released!

Publications

First-author Publications
2026
	X2SAM: Any Segmentation in Images and Videos
	Hao Wang, Limeng Qiao, Chi Zhang, Guanglu Wan, Lin Ma, Xiangyuan Lan, Xiaodan Liang. ECCV, 2026.
	A novel unified segmentation MLLM that extends any-segmentation from images to videos, supporting conversational instructions and visual prompts through Mask Memory for temporally consistent pixel-level perception.
2025
	X-SAM: From Segment Anything to Any Segmentation
	Hao Wang, Limeng Qiao, Zequn Jie, Zhijian Huang, Chengjian Feng, Qingfang Zheng, Lin Ma, Xiangyuan Lan, Xiaodan Liang. AAAI, 2026.
	A unified multimodal large language model framework that extends segment-anything capabilities to any segmentation, enabling instruction-driven interaction and pixel-level perceptual understanding.
2024
	OV-DINO: Unified Open-Vocabulary Detection with Language-Aware Selective Fusion
	Hao Wang, Pengzhen Ren, Zequn Jie, Xiao Dong, Chengjian Feng, Yinlong Qian, Lin Ma, Dongmei Jiang, Yaowei Wang, Xiangyuan Lan, Xiaodan Liang. arXiv Preprint, 2024.
	A unified open-vocabulary detector pre-trained on diverse large-scale datasets with language-aware selective fusion, improving category generalization through visual-language alignment.
2021
	TMANet: Temporal Memory Attention for Video Semantic Segmentation
	Hao Wang, Weining Wang, Jing Liu. ICIP, 2021.
	A temporal memory attention method for long-range video semantic segmentation that captures cross-frame relations without optical-flow prediction.
Co-author Publications
2023
	WL-MSR: Watch and Listen for Multimodal Subtitle Recognition
	Jiawei Liu, Hao Wang, Weining Wang, Xingjian He, Jing Liu. ICASSP, 2023.
	A framework that fuses OCR and ASR information using a Transformer model with mask/crop strategies and multi-level identity embeddings to generate comprehensive video subtitles.

Experience

2025.01 - Present

Research intern at Meituan M17-MM, developing unified any-segmentation MLLMs for images and videos with Limeng Qiao, Lin Ma, and Guanglu Wan.

2022.07 - 2025.01

Research intern at the Meituan Vision Intelligence Department, working on open-vocabulary recognition and any segmentation with Zequn Jie and Lin Ma.

2021.05 - 2021.08

Applied research intern at the Tencent AI Platform Department.

2019.09 - 2020.07

Applied project intern at the Huawei Photo Processing Department.

Education

2022.09 - Present

Ph.D. student in the School of Intelligent Systems Engineering, SYSU, and Pengcheng Laboratory, co-supervised by Prof. Xiaodan Liang and Associate Prof. Xiangyuan Lan.

2019.09 - 2022.06

Master's student in the School of Artificial Intelligence, UCAS, and the Institute of Automation, CAS, supervised by Prof. Jing Liu.

2015.09 - 2019.06

Bachelor's student in the School of Electronic and Information Engineering, BJTU.

Services

Conference Reviewer: AAAI 2026, ICCV 2023, ECCV 2024

Journal Reviewer: Proceedings of the IEEE

Awards

2021.09 1st place in the 1st VSPW Challenge Workshop, ICCV 2021.
