Ziyang Gong

Ziyang Gong (龚子洋)

Ph.D. Student, School of Computer Science, Shanghai Jiao Tong University

Welcome. I am Ziyang Gong, a first-year Ph.D. student at the School of Computer Science, Shanghai Jiao Tong University, co-supervised by Asst. Prof. Xue Yang and Prof. Junchi Yan (IAPR Fellow, ACM-MM26 PC, and ICML Board). I am currently working at ACE Robotics, where I focus on Embodied Spatial Intelligence, with the goal of building a next-generation cross-embodiment foundation brain.

Before starting my Ph.D., I worked at OpenGVLab, Shanghai AI Laboratory, where I was mentored by Asst. Prof. Xue Yang and Dr. Gen Luo and contributed to research at the intersection of embodied vision and foundation models.

I believe the future of AI lies not only in building larger models but also in building systems that truly understand and operate in space. If you share a similar curiosity about Embodied Spatial Intelligence, feel free to connect with me to explore collaborations.

SJTU: Ph.D.
ACE Robotics: Internship
SenseTime: Internship
Tsinghua University: Internship
Shanghai AI Lab: Internship
Sun Yat-sen University: M.E.

News & Updates

Publications

ACE-Brain-0: Spatial Intelligence as a Shared Scaffold for Universal Embodiments
Ziyang Gong*, Zehang Luo*, Anke Tang*, Zhe Liu*, Shi Fu, Zhi Hou, Ganlin Yang, Weiyun Wang, Xiaofeng Wang, Jianbo Liu, Gen Luo, Haolan Kang, Shuang Luo, Yue Zhou, Yong Luo, Li Shen, Xiaosong Jia, Yao Mu, Xue Yang, Chunxiao Liu, Junchi Yan, Hengshuang Zhao, Dacheng Tao, Xiaogang Wang
Technical Report
Embodied Intelligence, Spatial Intelligence
Visual embodied brain: Let multimodal large language models see, think, and control in spaces
Gen Luo*, Ganlin Yang*, Ziyang Gong*, Guanzhou Chen, Haonan Duan, Erfei Cui, Ronglei Tong, Zhi Hou, Tianyi Zhang, Zhe Chen, Shenglong Ye, Lewei Lu, Jingbo Wang, Wenhai Wang, Jifeng Dai, Yu Qiao, Rongrong Ji, Xizhou Zhu
Technical Report
Embodied Intelligence, Spatial Intelligence
Space-10: A comprehensive benchmark for multimodal large language models in compositional spatial intelligence
Ziyang Gong*, Wenhao Li*, Oliver Ma*, Songyuan Li, Zhaokai Wang, Jiayi Ji, Xue Yang, Gen Luo, Junchi Yan, Rongrong Ji
ICLR 2026
Spatial Intelligence, Benchmark
Interleave-VLA: Enhancing robot manipulation with interleaved image-text instructions
Cunxin Fan, Xiaosong Jia, Yihang Sun, Yixiao Wang, Jianglan Wei, Ziyang Gong, Xiangyu Zhao, Masayoshi Tomizuka, Xue Yang, Junchi Yan, Mingyu Ding
ICLR 2026
Embodied Intelligence, Vision-Language Action Model
IGen: Scalable Data Generation for Robot Learning from Open-World Images
Chenghao Gu, Haolan Kang, Junchao Lin, Jinghe Wang, Duo Wu, Shuzhao Xie, Fanding Huang, Junchen Ge, Ziyang Gong, Letian Li, Hongying Zheng, Changwei Lv, Zhi Wang
CVPR 2026
Embodied Intelligence, Embodied Vision, Data Generation
Robotic visual instruction
Yanbang Li, Ziyang Gong, Haoyang Li, Xiaoqi Huang, Haolan Kang, Guangping Bai, Xianzheng Ma
CVPR 2025
Embodied Intelligence, Embodied Vision
CrossEarth: Geospatial vision foundation model for domain generalizable remote sensing semantic segmentation
Ziyang Gong*, Zhixiang Wei*, Di Wang*, Xiaoxing Hu*, Xianzheng Ma, Hongruixuan Chen, Yuru Jia, Yupeng Deng, Zhenming Ji, Xiangwei Zhu, Xue Yang, Naoto Yokoya, Jing Zhang, Bo Du, Junchi Yan, Liangpei Zhang
TPAMI 2025
Low-Altitude Sensing, Remote Sensing, Foundation Model
CrossEarth-Gate: Geospatial vision foundation model for domain generalizable remote sensing semantic segmentation
Shilei Cao*, Ziyang Gong*, Hehai Lin, Yang Liu, Jiashun Cheng, Xiaoxing Hu, Haoyuan Liang, Guowen Li, Chengwei Qin, Hong Cheng, Xue Yang, Juepeng Zheng, Haohuan Fu
CVPR 2026
Low-Altitude Sensing, Remote Sensing, PEFT
Can Generative Geospatial Diffusion Models Excel as Discriminative Geospatial Foundation Models?
Yuru Jia, Valerio Marsocci, Ziyang Gong, Xue Yang, Maarten Vergauwen, Andrea Nascetti
ICCV 2025
Low-Altitude Sensing, Remote Sensing, Foundation Model
Coda: Instructive chain-of-domain adaptation with severity-aware visual prompt tuning
Ziyang Gong, Fuhao Li, Yupeng Deng, Deblina Bhattacharjee, Xianzheng Ma, Xiangwei Zhu, Zhenming Ji
ECCV 2024
Autonomous Driving, Domain Adaptation, Computer Vision
