Shaoshuai Shi
Researcher @ AI Research, DiDi Autonomous Driving
I was a researcher at the Max Planck Institute for Informatics (MPI-INF), working with Prof. Bernt Schiele. I obtained my Ph.D. degree from the Multimedia Lab (MMLab) of The Chinese University of Hong Kong (CUHK), supervised by Prof. Xiaogang Wang and Prof. Hongsheng Li. Before that, I received my bachelor’s degree from the Computer Science Honor Class of Harbin Institute of Technology (HIT).
My research interests focus on computer vision and machine learning, particularly in exploring machine learning for autonomous systems. This includes 3D scene understanding, object detection, motion prediction, knowledge transfer, and other topics related to autonomous driving and robotics.
I am currently building an AI research team at DiDi Autonomous Driving, dedicated to applying cutting-edge AI technologies to achieve L4-level self-driving cars.
[Hiring] We are seeking talented candidates for both full-time positions and research internships throughout the year. If you are interested in working with us, please feel free to send me an email.
Champion of Waymo Open Dataset Motion Prediction challenge, 2024.
Champion of Waymo Open Dataset Motion Prediction challenge, 2023.
Achieved Excellence in Young Scientist Award of HKIS 2022 (top two selected in Engineering Science in Hong Kong)
Champion of Waymo Open Dataset Motion Prediction challenge, 2022.
World Artificial Intelligence Conference Rising Star Award (17 selected world-wide), 2021.
Google PhD Fellowship (10 selected world-wide in machine perception), 2020.
Hong Kong PhD Fellowship (The highest scholarship for PhD students in Hong Kong), 2017-2021
National Scholarship, 2014, 2015, 2016
OpenPCDet: An Open-source Toolbox for 3D Object Detection from Point Cloud
OpenPCDet Development Team, June 2020 [Code] [Bibtex]
*: Equal Contribution #: Corresponding Author
Publication List:
GiT: Towards Generalist Vision Transformer through Universal Language Interface
Haiyang Wang*, Hao Tang*, Li Jiang, Shaoshuai Shi, Muhammad Ferjad Naeem, Hongsheng Li, Bernt Schiele, Liwei Wang
European Conference on Computer Vision (ECCV), 2024.
Open-Vocabulary 3D Semantic Segmentation with Foundation Models
Li Jiang, Shaoshuai Shi, Bernt Schiele
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2024.
MTR++: Multi-Agent Motion Prediction with Symmetric Scene Modeling and Guided Intention Querying
Shaoshuai Shi*, Li Jiang*#, Dengxin Dai, Bernt Schiele
IEEE Transactions on Pattern Analysis and Machine Intelligence (T-PAMI), accepted, 2024. (IF: 24.314)
Won the Championship of Waymo Open Dataset Motion Prediction Challenge 2022 (May, 2023).
UniTR: A Unified and Efficient Multi-Modal Transformer for Bird’s-Eye-View Representation
Haiyang Wang*, Hao Tang*, Shaoshuai Shi#, Aoxue Li, Zhenguo Li, Bernt Schiele, Liwei Wang#
International Conference on Computer Vision (ICCV), 2023.
[Code]
TrajectoryFormer: 3D Object Tracking Transformer with Predictive Trajectory Hypotheses
Xuesong Chen, Shaoshuai Shi#, Chao Zhang, Benjin Zhu, Qiang Wang, Ka Chun Cheung, Simon See, Hongsheng Li#
International Conference on Computer Vision (ICCV), 2023.
[Code]
CoIn: Contrastive Instance Feature Mining for Outdoor 3D Object Detection with Very Limited Annotations
Qiming Xia, Jinhao Deng, Chenglu Wen, Shaoshuai Shi, Xin Li, Cheng Wang
International Conference on Computer Vision (ICCV), 2023.
Self-supervised Pre-training with Masked Shape Prediction for 3D Scene Understanding
Li Jiang, Zetong Yang, Shaoshuai Shi, Vladislav Golyanik, Dengxin Dai, Bernt Schiele
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2023.
Virtual Sparse Convolution for Multimodal 3D Object Detection
Hai Wu, Chenglu Wen, Shaoshuai Shi, Xin Li, Cheng Wang
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2023.
DSVT: Dynamic Sparse Voxel Transformer with Rotated Sets
Haiyang Wang*, Chen Shi*, Shaoshuai Shi#, Meng Lei, Sen Wang, Di He, Bernt Schiele, Liwei Wang#
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2023.
[Code]
ConQueR: Query Contrast Voxel-DETR for 3D Object Detection
Benjin Zhu, Zhe Wang, Shaoshuai Shi, Hang Xu, Lanqing Hong, Hongsheng Li
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2023.
[Code]
3D Object Detection for Autonomous Driving: A Comprehensive Survey
Jiageng Mao, Shaoshuai Shi, Xiaogang Wang, Hongsheng Li
International Journal of Computer Vision (IJCV), accepted, 2023. (IF: 13.369)
Test Time Domain Adaptation for Monocular Depth Estimation
Zhi Li, Shaoshuai Shi, Bernt Schiele, Dengxin Dai
International Conference on Robotics and Automation (ICRA), 2023.
Motion Transformer with Global Intention Localization and Local Movement Refinement
Shaoshuai Shi, Li Jiang, Dengxin Dai, Bernt Schiele
Thirty-sixth Conference on Neural Information Processing Systems (NeurIPS), 2022. [Oral]
[Code]
Ranked 1st on both the Waymo motion prediction and interaction prediction leaderboards (May, 2022).
CAGroup3D: Class-Aware Grouping for 3D Object Detection on Point Clouds
Haiyang Wang*, Lihe Ding*, Shaocong Dong, Shaoshuai Shi#, Aoxue Li, Jianan Li, Zhenguo Li, Liwei Wang#
Thirty-sixth Conference on Neural Information Processing Systems (NeurIPS), 2022.
[Code]
Towards Efficient 3D Object Detection with Knowledge Distillation
Jihan Yang, Shaoshuai Shi, Runyu Ding, Zhe Wang, Xiaojuan Qi
Thirty-sixth Conference on Neural Information Processing Systems (NeurIPS), 2022.
[Code]
MTR-A: 1st Place Solution for 2022 Waymo Open Dataset Challenge – Motion Prediction
Shaoshuai Shi, Li Jiang, Dengxin Dai, Bernt Schiele
Technical report of 1st place solution to Waymo Motion Prediction Challenge at Workshop on Autonomous Driving of CVPR 2022 (CVPRW), 2022.
Won the Championship of Waymo Open Dataset Motion Prediction Challenge 2022 (June, 2022).
[Code]
MPPNet: Multi-Frame Feature Intertwining with Proxy Points for 3D Temporal Object Detection
Xuesong Chen*, Shaoshuai Shi*#, Benjin Zhu, Ka Chun Cheung, Hang Xu, Hongsheng Li#
European Conference on Computer Vision (ECCV), 2022.
[Code]
Ranked 1st place on Waymo 3D object detection leaderboard (June, 2022).
RBGNet: Ray-based Grouping for 3D Object Detection
Haiyang Wang, Shaoshuai Shi#, Ze Yang, Rongyao Fang, Qi Qian, Hongsheng Li, Bernt Schiele, Liwei Wang
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2022.
[Code]
ST3D++: Self-training for Unsupervised Domain Adaptation on 3D Object Detection
Jihan Yang, Shaoshuai Shi, Zhe Wang, Hongsheng Li, Xiaojuan Qi
IEEE Transactions on Pattern Analysis and Machine Intelligence (T-PAMI), accepted, 2022. (IF: 24.314)
[Code]
PV-RCNN++: Point-Voxel Feature Set Abstraction With Local Vector Representation for 3D Object Detection
Shaoshuai Shi, Li Jiang, Jiajun Deng, Zhe Wang, Chaoxu Guo, Jianping Shi, Xiaogang Wang, Hongsheng Li
International Journal of Computer Vision (IJCV), accepted, 2022. (IF: 13.369)
[Code]
LIGA-Stereo: Learning LiDAR Geometry Aware Representations for Stereo-based 3D Detector
Xiaoyang Guo, Shaoshuai Shi, Xiaogang Wang, Hongsheng Li
International Conference on Computer Vision (ICCV), 2021.
[Code]
Guided Point Contrastive Learning for Semi-Supervised Point Cloud Semantic Segmentation
Li Jiang, Shaoshuai Shi, Zhuotao Tian, Xin Lai, Shu Liu, Chi-Wing Fu, Jiaya Jia
International Conference on Computer Vision (ICCV), 2021.
[Code]
ST3D: Self-training for Unsupervised Domain Adaptation on 3D Object Detection
Jihan Yang*, Shaoshuai Shi*, Zhe Wang, Hongsheng Li, Xiaojuan Qi
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2021.
[Code]
Back-tracing Representative Points for Voting-based 3D Object Detection in Point Clouds
Bowen Cheng, Lu Sheng, Shaoshuai Shi, Ming Yang, Dong Xu
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2021.
[Code]
Voxel R-CNN: Towards High Performance Voxel-based 3D Object Detection
Jiajun Deng, Shaoshuai Shi, Peiwei Li, Wengang Zhou, Yanyong Zhang, Houqiang Li
AAAI Conference on Artificial Intelligence (AAAI), 2021.
[Code]
The Top-Performing LiDAR-only Solutions for 3D Detection / 3D Tracking / Domain Adaptation of Waymo Open Dataset Challenges
Shaoshuai Shi, Chaoxu Guo, Jihan Yang, Hongsheng Li
Technical report of top-performing LiDAR-only solutions to Waymo Open Dataset Challenges at Workshop on Autonomous Driving of CVPR 2020 (CVPRW), 2020.
[Code]
Won 1st place among all LiDAR-only methods / 2nd place among all methods on all three tracks (3D detection, 3D tracking, and domain adaptation) of the Waymo Open Dataset Challenges 2020 (June, 2020).
PV-RCNN: Point-Voxel Feature Set Abstraction for 3D Object Detection
Shaoshuai Shi, Chaoxu Guo, Li Jiang, Zhe Wang, Jianping Shi, Xiaogang Wang, Hongsheng Li
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2020.
[Code]
Ranked 1st place on KITTI 3D object detection benchmark (Car, Nov 2019 - Aug 2020).
PointGroup: Dual-Set Point Grouping for 3D Instance Segmentation
Li Jiang, Hengshuang Zhao, Shaoshuai Shi, Shu Liu, Chi-Wing Fu, Jiaya Jia
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2020.
[Oral]
[Code]
Ranked 1st place on ScanNet 3D Semantic instance benchmark (Nov-16 2019).
From Points to Parts: 3D Object Detection from Point Cloud with Part-aware and Part-aggregation Network
Shaoshuai Shi, Zhe Wang, Jianping Shi, Xiaogang Wang, Hongsheng Li
IEEE Transactions on Pattern Analysis and Machine Intelligence (T-PAMI), accepted, 2020. (IF: 24.314)
[Code]
ESI Hot and Highly Cited Paper (top 0.1%).
Ranked 1st place on KITTI 3D object detection benchmark (Car, July-9 2019).
SegVoxelNet: Exploring Semantic Context and Depth-aware Features for 3D Vehicle Detection from Point Cloud
Hongwei Yi, Shaoshuai Shi, Mingyu Ding, Jiankai Sun, Kui Xu, Hui Zhou, Zhe Wang, Sheng Li, Guoping Wang
International Conference on Robotics and Automation (ICRA), 2020.
PointRCNN: 3D Object Proposal Generation and Detection from Point Cloud
Shaoshuai Shi, Xiaogang Wang, Hongsheng Li
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019.
[Code-v1] [Code-v2]
One of the top-10 most-cited papers among all CVPR 2019 papers (March, 2021).
Feature Intertwiner for Object Detection
Hongyang Li, Bo Dai, Shaoshuai Shi, Wanli Ouyang, Xiaogang Wang
International Conference on Learning Representations (ICLR), 2019.
GAL: Geometric Adversarial Loss for Single-View 3D-Object Reconstruction
Li Jiang, Shaoshuai Shi, Xiaojuan Qi, Jiaya Jia
European Conference on Computer Vision (ECCV), 2018. [Oral]
FP-DNN: An Automated Framework for Mapping Deep Neural Networks onto FPGAs with RTL-HLS Hybrid Templates
Yijin Guan, Hao Liang, Ningyi Xu, Wenqiang Wang, Shaoshuai Shi, Xi Chen, Guangyu Sun, Wei Zhang, Jason Cong
IEEE International Symposium on Field-Programmable Custom Computing Machines (FCCM), 2017.
Postdoc Researcher, Oct 2021 - Nov 2023, Max Planck Institute for Informatics, Germany.
Advised by Prof. Bernt Schiele.
Research Intern, Mar 2019 - Feb 2021, Autonomous Driving Group of SenseTime, Shenzhen, China.
Working with Dr. Zhe Wang and Dr. Jianping Shi.
Research Intern, July 2016 - July 2017, System Group of Microsoft Research Asia (MSRA), Beijing, China.
Advised by Prof. Ningyi Xu.
Conference Reviewer:
IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
IEEE International Conference on Computer Vision (ICCV)
European Conference on Computer Vision (ECCV)
Conference on Neural Information Processing Systems (NeurIPS)
AAAI Conference on Artificial Intelligence (AAAI)
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
IEEE International Conference on Robotics and Automation (ICRA)
Journal Reviewer:
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)
International Journal of Computer Vision (IJCV)
IEEE Transactions on Image Processing (TIP)
IEEE Transactions on Circuits and Systems for Video Technology (TCSVT)
IEEE Transactions on Robotics (T-RO)
IEEE Transactions on Multimedia (TMM)
IEEE Transactions on Intelligent Transportation Systems (TITS)
IEEE Robotics and Automation Letters (RA-L)
IEEE Sensors Journal
Neurocomputing
Teaching Assistant of the following courses: