Shaoshuai Shi

Researcher @ AI Research, DiDi Autonomous Driving

Email: shaoshuaics [at] gmail [dot] com

About Me

I was a researcher at the Max Planck Institute for Informatics (MPI-INF), working with Prof. Bernt Schiele. I obtained my Ph.D. degree from the Multimedia Lab (MMLab) of The Chinese University of Hong Kong (CUHK), supervised by Prof. Xiaogang Wang and Prof. Hongsheng Li. Before that, I received my bachelor’s degree from the Computer Science Honor Class of Harbin Institute of Technology (HIT).

My research interests focus on computer vision and machine learning, particularly in exploring machine learning for autonomous systems. This includes 3D scene understanding, object detection, motion prediction, knowledge transfer, and other topics related to autonomous driving and robotics.

I am currently building an AI research team at DiDi Autonomous Driving, dedicated to applying cutting-edge AI technologies to realize L4-level self-driving cars.

[Hiring] We are looking for talented candidates for both full-time positions and research internships throughout the year. If you are interested in working with us, please feel free to send me an email.

News

[2024/06] Our approach MTR v3 won the championship of Waymo Open Dataset Motion Prediction Challenge 2024. Please refer to our technical report MTR v3 for more details.
[2024/01] One paper (MTR++) accepted by IEEE TPAMI.
[2023/07] Three papers accepted by ICCV 2023.
[2023/05] Our approach MTR++ won the championship of Waymo Open Dataset Motion Prediction Challenge 2023. Please refer to our paper MTR++ for more details.
[2023/02] Four papers accepted by CVPR 2023.
[2023/02] We released the MTR codebase, which can achieve SoTA results on the Waymo Motion Prediction dataset.
[2022/12] Achieved Excellence in Young Scientist Award of HKIS 2022.
[2022/10] One paper (PV-RCNN++) accepted by IJCV.
[2022/09] Three papers with one oral presentation accepted by NeurIPS 2022.
[2022/09] One paper (ST3D++) accepted by IEEE TPAMI.
[2022/09] The source code of MPPNet has been released to OpenPCDet, ranking 1st place on Waymo 3D object detection leaderboard.
[2022/07] One paper accepted by ECCV 2022.
[2022/06] Champion of Waymo Open Dataset Motion Prediction Challenge 2022.
[2022/03] One paper accepted by CVPR 2022.
[2021/07] Two papers accepted by ICCV 2021.
[2021/07] I was awarded the World Artificial Intelligence Conference Rising Star Award; see the news from WAIC.
[2021/03] Two papers accepted by CVPR 2021.
[2020/10] I was awarded the Google PhD Fellowship 2020; see the news from Google AI Blog.

Awards

Champion of Waymo Open Dataset Motion Prediction Challenge, 2024.
Champion of Waymo Open Dataset Motion Prediction Challenge, 2023.
Achieved Excellence in Young Scientist Award of HKIS 2022 (top two selected in Engineering Science in Hong Kong)
Champion of Waymo Open Dataset Motion Prediction Challenge, 2022.
World Artificial Intelligence Conference Rising Star Award (17 selected world-wide), 2021.
Google PhD Fellowship (10 selected world-wide in machine perception), 2020.
Hong Kong PhD Fellowship (the highest scholarship for PhD students in Hong Kong), 2017-2021.
National Scholarship, 2014, 2015, 2016

General Codebase

OpenPCDet: An Open-source Toolbox for 3D Object Detection from Point Clouds
OpenPCDet Development Team
June 2020
[Code] [Bibtex]


@misc{openpcdet2020,
    title={OpenPCDet: An Open-source Toolbox for 3D Object Detection from Point Clouds},
    author={OpenPCDet Development Team},
    howpublished = {\url{https://github.com/open-mmlab/OpenPCDet}},
    year={2020}
}

Publications

*: Equal Contribution #: Corresponding Author

Publication List:

GiT: Towards Generalist Vision Transformer through Universal Language Interface
Haiyang Wang*, Hao Tang*, Li Jiang, Shaoshuai Shi, Muhammad Ferjad Naeem, Hongsheng Li, Bernt Schiele, Liwei Wang
European Conference on Computer Vision (ECCV), 2024.
Open-Vocabulary 3D Semantic Segmentation with Foundation Models
Li Jiang, Shaoshuai Shi, Bernt Schiele
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2024.
MTR++: Multi-Agent Motion Prediction with Symmetric Scene Modeling and Guided Intention Querying
Shaoshuai Shi*, Li Jiang*#, Dengxin Dai, Bernt Schiele
IEEE Transactions on Pattern Analysis and Machine Intelligence (T-PAMI), accepted, 2024. (IF: 24.314)
Won the Championship of Waymo Open Dataset Motion Prediction Challenge 2023 (May, 2023).
UniTR: A Unified and Efficient Multi-Modal Transformer for Bird’s-Eye-View Representation
Haiyang Wang*, Hao Tang*, Shaoshuai Shi#, Aoxue Li, Zhenguo Li, Bernt Schiele, Liwei Wang#
International Conference on Computer Vision (ICCV), 2023. [Code]
TrajectoryFormer: 3D Object Tracking Transformer with Predictive Trajectory Hypotheses
Xuesong Chen, Shaoshuai Shi#, Chao Zhang, Benjin Zhu, Qiang Wang, Ka Chun Cheung, Simon See, Hongsheng Li#
International Conference on Computer Vision (ICCV), 2023. [Code]
CoIn: Contrastive Instance Feature Mining for Outdoor 3D Object Detection with Very Limited Annotations
Qiming Xia, Jinhao Deng, Chenglu Wen, Shaoshuai Shi, Xin Li, Cheng Wang
International Conference on Computer Vision (ICCV), 2023.
Self-supervised Pre-training with Masked Shape Prediction for 3D Scene Understanding
Li Jiang, Zetong Yang, Shaoshuai Shi, Vladislav Golyanik, Dengxin Dai, Bernt Schiele
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2023.
Virtual Sparse Convolution for Multimodal 3D Object Detection
Hai Wu, Chenglu Wen, Shaoshuai Shi, Xin Li, Cheng Wang
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2023.
DSVT: Dynamic Sparse Voxel Transformer with Rotated Sets
Haiyang Wang*, Chen Shi*, Shaoshuai Shi#, Meng Lei, Sen Wang, Di He, Bernt Schiele, Liwei Wang#
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2023. [Code]
ConQueR: Query Contrast Voxel-DETR for 3D Object Detection
Benjin Zhu, Zhe Wang, Shaoshuai Shi, Hang Xu, Lanqing Hong, Hongsheng Li
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2023. [Code]
3D Object Detection for Autonomous Driving: A Comprehensive Survey
Jiageng Mao, Shaoshuai Shi, Xiaogang Wang, Hongsheng Li
International Journal of Computer Vision (IJCV), accepted, 2023. (IF: 13.369)
Test Time Domain Adaptation for Monocular Depth Estimation
Zhi Li, Shaoshuai Shi, Bernt Schiele, Dengxin Dai
International Conference on Robotics and Automation (ICRA), 2023.

Motion Transformer with Global Intention Localization and Local Movement Refinement
Shaoshuai Shi, Li Jiang, Dengxin Dai, Bernt Schiele
Thirty-sixth Conference on Neural Information Processing Systems (NeurIPS), 2022. [Oral] [Code]
Ranked 1st place on both the Waymo motion prediction and interaction prediction leaderboards (May, 2022).
CAGroup3D: Class-Aware Grouping for 3D Object Detection on Point Clouds
Haiyang Wang*, Lihe Ding*, Shaocong Dong, Shaoshuai Shi#, Aoxue Li, Jianan Li, Zhenguo Li, Liwei Wang#
Thirty-sixth Conference on Neural Information Processing Systems (NeurIPS), 2022. [Code]
Towards Efficient 3D Object Detection with Knowledge Distillation
Jihan Yang, Shaoshuai Shi, Runyu Ding, Zhe Wang, Xiaojuan Qi
Thirty-sixth Conference on Neural Information Processing Systems (NeurIPS), 2022. [Code]
MTR-A: 1st Place Solution for 2022 Waymo Open Dataset Challenge – Motion Prediction
Shaoshuai Shi, Li Jiang, Dengxin Dai, Bernt Schiele
Technical report of 1st place solution to Waymo Motion Prediction Challenge at Workshop on Autonomous Driving of CVPR 2022 (CVPRW), 2022.
Won the Championship of Waymo Open Dataset Motion Prediction Challenge 2022 (June, 2022). [Code]
MPPNet: Multi-Frame Feature Intertwining with Proxy Points for 3D Temporal Object Detection
Xuesong Chen*, Shaoshuai Shi*#, Benjin Zhu, Ka Chun Cheung, Hang Xu, Hongsheng Li#
European Conference on Computer Vision (ECCV), 2022. [Code]
Ranked 1st place on Waymo 3D object detection leaderboard (June, 2022).
RBGNet: Ray-based Grouping for 3D Object Detection
Haiyang Wang, Shaoshuai Shi#, Ze Yang, Rongyao Fang, Qi Qian, Hongsheng Li, Bernt Schiele, Liwei Wang
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2022. [Code]
ST3D++: Self-training for Unsupervised Domain Adaptation on 3D Object Detection
Jihan Yang, Shaoshuai Shi, Zhe Wang, Hongsheng Li, Xiaojuan Qi
IEEE Transactions on Pattern Analysis and Machine Intelligence (T-PAMI), accepted, 2022. (IF: 24.314) [Code]

PV-RCNN++: Point-Voxel Feature Set Abstraction With Local Vector Representation for 3D Object Detection
Shaoshuai Shi, Li Jiang, Jiajun Deng, Zhe Wang, Chaoxu Guo, Jianping Shi, Xiaogang Wang, Hongsheng Li
International Journal of Computer Vision (IJCV), accepted, 2022. (IF: 13.369) [Code]
LIGA-Stereo: Learning LiDAR Geometry Aware Representations for Stereo-based 3D Detector
Xiaoyang Guo, Shaoshuai Shi, Xiaogang Wang, Hongsheng Li
International Conference on Computer Vision (ICCV), 2021. [Code]
Guided Point Contrastive Learning for Semi-Supervised Point Cloud Semantic Segmentation
Li Jiang, Shaoshuai Shi, Zhuotao Tian, Xin Lai, Shu Liu, Chi-Wing Fu, Jiaya Jia
International Conference on Computer Vision (ICCV), 2021. [Code]
ST3D: Self-training for Unsupervised Domain Adaptation on 3D Object Detection
Jihan Yang*, Shaoshuai Shi*, Zhe Wang, Hongsheng Li, Xiaojuan Qi
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2021. [Code]
Back-tracing Representative Points for Voting-based 3D Object Detection in Point Clouds
Bowen Cheng, Lu Sheng, Shaoshuai Shi, Ming Yang, Dong Xu
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2021. [Code]
Voxel R-CNN: Towards High Performance Voxel-based 3D Object Detection
Jiajun Deng, Shaoshuai Shi, Peiwei Li, Wengang Zhou, Yanyong Zhang, Houqiang Li
AAAI Conference on Artificial Intelligence (AAAI), 2021. [Code]
The Top-Performing LiDAR-only Solutions for 3D Detection / 3D Tracking / Domain Adaptation of Waymo Open Dataset Challenges
Shaoshuai Shi, Chaoxu Guo, Jihan Yang, Hongsheng Li
Technical report of top-performing LiDAR-only solutions to Waymo Open Dataset Challenges at Workshop on Autonomous Driving of CVPR 2020 (CVPRW), 2020. [Code]
Won 1st place among all LiDAR-only methods / 2nd place among all methods on the three tracks (3D detection, 3D tracking, and domain adaptation) of the Waymo Open Dataset Challenges 2020 (June, 2020).
PV-RCNN: Point-Voxel Feature Set Abstraction for 3D Object Detection
Shaoshuai Shi, Chaoxu Guo, Li Jiang, Zhe Wang, Jianping Shi, Xiaogang Wang, Hongsheng Li
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2020. [Code]
Ranked 1st place on KITTI 3D object detection benchmark (Car, Nov 2019 - Aug 2020).
PointGroup: Dual-Set Point Grouping for 3D Instance Segmentation
Li Jiang, Hengshuang Zhao, Shaoshuai Shi, Shu Liu, Chi-Wing Fu, Jiaya Jia
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2020. [Oral] [Code]
Ranked 1st place on ScanNet 3D Semantic instance benchmark (Nov-16 2019).
From Points to Parts: 3D Object Detection from Point Cloud with Part-aware and Part-aggregation Network
Shaoshuai Shi, Zhe Wang, Jianping Shi, Xiaogang Wang, Hongsheng Li
IEEE Transactions on Pattern Analysis and Machine Intelligence (T-PAMI), accepted, 2020. (IF: 24.314) [Code]
ESI Hot and Highly Cited Paper (top 0.1%).
Ranked 1st place on KITTI 3D object detection benchmark (Car, July-9 2019).
SegVoxelNet: Exploring Semantic Context and Depth-aware Features for 3D Vehicle Detection from Point Cloud
Hongwei Yi, Shaoshuai Shi, Mingyu Ding, Jiankai Sun, Kui Xu, Hui Zhou, Zhe Wang, Sheng Li, Guoping Wang
International Conference on Robotics and Automation (ICRA), 2020.
PointRCNN: 3D Object Proposal Generation and Detection from Point Cloud
Shaoshuai Shi, Xiaogang Wang, Hongsheng Li
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019. [Code-v1] [Code-v2]
One of the top-10 most-cited papers among all CVPR 2019 papers (March, 2021).
Feature Intertwiner for Object Detection
Hongyang Li, Bo Dai, Shaoshuai Shi, Wanli Ouyang, Xiaogang Wang
International Conference on Learning Representations (ICLR), 2019.
GAL: Geometric Adversarial Loss for Single-View 3D-Object Reconstruction
Li Jiang, Shaoshuai Shi, Xiaojuan Qi, Jiaya Jia
European Conference on Computer Vision (ECCV), 2018. [Oral]
FP-DNN: An Automated Framework for Mapping Deep Neural Networks onto FPGAs with RTL-HLS Hybrid Templates
Yijin Guan, Hao Liang, Ningyi Xu, Wenqiang Wang, Shaoshuai Shi, Xi Chen, Guangyu Sun, Wei Zhang, Jason Cong
IEEE International Symposium on Field-Programmable Custom Computing Machines (FCCM), 2017.

Experience

Postdoc Researcher, Oct 2021 - Nov 2023, Max Planck Institute for Informatics, Germany.
Advised by Prof. Bernt Schiele.
Research Intern, Mar 2019 - Feb 2021, Autonomous Driving Group of SenseTime, Shenzhen, China.
Working with Dr. Zhe Wang and Dr. Jianping Shi.
Research Intern, July 2016 - July 2017, System Group of Microsoft Research Asia (MSRA), Beijing, China.
Advised by Prof. Ningyi Xu.

Professional Activities

Conference Reviewer:

      IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
      IEEE International Conference on Computer Vision (ICCV)
      European Conference on Computer Vision (ECCV)
      Conference on Neural Information Processing Systems (NeurIPS)
      AAAI Conference on Artificial Intelligence (AAAI)
      IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
      IEEE International Conference on Robotics and Automation (ICRA)

Journal Reviewer:

      IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)
      International Journal of Computer Vision (IJCV)
      IEEE Transactions on Image Processing (TIP)
      IEEE Transactions on Circuits and Systems for Video Technology (TCSVT)
      IEEE Transactions on Robotics (T-RO)
      IEEE Transactions on Multimedia (TMM)
      IEEE Transactions on Intelligent Transportation Systems (TITS)
      IEEE Robotics and Automation Letters (RA-L)
      IEEE Sensors Journal
      Neurocomputing

Teaching Experience

Teaching Assistant of the following courses:

ENGG 2740A (CUHK), Differential Equations for Engineers, Spring 2021
ENGG 2450A (CUHK), Probability and Statistics for Engineers, Fall 2020
ENGG 5202 (CUHK), Pattern Recognition, Spring 2020
ENGG 2420B (CUHK), Complex Analysis and Differential Equations for Engineers, Fall 2019
ELEG 2601 (CUHK), Technology, Society and Engineering Practice, Spring 2019
ELEG 2201 (CUHK), Digital Circuits and Computing Systems, Fall 2017 and Fall 2018
ENGG 1100I (CUHK), Introduction to Engineering Design, Spring 2018
HIT, Principles of Computer Organization, Spring 2016
HIT, The High-level Language Programming II, Spring 2015 and Spring 2016