HD-Map
Learning to Drive

1. HD-Map

LiDAR2Map: In Defense of LiDAR-Based Semantic Map Construction Using Online Camera Distillation 2023 generate HD map with lidar and BEV images.

High-Definition Map Generation Technologies For Autonomous Driving 2022

Data collection methods.
Point cloud map generation methods. Better see Lidar mapping algorithm papers.
Feature extraction methods for HD maps.
- Road Network Extraction:
  - 2D Aerial Images : segmentation-based, iterative graph growing, and graph-generation methods.
  - 3D Point Clouds (using segmentation).
  - Sensor Fusion Methods : use both pcls, (aerial/car) images, GPS trajectories.
- Road Markings Extraction : 2D (aerial/car) images or 3D point clouds (bottom-up method and top-down method).
- Pole-like Objects Extraction: usually based on segmentation and classification on MLS 3D point clouds
Framework for HD maps:
- Lanelet2 : physical layer (points and lines), relational layer, and topological layer.
- OpenDRIVE : reference line/road (various geometric primitives), lane, and features.
- Apollo Maps : uses points. Road, Intersection, Traffic signal, Logical relationship & Others.

Localization using HD map:

YOLOP: You Only Look Once for Panoptic Driving Perception 2021, YOLO for car images.
Coarse-to-fine Semantic Localization with HD Map for Autonomous Driving in Structural Scenes 2021, segmentation to gradient map, use direct method for pose refinement.

Computing Systems for Autonomous Driving: State-of-the-Art and Challenges 2020. focus on hardware side.

Towards End-to-End Lane Detection: an Instance Segmentation Approach 2018, github lane segmentation.

Computer Recognition of Roads from Satellite Pictures 1976

2. Learning to Drive

Take advantages of Transformers.

Traditional CV missions (classification, segmentation, etc) are not fit for auto-drive mission.
Compared to ChatGPT, these models are small. No large model in general Computer Vision yet.
- Or we might not be able to dig vision data from internet as NLP did - no easy ‘gt’ could be found.
- The driving task is still too simple, does not require high level understanding. (we need a better task to dig visual based AI, text-image related tasks might be good)

Make Large Dataset from online videos: how to make large dataset:

video online: no calibration, vision only, on real scale.
slam mapped dataset (require online video mapping algorithm).

Planning-oriented Autonomous Driving 2023. Large model for auto-drive, an end-to-end paradigm unites modules in perception and prediction. Combine different models together, and jointly optimize them. Made a good starting point for further work.

PPGeo: Policy Pre-training for Autonomous Driving via Self-supervised Geometric Modeling 2023.

In the first stage, the geometric modeling framework generates pose and depth predictions simultaneously, with two consecutive frames as input.
In the second stage, the visual encoder learns driving policy representation by predicting the future ego-motion and optimizing with the photometric error based on current visual observation only.
Decision Intelligence Platform for Autonomous Driving simulation.

ACO: Learning to Drive by Watching YouTube videos: Action-Conditioned Contrastive Policy Pretraining 2022. Use ‘pseudo label of action’ (made by a supervised - Inverse dynamics model) to make a model ‘learn the features that matter to the output action’, which could be further transformed to other tasks.

data set list, data set drive.
Train with : Instance Contrastive Pair (ICP) and Action Contrastive Pair (ACP).
Inverse dynamics : DL Dense Optical Flow RAFT.

TCP - Trajectory-guided Control Prediction for End-to-end Autonomous Driving: A Simple yet Strong Baseline 2022. two branches for trajectory planning and direct control, respectively.

Video PreTraining (VPT): Learning to Act by Watching Unlabeled Online Videos 2022, openai page. Learn to act by watching Minecraft game videos. Fun!. gets pseudo action labels from a trained Inverse Dynamics Model.

Momentum Contrast for Unsupervised Visual Representation Learning 2020, github page. Contrastive learning creates supervisory labels via considering each image (instance) in the dataset forms a unique category and applies the learning objective of instance discrimination.

Table of Contents

1. HD-Map

2. Learning to Drive