Railway Multi-Modal LiDAR and 2D Annotation
How Shaip delivered one of the most annotation-diverse projects in the rail AI domain, combining 2D camera imagery and 3D LiDAR point clouds across 39+ object classes and 8 distinct label types to power autonomous train perception and railway safety systems.
Project Overview
As autonomous train perception moves toward production deployment, the client needed a large-scale, multi-modal annotation pipeline capable of labeling the full railway environment across both 2D camera imagery and 3D LiDAR point cloud data with consistent spatial accuracy.
Shaip built the end-to-end multi-modal annotation pipeline covering 8 distinct label types, 39+ object classes, and 25+ signal state classifications across the H/V and Ks signal systems — delivering cross-validated, model-ready datasets for railway autonomy and safety AI.
Key Stats
- Object Classes: 39+
- Annotation Types: 8
- Signal States: 25+
- Modalities: 2D + 3D
Challenges
- Handling 2D + 3D multi-modal annotation with cross-validated spatial consistency
- Annotating 39+ object classes spanning people, vehicles, infrastructure, animals, and safety objects
- Operating across 8 annotation types, including 2D/3D bounding boxes, cuboids, polylines, and polygons
- Classifying 25+ distinct signal states across the German H/V and Ks signal systems
- Applying crowd threshold logic and annotating tracks continuously beneath wagon objects
Solution
Multi-Modal Annotation Pipeline
Shaip configured a parallel 2D-camera and 3D-LiDAR annotation pipeline with cross-validation between modalities. Annotators worked simultaneously with both data sources to ensure spatial consistency, with occlusion states assessed purely from point cloud data rather than 2D images.
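One way to picture a cross-modal consistency check is to project a 3D cuboid into the camera image and compare it with the 2D box drawn for the same object. The sketch below is illustrative only and is not Shaip's tooling: it assumes a pinhole camera, a cuboid that is axis-aligned and already expressed in the camera frame (z pointing forward), and a hypothetical `min_iou` threshold of 0.5. All function names are made up for this example.

```python
from itertools import product


def project_cuboid(center, size, fx, fy, cx, cy):
    """Project an axis-aligned cuboid (camera frame, z forward) to a 2D box.

    Returns (x_min, y_min, x_max, y_max) in pixels under a pinhole model.
    """
    xs, ys = [], []
    for sx, sy, sz in product((-0.5, 0.5), repeat=3):  # the 8 corners
        x = center[0] + sx * size[0]
        y = center[1] + sy * size[1]
        z = center[2] + sz * size[2]
        if z <= 0:
            raise ValueError("cuboid corner is behind the camera")
        xs.append(fx * x / z + cx)
        ys.append(fy * y / z + cy)
    return (min(xs), min(ys), max(xs), max(ys))


def iou(a, b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    iw = min(a[2], b[2]) - max(a[0], b[0])
    ih = min(a[3], b[3]) - max(a[1], b[1])
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def is_consistent(box_2d, cuboid_center, cuboid_size, intrinsics, min_iou=0.5):
    """Flag whether a 2D box agrees with the projected 3D cuboid.

    `intrinsics` is (fx, fy, cx, cy); `min_iou` is an assumed threshold.
    """
    projected = project_cuboid(cuboid_center, cuboid_size, *intrinsics)
    return iou(box_2d, projected) >= min_iou
```

A pair that fails such a check would be routed back for review rather than corrected automatically, since either modality could hold the error.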
39+ Class Ontology
The annotation ontology covered people (with age group, mobility aid, functional role, body pose, carrying items, and distraction attributes), vehicles (bicycles, motorcycles, road vehicles by sub-category), 18 species of animals, and rail infrastructure including tracks, switches, catenary poles, signals, buffer stops, signal bridges, drag shoes, and reflective test objects.
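As a rough illustration of how such an ontology can be enforced during quality checks, the sketch below maps each class to the attributes it may carry and rejects labels that do not fit. The identifiers (`ONTOLOGY`, `validate_label`, and the snake_case class and attribute keys) are hypothetical names derived from the categories listed above. They are not the project's actual schema, and only a subset of the 39+ classes is shown.

```python
# Attribute groups named in the ontology description; keys are illustrative.
PERSON_ATTRIBUTES = frozenset({
    "age_group", "mobility_aid", "functional_role",
    "body_pose", "carrying_items", "distraction",
})

# Class -> allowed attribute names (a partial, assumed layout).
ONTOLOGY = {
    "person": PERSON_ATTRIBUTES,
    "bicycle": frozenset(),
    "motorcycle": frozenset(),
    "road_vehicle": frozenset({"sub_category"}),
    "animal": frozenset({"species"}),
    "track": frozenset(),
    "switch": frozenset(),
    "catenary_pole": frozenset(),
    "signal": frozenset(),
    "buffer_stop": frozenset(),
    "signal_bridge": frozenset(),
    "drag_shoe": frozenset(),
    "reflective_test_object": frozenset(),
}


def validate_label(label):
    """Return a list of error strings; an empty list means the label is valid."""
    cls = label.get("class")
    if cls not in ONTOLOGY:
        return [f"unknown class: {cls!r}"]
    unexpected = sorted(set(label.get("attributes", {})) - ONTOLOGY[cls])
    return [f"attribute {name!r} is not allowed on {cls!r}" for name in unexpected]
```

For example, `validate_label({"class": "track", "attributes": {"species": "deer"}})` reports that `species` is not allowed on `track`, while a `person` label carrying only `body_pose` passes.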
Signal State Classification
Signals were annotated with detailed aspect classification across H/V and Ks signal systems, covering 25+ distinct signal states. Annotators identified exact signal aspects from both front and back views, distinguishing between light and shape signals across multiple German railway signalling systems.
Crowd Threshold & Group Logic
Crowd annotations required a minimum of 6 overlapping persons before annotators switched from individual to group annotation. Groups of bicycles and groups of animals followed the same threshold logic, with consistent pose and species rules within each group. This kept high-density scenes efficient to label without sacrificing detail.
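The threshold rule can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the project's actual procedure: it treats "overlapping" as intersecting 2D boxes, groups persons into clusters linked by chains of overlaps, and replaces any cluster of 6 or more with a single enclosing crowd box. The `Box` type and `split_crowds` function are hypothetical names.

```python
from dataclasses import dataclass

CROWD_THRESHOLD = 6  # minimum overlapping persons, per the rule above


@dataclass(frozen=True)
class Box:
    x1: float
    y1: float
    x2: float
    y2: float


def overlaps(a: Box, b: Box) -> bool:
    """True when the two boxes share a region of non-zero area."""
    return a.x1 < b.x2 and b.x1 < a.x2 and a.y1 < b.y2 and b.y1 < a.y2


def split_crowds(persons, threshold=CROWD_THRESHOLD):
    """Return (individual_boxes, crowd_boxes) for a list of person boxes."""
    unvisited = set(range(len(persons)))
    individuals, crowds = [], []
    while unvisited:
        start = min(unvisited)
        unvisited.remove(start)
        stack, cluster = [start], []
        # Flood-fill one cluster of boxes connected by overlaps.
        while stack:
            i = stack.pop()
            cluster.append(i)
            linked = [j for j in unvisited if overlaps(persons[i], persons[j])]
            unvisited.difference_update(linked)
            stack.extend(linked)
        members = [persons[i] for i in sorted(cluster)]
        if len(members) >= threshold:
            # Large cluster: emit one enclosing crowd box.
            crowds.append(Box(
                min(b.x1 for b in members), min(b.y1 for b in members),
                max(b.x2 for b in members), max(b.y2 for b in members),
            ))
        else:
            # Below the threshold: keep each person as an individual label.
            individuals.extend(members)
    return individuals, crowds
```

The same function could be applied to bicycles or animals by passing their boxes instead, since the rule states that those groups follow the same threshold.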
Continuous Track Annotation
Tracks were annotated continuously even beneath train or wagon objects, with switch areas and buffer stops mapped precisely. This continuity layer is essential for downstream rail path-planning and obstacle detection models.
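A simple quality check for this continuity requirement is to look for breaks between consecutive polyline segments of the same track. The sketch below is an assumption-laden illustration rather than the project's actual QA step: it presumes segments are already ordered along the track, that points are 2D coordinates, and that a hypothetical `tolerance` defines how far apart two segment ends may be before the gap counts as a break.

```python
import math


def find_gaps(segments, tolerance=1.0):
    """Return (segment_index, gap_length) for each break in a track.

    `segments` is an ordered list of polylines, each a list of (x, y) points.
    A break is reported when the end of one segment lies farther than
    `tolerance` from the start of the next.
    """
    gaps = []
    for i in range(len(segments) - 1):
        end = segments[i][-1]
        start = segments[i + 1][0]
        gap = math.dist(end, start)
        if gap > tolerance:
            gaps.append((i, gap))
    return gaps
```

A track that was interrupted where a wagon covers it would show up as a gap roughly the length of the wagon, which makes such omissions easy to flag for re-annotation.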
Project Scope
| Dataset Type | Modalities | Classes | Label Types | Signal States | Systems |
|---|---|---|---|---|---|
| Rail environment perception | 2D + 3D LiDAR | 39+ | 8 (incl. 2D/3D bounding boxes, cuboids, polylines, polygons) | 25+ | H/V, Ks |
Outcomes
- Established a multi-modal 2D + 3D LiDAR annotation pipeline for autonomous train perception
- Standardized a 39+ object class ontology spanning people, vehicles, animals, and rail infrastructure
- Delivered 25+ signal state classifications across the German H/V and Ks signal systems
- Implemented crowd threshold logic and continuous track annotation beneath wagons
- Enabled the client’s autonomous train, safety, and obstacle detection AI roadmap
Overall, Shaip helped transform a multi-modal rail perception requirement into a structured, production-ready annotation pipeline — one capable of supporting autonomous train operation, railway safety AI, and signal recognition systems with cross-validated 2D-3D spatial consistency at scale.
Shaip handled annotation complexity that most vendors won’t touch. 39+ classes, 8 label types, 25+ signal states, 2D + 3D cross-validation — and they delivered it as a single coordinated pipeline. Our rail perception models trained faster as a result.
– VP, Autonomous Rail Systems