Definition
Semantic segmentation is the computer vision task of classifying every pixel in an image into a category, such as road, building, or pedestrian.
Purpose
The purpose is to provide detailed scene understanding for AI applications in self-driving, medical imaging, and robotics.
Importance
- Essential for pixel-level perception in safety-critical systems.
- Enables precise object boundaries compared to bounding boxes.
- Requires large annotated datasets.
- Computationally intensive at high resolutions.
How It Works
- Collect and label pixel-level annotated images.
- Train deep learning models like fully convolutional networks.
- Input image is processed into pixel-level predictions.
- Output mask assigns each pixel to a class.
- Evaluate with metrics like Intersection over Union (IoU).
Examples (Real World)
- Cityscapes dataset: semantic segmentation for urban scenes.
- Tesla Autopilot: pixel-level segmentation for road navigation.
- Medical imaging: segmenting tumors in MRI scans.
References / Further Reading
- Long et al. “Fully Convolutional Networks for Semantic Segmentation.” CVPR 2015.
- Cityscapes Dataset.
- IEEE Transactions on Medical Imaging.