Vehicle Detection & Tracking.
A real-time multi-object tracking pipeline built around YOLOv8n + ByteTrack, and the project that earned First Class Honours.
Detecting a car in a single frame is the easy half. Following that same car across hundreds of frames — through occlusions, light changes, and other vehicles crossing its path — is where most pipelines fall over.
Research question
Could a lightweight detector (YOLOv8n) paired with a tracking-by-detection algorithm (ByteTrack) deliver real-time, identity-stable vehicle tracking on consumer-grade hardware? The honours thesis tested that question on traffic-flow footage and benchmarked the result against heavier two-stage detectors.
Pipeline
- Frame ingest: traffic-flow footage was sampled at 30 fps and resized to a standard resolution.
- Detection: YOLOv8n, fine-tuned on a curated subset of vehicle classes, produced bounding boxes for each frame.
- Tracking: ByteTrack associated detections across frames, including the low-confidence boxes that earlier trackers such as SORT discard.
- Evaluation: MOTA, IDF1, and per-class precision/recall were measured against ground-truth annotations.
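The tracking and evaluation steps can be illustrated with a minimal sketch. This is not the thesis code and not the ByteTrack library: it assumes greedy IoU matching in place of ByteTrack's Kalman-filter prediction and Hungarian assignment, and the names `associate`, `greedy_match`, `mota`, and the thresholds `high` and `min_iou` are hypothetical. It shows only the core idea of matching high-confidence boxes first and then giving still-unmatched tracks a second chance against low-confidence boxes, plus the standard MOTA formula.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def greedy_match(tracks: Dict[int, Box], boxes: List[Box], min_iou: float) -> Dict[int, int]:
    """Pair track ids with box indices, best IoU first (assumed simplification)."""
    pairs = sorted(
        ((iou(t_box, box), tid, j)
         for tid, t_box in tracks.items()
         for j, box in enumerate(boxes)),
        reverse=True,
    )
    matches: Dict[int, int] = {}
    used = set()
    for score, tid, j in pairs:
        if score < min_iou:
            break  # every remaining pair overlaps even less
        if tid in matches or j in used:
            continue
        matches[tid] = j
        used.add(j)
    return matches


def associate(
    tracks: Dict[int, Box],
    detections: List[Tuple[Box, float]],
    high: float = 0.5,      # hypothetical confidence split
    min_iou: float = 0.3,   # hypothetical overlap threshold
) -> Dict[int, Box]:
    """Two-pass association. Returns {track_id: matched box}."""
    high_boxes = [box for box, score in detections if score >= high]
    low_boxes = [box for box, score in detections if score < high]

    # Pass 1: confident detections claim tracks first.
    first = greedy_match(tracks, high_boxes, min_iou)
    updated = {tid: high_boxes[j] for tid, j in first.items()}

    # Pass 2: tracks still unmatched may be recovered by low-confidence boxes,
    # e.g. a partially occluded vehicle the detector is unsure about.
    remaining = {tid: box for tid, box in tracks.items() if tid not in first}
    second = greedy_match(remaining, low_boxes, min_iou)
    updated.update({tid: low_boxes[j] for tid, j in second.items()})
    return updated


def mota(false_negatives: int, false_positives: int, id_switches: int, gt_objects: int) -> float:
    """MOTA = 1 - (FN + FP + IDSW) / GT, summed over all frames."""
    if gt_objects <= 0:
        raise ValueError("gt_objects must be positive")
    return 1.0 - (false_negatives + false_positives + id_switches) / gt_objects
```

In this sketch, a tracker that only ran the first pass would lose any track whose detection fell below `high`, which is the failure mode the second pass is meant to reduce.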
Findings
- YOLOv8n + ByteTrack maintained identity stability through brief occlusions where SORT-based pipelines lost the track.
- Throughput remained real-time on a single mid-range GPU — opening the door to on-device deployment for traffic monitoring without cloud round-trips.
- The biggest accuracy gains came not from the model, but from better data curation: aggressive class-balancing and removing duplicate frames lifted recall more than any hyperparameter change.
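The two curation steps in the last finding can be sketched as follows. This is an illustration under stated assumptions, not the thesis's actual procedure: it assumes duplicates are byte-identical frames (near-duplicates would need something like perceptual hashing), it assumes balancing by downsampling every class to the size of the rarest one, and the function names are hypothetical.

```python
import hashlib
import random
from collections import defaultdict
from typing import Dict, List, Tuple


def drop_duplicate_frames(frames: List[Tuple[str, bytes]]) -> List[Tuple[str, bytes]]:
    """Keep the first occurrence of each byte-identical frame."""
    seen = set()
    unique = []
    for name, data in frames:
        digest = hashlib.sha256(data).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append((name, data))
    return unique


def balance_by_class(samples: List[Tuple[str, str]], seed: int = 0) -> List[Tuple[str, str]]:
    """Downsample every class to the size of the rarest class."""
    by_class: Dict[str, List[str]] = defaultdict(list)
    for item, label in samples:
        by_class[label].append(item)
    if not by_class:
        return []
    target = min(len(items) for items in by_class.values())
    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    balanced = []
    for label in sorted(by_class):
        for item in rng.sample(by_class[label], target):
            balanced.append((item, label))
    return balanced
```

Downsampling discards data, so on a small dataset oversampling or class-weighted loss may be the better trade-off; the sketch only shows the simplest form of balancing.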
What I learned
The thesis was a study in scope discipline. There’s always a more elaborate model to try, a fancier metric to add. The work that earned the grade was the boring half: clean labels, honest evaluation, and a write-up a reader could follow without a PhD. That’s exactly the muscle a data analyst uses every day.
What’s next
I’m exploring how the same pipeline could power a pedestrian-safety dashboard for a small council, identifying near-miss intersections from existing CCTV without ever storing personally identifiable footage.