A Novel Hybrid Deep Learning-Probabilistic Framework for Real-Time Crash Detection from Monocular Traffic Video
Tarih
Yazarlar
Dergi Başlığı
Dergi ISSN
Cilt Başlığı
Yayıncı
Erişim Hakkı
Özet
The rapid evolution of autonomous vehicle technologies has amplified the need for crash detection that operates robustly under complex traffic conditions with minimal latency. We propose a hybrid temporal hierarchy that augments a Region-based Convolutional Neural Network (R-CNN) with an adaptive time-variant Kalman filter (with total-variation prior), a Hidden Markov Model (HMM) for state stabilization, and a lightweight Artificial Neural Network (ANN) for learned temporal refinement, enabling real-time crash detection from monocular video. Evaluated on simulated traffic in CARLA and real-world driving in Istanbul, the full temporal stack achieves the best precision-recall balance, yielding 83.47% F1 offline and 82.57% in real time (corresponding to 94.5% and 91.2% detection accuracy, respectively). Ablations are consistent and interpretable: removing the HMM reduces F1 by 1.85-2.16 percentage points (pp), whereas removing the ANN has a larger impact of 2.94-4.58 pp, indicating that the ANN provides the largest marginal gains-especially under real-time constraints. The transition from offline to real time incurs a modest overall loss (-0.90 pp F1), driven more by recall than precision. Compared to strong single-frame baselines, YOLOv10 attains 82.16% F1 and a real-time Transformer detector reaches 82.41% F1, while our full temporal stack remains slightly ahead in real time and offers a more favorable precision-recall trade-off. Notably, integrating the ANN into the HMM-based pipeline improves accuracy by 2.2%, while the time-variant Kalman configuration reduces detection lag by approximately 0.5 s-an improvement that directly addresses the human reaction time gap. Under identical conditions, the best RCNN-based configuration yields AP@0.50 approximate to 0.79 with an end-to-end latency of 119 +/- 21 ms per frame (similar to 8-9 FPS). Overall, coupling deep learning with probabilistic reasoning yields additive temporal benefits and advances deployable, camera-only crash detection that is cost-efficient and scalable for intelligent transportation systems.












