Spatial RAG for Urban Crash Hotspot Discovery and Safety Countermeasure Recommendation

Authors

  • Ziliang Samuel Zhong, New York University, NY, USA
  • Long Zhang, Transportation Systems Engineering, Southern Methodist University, TX, USA
  • David Ma, Software Engineering, UC Irvine, CA, USA

DOI:

https://doi.org/10.69987/JACS.2023.31206

Keywords:

urban crash analysis, hotspot discovery, STATS19, spatial clustering, BIRCH, DBSCAN, kernel density estimation, retrieval-augmented generation, safety countermeasures, traffic exposure

Abstract

Urban crash screening systems often identify high-burden places but leave safety analysts to translate those places into treatment concepts. This paper develops a Spatial RAG pipeline for urban crash hotspot discovery and countermeasure recommendation using official 2022 Great Britain road safety and traffic data. The study integrates 106,004 STATS19 collision records, 135,480 casualty records, 193,545 vehicle records, 22,240 AADF count-point rows, 206 local-authority traffic records, and 17,840 MRDB major-road links. The primary spatial benchmark uses the 71,763 geocoded urban injury collisions. National clustering compares BIRCH, MiniBatchKMeans, and DBSCAN on British National Grid coordinates. BIRCH and MiniBatchKMeans both produce 204-center partitions aligned with the number of active highway authorities in the urban sample; MiniBatchKMeans reaches the highest sampled silhouette of 0.460 in 2.94 s, while BIRCH reaches 0.447 in 3.07 s and supplies the hierarchical center structure used for downstream hotspot screening. DBSCAN identifies 679 dense components but leaves 29.9% of crashes as noise and forms a 20,192-crash largest component, making it less suitable as the national partition. Within the ten highest-burden BIRCH centers, Spatial RAG captures 14.87% of held-out severity burden at a 5% cell budget and reaches a mean severity-AUC10 of 0.1405, compared with 11.13% and 0.1093 for KDEGrid. The paired center-wise advantage over KDEGrid is significant with a one-sided Wilcoxon p-value of 0.001953. KDEGrid remains the most stable method at the top-5% budget, with a Jaccard overlap of 0.724. The retrieval layer assigns eight of the top ten hotspots to an urban vulnerable-road-user area-wide package and two to a major-road corridor speed-management package. The results show that severity-aware spatial screening and deterministic retrieval can convert an official crash archive into a transparent first-pass safety planning tool.
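The two screening metrics quoted in the abstract, burden capture at a fixed cell budget and Jaccard overlap of the selected cells, can be illustrated with a minimal sketch. The abstract does not define how cells, scores, or severity weights are built, so everything here is an assumption: the function names, the tie handling, and the toy numbers are hypothetical and are not taken from the paper.

```python
import math


def top_cells(scores, budget):
    """Indices of the highest-scoring cells within a fractional budget.

    Assumption: the budget is a fraction of the number of cells, rounded up,
    with at least one cell always selected.
    """
    k = max(1, math.ceil(budget * len(scores) - 1e-9))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return set(ranked[:k])


def capture_at_budget(scores, heldout_burden, budget):
    """Share of held-out burden that falls inside the selected cells."""
    total = sum(heldout_burden)
    if total == 0:
        return 0.0
    chosen = top_cells(scores, budget)
    return sum(heldout_burden[i] for i in chosen) / total


def jaccard(a, b):
    """Jaccard overlap of two cell sets (defined as 1.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Toy data (hypothetical): four grid cells, a 50% cell budget.
scores = [0.9, 0.1, 0.5, 0.3]      # screening scores fitted on training crashes
heldout_burden = [3, 1, 0, 4]      # severity-weighted burden in held-out crashes

selected = top_cells(scores, 0.5)
share = capture_at_budget(scores, heldout_burden, 0.5)
overlap = jaccard(selected, {0, 3})  # compare with a second, hypothetical run
```

Under this reading, a method that captures 14.87% of held-out burden at a 5% budget concentrates roughly three times the burden that an equal share of randomly chosen cells would hold, and a higher Jaccard overlap means the selected cells change less between runs.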

Published

2023-12-19

How to Cite

Ziliang Samuel Zhong, Long Zhang, & David Ma. (2023). Spatial RAG for Urban Crash Hotspot Discovery and Safety Countermeasure Recommendation. Journal of Advanced Computing Systems, 3(12), 45-61. https://doi.org/10.69987/JACS.2023.31206
