This project combines collision, traffic, road-network, and contextual datasets to estimate road risk relative to exposure. Each source has a distinct role: some define the road network, some provide observed outcomes, and others provide traffic or contextual predictors used by the Open Road Risk modelling approach.
How the sources fit together
| Source | Main role | Used for | Key limitation |
|---|---|---|---|
| STATS19 collision data | Collision outcome data | Observed injury collisions used as the response variable in the collision model | Police-reported injury collisions only; excludes damage-only incidents |
| OS Open Roads network data | Road network geometry | Base road-link network for snapping, feature generation, and scoring | Simplified representation of real-world road layout |
| AADF traffic counts | Observed traffic counts | Training data for estimating annual average daily traffic where direct counts are unavailable | Sparse coverage, especially away from major roads |
| WebTRIS sensor data | Traffic timing and vehicle mix | Time-of-day profiles, traffic composition, and support for exposure features | Biased toward National Highways / major-road sensors |
| Network Model GDB | Scoped SRN network reference | Source notes and candidate structural features for motorway and trunk A-road links | SRN-only; not the all-road scoring backbone |
| OpenStreetMap | Supplementary road attributes | Additional tags such as bridge/tunnel or road context where useful | Uneven coverage and tagging consistency |
| OS Terrain 50 | Terrain context | Link-level grade features derived from the 50 m elevation grid | Bare-earth DTM; no structure correction currently applied, so grade on bridges/tunnels/slip roads may be physically wrong |
| ONS LSOA population estimates | Area context | Population-density features around road links | Area-level proxy, not direct road use |
| ONS 2021 Rural-Urban Classification | Area context | Urban/rural context joined at LSOA level | LSOA-level classification, not road-specific land use |
| English Indices of Deprivation 2025 | Deprivation context | IMD and domain deciles joined at LSOA level | England-only; only the Indoors sub-domain is used, because the Outdoors sub-domain includes road-accident indicators and would leak the outcome into the predictors; contextual, not causal |
Source roles in the model
Outcome source
- STATS19 collision data provides the observed collision outcome.
Network source
- OS Open Roads network data defines the scored road-link network.
- Network Model GDB is retained as a scoped Strategic Road Network reference and candidate structural-feature source, not as the all-road backbone.
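
As a rough illustration of what snapping a collision to the road-link network involves, the sketch below assigns a point to the nearest link by projecting it onto each segment of each link's polyline. It is a minimal sketch, not the project's implementation: the function name `snap_to_link`, the link IDs, and the coordinates are hypothetical, and coordinates are assumed to be in a projected CRS in metres (for example British National Grid eastings and northings). A production pipeline would more likely use a spatial index than this brute-force loop.

```python
import math


def snap_to_link(point, links):
    """Return (link_id, distance_m) for the link nearest to `point`.

    point: (x, y) in a projected CRS, in metres (assumption).
    links: dict mapping link_id -> list of (x, y) polyline vertices.
    """
    px, py = point
    best_id, best_dist = None, math.inf
    for link_id, coords in links.items():
        # Walk consecutive vertex pairs, i.e. the segments of the polyline.
        for (x1, y1), (x2, y2) in zip(coords, coords[1:]):
            dx, dy = x2 - x1, y2 - y1
            seg_len_sq = dx * dx + dy * dy
            if seg_len_sq == 0:
                t = 0.0  # degenerate segment: treat it as a single point
            else:
                # Fraction along the segment of the closest point, clamped to [0, 1].
                t = ((px - x1) * dx + (py - y1) * dy) / seg_len_sq
                t = max(0.0, min(1.0, t))
            dist = math.hypot(px - (x1 + t * dx), py - (y1 + t * dy))
            if dist < best_dist:
                best_id, best_dist = link_id, dist
    return best_id, best_dist


# Hypothetical toy network: two parallel links 50 m apart.
links = {
    "link_a": [(0.0, 0.0), (100.0, 0.0)],
    "link_b": [(0.0, 50.0), (100.0, 50.0)],
}
collision = (40.0, 10.0)
link_id, dist = snap_to_link(collision, links)
```

Returning the snap distance alongside the link ID makes it possible to apply a maximum-distance cutoff, so that collisions far from any mapped link can be flagged rather than forced onto the network.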
Exposure sources
- AADF traffic counts provide direct traffic-count training examples.
- WebTRIS sensor data provides time-profile and vehicle-mix information.
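
To make "risk relative to exposure" concrete, the sketch below converts an annual average daily traffic (AADT) value and a link length into vehicle-kilometres, then expresses a collision count as a rate per million vehicle-kilometres. This is one common way to express exposure and is an assumption here, not a statement of how the project's collision model is specified; the function names and the numbers are hypothetical.

```python
def vehicle_km(aadt, length_km, years=1.0):
    """Vehicle-kilometres travelled on a link over `years` years.

    Assumes AADT is a daily average and uses a 365-day year.
    """
    return aadt * length_km * 365.0 * years


def collisions_per_million_vkm(collisions, aadt, length_km, years=1.0):
    """Collision rate per million vehicle-km, or None if exposure is not positive."""
    exposure = vehicle_km(aadt, length_km, years)
    if exposure <= 0:
        return None
    return collisions / exposure * 1_000_000


# Hypothetical link: 3 collisions over 5 years, 10,000 vehicles/day, 0.5 km long.
exposure = vehicle_km(aadt=10_000, length_km=0.5, years=5)
rate = collisions_per_million_vkm(collisions=3, aadt=10_000, length_km=0.5, years=5)
```

Returning `None` for zero or negative exposure avoids a division error and keeps links without a usable traffic estimate visibly distinct from links with a genuinely low rate.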
Context sources
- OpenStreetMap, OS Terrain 50, ONS population estimates, ONS Rural-Urban Classification, and English Indices of Deprivation 2025 provide supporting road and local-context features.
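
The LSOA-level context sources can be thought of as lookups keyed on LSOA code. The sketch below attaches population density and a deprivation decile to each link, leaving the decile empty for links outside England because the deprivation data is England-only. It is a minimal sketch under stated assumptions: the field names, the function `attach_lsoa_context`, and all values are made up for illustration, and each link is assumed to already carry a single `lsoa_code`.

```python
def attach_lsoa_context(links, pop_density, imd_decile):
    """Return copies of `links` with LSOA-level context attached.

    links: list of dicts, each with an "lsoa_code" key (assumption).
    pop_density, imd_decile: dicts keyed on LSOA code.
    Missing lookups become None rather than raising.
    """
    enriched = []
    for link in links:
        code = link["lsoa_code"]
        enriched.append({
            **link,
            "pop_density": pop_density.get(code),
            # English LSOA codes start with "E"; others have no decile.
            "imd_decile": imd_decile.get(code) if code.startswith("E") else None,
        })
    return enriched


# Illustrative values only; these are not real statistics.
links = [
    {"link_id": "link_a", "lsoa_code": "E01000001"},
    {"link_id": "link_b", "lsoa_code": "W01000001"},
]
pop_density = {"E01000001": 5200.0, "W01000001": 310.0}
imd_decile = {"E01000001": 7}

rows = attach_lsoa_context(links, pop_density, imd_decile)
```

Keeping the missing decile as `None` rather than filling a default makes the England-only coverage explicit to whatever consumes the features downstream.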
Detailed pages
- STATS19
- OS Open Roads
- AADF Traffic Counts
- WebTRIS Sensors
- Network Model GDB
- OS Terrain 50
- English Indices of Deprivation