Open Road Risk


Data sources

This project combines collision, traffic, road-network, and contextual datasets to estimate road risk relative to exposure. Each source has a distinct role: some define the road network, some provide observed outcomes, and others provide traffic or contextual predictors used by the modelling pipeline.
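
As a rough illustration of what "risk relative to exposure" can mean, the sketch below normalises a collision count by vehicle-kilometres travelled. The function name, the per-million-vehicle-km unit, and the example numbers are assumptions made for illustration; the project's actual metric may be defined differently.

```python
def collisions_per_million_vehicle_km(collisions, aadt, length_km, years=1.0):
    """Collision rate normalised by exposure (vehicle-kilometres travelled).

    Hypothetical helper: the names and units here are illustrative only.
    """
    # Exposure = vehicles per day * days per year * years observed * link length (km).
    vehicle_km = aadt * 365.0 * years * length_km
    if vehicle_km <= 0:
        raise ValueError("exposure must be positive")
    return collisions / (vehicle_km / 1_000_000)


# Two made-up links with the same collision count but different traffic volumes.
busy = collisions_per_million_vehicle_km(collisions=4, aadt=20_000, length_km=1.0, years=5)
quiet = collisions_per_million_vehicle_km(collisions=4, aadt=2_000, length_km=1.0, years=5)
```

With equal collision counts, the quieter link has the higher rate, which is why raw counts alone are not a good measure of risk.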

How the sources fit together

| Source | Main role | Used for | Key limitation |
|---|---|---|---|
| STATS19 | Collision outcome data | Observed injury collisions used as the response variable in the collision model | Police-reported injury collisions only; excludes damage-only incidents |
| OS Open Roads | Road network geometry | Base road-link network for snapping, feature generation, and scoring | Simplified representation of real-world road layout |
| AADF | Observed traffic counts | Training data for estimating annual average daily traffic where direct counts are unavailable | Sparse coverage, especially away from major roads |
| WebTRIS | Traffic timing and vehicle mix | Time-of-day profiles, traffic composition, and support for exposure features | Biased toward National Highways / major-road sensors |
| OpenStreetMap | Supplementary road attributes | Additional tags such as bridge/tunnel or road context where useful | Uneven coverage and tagging consistency |
| MRDB | Major-road reference data | Major-road structure and classification support | Mainly useful for major-road context |
| LSOA / population data | Area context | Population-density and local-area contextual features | Area-level proxy, not direct road use |

Source roles in the model

Outcome source

  • STATS19 provides the observed collision outcome.

Network source

  • OS Open Roads defines the scored road-link network.

Exposure sources

  • AADF provides direct traffic-count training examples.
  • WebTRIS provides time-profile and vehicle-mix information.

Context sources

  • OpenStreetMap (OSM), MRDB, and area-level population data provide supporting road and local-context features.
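
The role grouping above can be sketched as a small registry. This is a minimal illustration, not the project's code: the `Source` class, the role labels, and `sources_by_role` are hypothetical names, and the limitation strings are shortened from the summary on this page.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    """One input dataset and the role it plays (hypothetical structure)."""
    name: str
    role: str          # one of: "outcome", "network", "exposure", "context"
    limitation: str


SOURCES = [
    Source("STATS19", "outcome", "Police-reported injury collisions only"),
    Source("OS Open Roads", "network", "Simplified representation of road layout"),
    Source("AADF", "exposure", "Sparse coverage away from major roads"),
    Source("WebTRIS", "exposure", "Biased toward major-road sensors"),
    Source("OpenStreetMap", "context", "Uneven coverage and tagging consistency"),
    Source("MRDB", "context", "Mainly useful for major-road context"),
    Source("LSOA / population data", "context", "Area-level proxy, not direct road use"),
]


def sources_by_role(sources):
    """Group source names by role, preserving the input order within each role."""
    grouped = defaultdict(list)
    for source in sources:
        grouped[source.role].append(source.name)
    return dict(grouped)


ROLES = sources_by_role(SOURCES)
```

Keeping the limitation next to each source makes it easy to surface caveats wherever a source feeds a feature or a model stage.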

Detailed pages

  • STATS19
  • OS Open Roads
  • AADF Traffic Counts
  • WebTRIS Sensors
  • Network Model GDB

Next: How the sources are joined
