The Network Model File Geodatabase provides a link-node representation of the road network, with related tables for lanes, roads, streets, junctions, access restrictions, vehicle restrictions, and turn restrictions.
In this project it is most useful as a source of network structure and road-infrastructure exposure. It can support structural risk proxies such as junction complexity, carriageway type, lane count, and restriction density. Observed traffic, collision, speed, and temporal-flow concepts should be checked explicitly before they are treated as available model inputs.
Important
Use this dataset as a road supply and network structure source. For calibrated road risk, join it to observed exposure datasets such as AADF or WebTRIS and to collision data such as STATS19.
2 Role In The Pipeline
Provides link-level road geometry and attributes for road exposure features.
Provides lane, junction, and restriction tables for structural risk features.
Supports completeness checks against OS Open Roads, MRDB, and modelled road links.
Can be aggregated by year, geography, road class, ownership, or operational status.
Helps distinguish structural exposure from measured or estimated traffic exposure.
Recommended modelling grain:
one row per Link.linkid
The Link layer should be treated as the spine of the feature table. Related tables can be joined or aggregated to linkid.
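As a minimal sketch of aggregating a related table to linkid before it is attached to the spine, the example below collapses a many-to-many reference table to one count per link. The records, the `restriction_id` column, and the helper names are hypothetical and are not taken from the GDB schema. It is written without pandas so it runs on its own.

```python
from collections import defaultdict

# Hypothetical spine: one record per linkid (IDs shortened for illustration).
links = [
    {"linkid": "L1", "linkcategory": "M"},
    {"linkid": "L2", "linkcategory": "A"},
    {"linkid": "L3", "linkcategory": "A"},
]

# Hypothetical reference rows: a link can appear several times, and the same
# restriction can be referenced more than once for the same link.
restriction_refs = [
    {"linkid": "L1", "restriction_id": "T1"},
    {"linkid": "L1", "restriction_id": "T2"},
    {"linkid": "L1", "restriction_id": "T2"},  # duplicate reference
    {"linkid": "L2", "restriction_id": "T3"},
]


def count_distinct_by_link(rows, key="linkid", value="restriction_id"):
    """Collapse a many-to-many reference table to one distinct count per link."""
    seen = defaultdict(set)
    for row in rows:
        seen[row[key]].add(row[value])
    return {link_id: len(values) for link_id, values in seen.items()}


def attach_to_spine(spine, counts, name):
    """Left-join an aggregated count onto the spine; unmatched links get 0."""
    out = [{**row, name: counts.get(row["linkid"], 0)} for row in spine]
    # The spine must still hold exactly one row per linkid after the join.
    assert len(out) == len({row["linkid"] for row in out}) == len(spine)
    return out


turn_counts = count_distinct_by_link(restriction_refs)
features = attach_to_spine(links, turn_counts, "turn_restriction_count")
```

Aggregating first and joining second keeps the feature table at one row per linkid. Joining the raw reference rows directly would repeat L1 three times.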
3 Load And Inspect
Set gdb_path to the local File Geodatabase directory. The project root is resolved from road_risk.config so the notebook works whether it is run from the repository root or from quarto/data-sources.
from pathlib import Path

import fiona
import geopandas as gpd
import pandas as pd

from road_risk.config import _ROOT

pd.set_option("display.max_columns", 80)

gdb_path = _ROOT / (
    "data/raw/Network_Model_(Public_Download)/"
    "d77ab8dc-afaa-4475-af6f-7dfdaea5b135.gdb"
)
4 Source Documentation
Official source page: Network Model Public, National Highways on data.gov.uk. The source page describes the dataset as representing England’s Strategic Road Network and notes that speed limit and smart motorway information were removed from the initial release pending validation.
5 Coverage Summary
The coverage summary below is computed from the Link layer. It is intended to answer the first modelling question: whether this dataset is a full road-network source or a specialist source for a subset of roads.
print(
    "Interpretation: this GDB is best treated as an authoritative National "
    "Highways / Strategic Road Network source. It is not a full all-road "
    "network. In this extract, coverage is overwhelmingly motorway and "
    "trunk A-road links managed by National Highways. It should therefore "
    "be integrated into the project as a facility-family-conditional source: "
    "rich SRN features for SRN links, not imputed pseudo-coverage for the "
    "wider local-road network."
)
Interpretation: this GDB is best treated as an authoritative National Highways / Strategic Road Network source. It is not a full all-road network. In this extract, coverage is overwhelmingly motorway and trunk A-road links managed by National Highways. It should therefore be integrated into the project as a facility-family-conditional source: rich SRN features for SRN links, not imputed pseudo-coverage for the wider local-road network.
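The cell that derives the coverage percentages is not shown on this page. As a hedged sketch of the arithmetic, the example below recomputes the two shares from the linkcategory and srn row counts reported on this page. The names `motorway_or_a_pct` and `srn_y_pct` match those used in the constraints code, but how the notebook actually computes them is an assumption.

```python
def pct(part: int, total: int) -> float:
    """Percentage share, guarding against an empty layer."""
    return 100.0 * part / total if total else 0.0


# Row counts copied from the linkcategory and srn value counts on this page.
linkcategory_rows = {"A": 28844, "M": 14081, "U": 22, "B": 13}
srn_rows = {"Y": 42936, "N": 24}

total_links = sum(linkcategory_rows.values())
motorway_or_a_pct = pct(linkcategory_rows["A"] + linkcategory_rows["M"], total_links)
srn_y_pct = pct(srn_rows["Y"], sum(srn_rows.values()))

print(f"{motorway_or_a_pct:.1f}% motorway or A-road; {srn_y_pct:.1f}% srn = Y")
# prints: 99.9% motorway or A-road; 99.9% srn = Y
```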
non_srn_links = links_for_coverage[
    links_for_coverage["srn"].fillna("<missing>").astype(str).ne("Y")
].copy()

if non_srn_links.empty:
    print("No non-SRN links found in the Link layer.")
else:
    cols = [
        "linkid",
        "roadname",
        "linkcategory",
        "linkform",
        "carriageway",
        "ownership",
        "srn",
        "operationalstate",
        "SHAPE_Length",
    ]
    display(
        non_srn_links[cols]
        .sort_values(["roadname", "linkcategory", "linkform"])
        .reset_index(drop=True)
    )
|   | linkid | roadname | linkcategory | linkform | carriageway | ownership | srn | operationalstate | SHAPE_Length |
|---|---|---|---|---|---|---|---|---|---|
| 0 | {AA2D189D-A767-5DAB-DA16-03BE79C5320A} | A12 | A | SL | M | NH | N | O | 447.799507 |
| 1 | {AA2B6DE3-FF0A-1F5B-5D55-B86E908615CA} | A2270 | A | SC | A | NH | N | O | 102.007383 |
| 2 | {F8F74D61-2FCB-48F7-8E72-DC9A481C686A} | A23 | A | DC | A | NH | N | O | 126.257730 |
| 3 | {92946B85-E28B-0346-27D5-AC4BD2A58234} | A293 | A | SC | A | NH | N | O | 19.971946 |
| 4 | {412F7E8F-1A83-1F7B-875F-628E4378D1EE} | A293 | A | SC | A | NH | N | O | 16.548472 |
| 5 | {8C55DC9B-C39B-4F17-EB98-0ADD99E294B9} | A293 | A | SC | A | NH | N | O | 24.599465 |
| 6 | {2BBF8039-DACD-4C37-E2D6-949AF00FF0EF} | A303 | A | SL | K | NH | N | O | 166.813493 |
| 7 | {F7024204-FB1E-22C9-4C13-CAB742C05014} | A47 | A | SL | J | NH | N | O | 236.004304 |
| 8 | {7A9381E0-9ABC-F309-49F9-35F74DACC01F} | A66 | A | SL | L | NH | N | O | 129.248274 |
| 9 | {0E7AE448-0182-154B-C220-A1402509AACB} | A66 | A | SL | J | NH | N | O | 112.563493 |
| 10 | {A1F5A0EC-2778-BE11-94BC-0FA3B7103116} | M1 | U | SR | X | NH | N | O | 1372.397979 |
| 11 | {E61D80B1-6A39-0EE3-C245-C68DABC0E78E} | M1 | U | SR | X | NH | N | O | 48.496226 |
| 12 | {2292C08B-BF4D-0D29-2C73-3ACA94492933} | M1 | U | SR | X | NH | N | O | 1418.420468 |
| 13 | {D3360D8B-F3B5-2F8C-618C-B0CC4E1F38D6} | M25 | M | SR | X | NH | N | O | 68.832008 |
| 14 | {978452EA-3164-C8C1-01EA-911C8DFD01A6} | M25 | M | SR | X | NH | N | O | 63.410833 |
| 15 | {C6BC13A2-8CFA-10D4-5D87-505A87698952} | M25 | M | SR | X | NH | N | O | 44.360923 |
| 16 | {EC0C811A-C2EC-C46C-4DCF-9CEE829AB7AA} | M4 | M | SL | L | NH | N | O | 298.064144 |
| 17 | {6381DEAE-C210-741F-C2C1-36690AF6BCD0} | M4 | M | SL | L | NH | N | O | 439.638495 |
| 18 | {78F6A432-5A91-464A-26B5-44AE28098215} | M4 | M | SL | K | NH | N | O | 362.996218 |
| 19 | {049BD5B0-4E5B-B8A0-D9B9-3F9EFDC34E69} | M4 | M | SL | J | NH | N | O | 337.686443 |
| 20 | {0B63E859-6260-8091-C73B-03CA5A048653} | Unclassified Unnamed Road | U | SL | J | NH | N | O | 39.363534 |
| 21 | {785C183B-D42D-BBCE-EE74-F4A06104B633} | Unclassified Unnamed Road | U | SL | K | NH | N | O | 63.440015 |
| 22 | {DCF08C31-FD10-AACD-7010-24EAC0745756} | Unclassified unnamed road | U | DC | A | NH | N | O | 60.217838 |
| 23 | {9E61B238-6CFF-A1C2-0D5E-91E8C0CCC83C} | Unclassified unnamed road | U | DC | B | NH | N | O | 76.430320 |
Note
For this project, the key integration consequence is that the Network Model GDB does not replace the all-road backbone from OS Open Roads or OSM. It enriches the SRN subset with more authoritative geometry, lane, carriageway, grade-separation, and restriction features.
Tip
For exposure features such as lane_km, remember that the link geometry already contains separated carriageways, slip roads, junction arms, and directional splits. Avoid applying a second manual two-direction multiplier unless the feature definition explicitly requires it.
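A minimal sketch of the lane_km calculation under that assumption: each direction is already its own link, so the feature is simply length in kilometres times lane count, summed over links. The records are hypothetical, and `length_m` is assumed to be a ground length in metres, which is not necessarily what SHAPE_Length holds because that field is in CRS units.

```python
def lane_km(length_m: float, number_of_lanes: int) -> float:
    """Lane-kilometres for one link: length in km times lane count.

    No two-direction multiplier is applied, because opposing carriageways
    are assumed to be separate links in the source geometry.
    """
    return (length_m / 1000.0) * number_of_lanes


# Hypothetical links: the two directions of a dual carriageway appear as
# two separate records, each with its own lane count.
links = [
    {"linkid": "northbound", "length_m": 1200.0, "numberoflanes": 3},
    {"linkid": "southbound", "length_m": 1200.0, "numberoflanes": 3},
    {"linkid": "slip", "length_m": 400.0, "numberoflanes": 1},
]

total_lane_km = sum(lane_km(l["length_m"], l["numberoflanes"]) for l in links)
```

Here the total is about 7.6 lane-km. Doubling it for "two directions" would count the dual carriageway twice.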
6 Constraints On Use
These checks are deliberately near the top of the page because they affect whether and how this source should be integrated into the modelling pipeline.
def active_in_year(df: pd.DataFrame, year: int) -> pd.Series:
    year_start = pd.Timestamp(year=year, month=1, day=1, tz="UTC")
    year_end = pd.Timestamp(year=year, month=12, day=31, tz="UTC")
    start = pd.to_datetime(df["startdate"], errors="coerce", utc=True)
    end = pd.to_datetime(df["enddate"], errors="coerce", utc=True)
    starts_before_year_end = start.isna() | (start <= year_end)
    ends_after_year_start = end.isna() | (end >= year_start)
    return starts_before_year_end & ends_after_year_start


years = range(2015, 2025)
yearly_validity = []
for year in years:
    active = links_for_coverage[active_in_year(links_for_coverage, year)]
    yearly_validity.append(
        {
            "year": year,
            "active_links_by_validity_dates": len(active),
            "active_link_length_km": active["SHAPE_Length"].sum() / 1000,
        }
    )
yearly_validity = pd.DataFrame(yearly_validity)

with fiona.open(gdb_path, layer="Speed_Limit") as src:
    speed_limit_rows = len(src)

constraints = pd.DataFrame(
    [
        {
            "constraint": "Coverage family",
            "finding": f"{motorway_or_a_pct:.1f}% of links are motorway or A-road; {srn_y_pct:.1f}% have srn = Y.",
            "modelling_implication": "Treat as SRN / trunk-road enrichment, not all-road coverage.",
        },
        {
            "constraint": "Validity-date coverage",
            "finding": (
                f"{(yearly_validity['active_links_by_validity_dates'] == 0).sum()} "
                "model years have zero active links under startdate/enddate."
            ),
            "modelling_implication": "Do not use validity dates as proof of historical availability without source confirmation.",
        },
        {
            "constraint": "Speed limit",
            "finding": f"Speed_Limit rows: {speed_limit_rows:,}.",
            "modelling_implication": "Use another source for speed-limit features if this table is empty.",
        },
        {
            "constraint": "Smart motorway",
            "finding": (
                f"Non-missing smartmotorway values: "
                f"{links_for_coverage['smartmotorway'].notna().sum():,}."
            ),
            "modelling_implication": "Do not create a smart_motorway_flag unless values are populated.",
        },
        {
            "constraint": "Constant fields",
            "finding": (
                f"ownership unique values: {links_for_coverage['ownership'].nunique(dropna=True)}; "
                f"operationalstate unique values: {links_for_coverage['operationalstate'].nunique(dropna=True)}."
            ),
            "modelling_implication": "Constant fields are useful QA signals but not predictive features.",
        },
    ]
)

display(constraints)
display(yearly_validity)
|   | constraint | finding | modelling_implication |
|---|---|---|---|
| 0 | Coverage family | 99.9% of links are motorway or A-road; 99.9% h... | Treat as SRN / trunk-road enrichment, not all-... |
| 1 | Validity-date coverage | 5 model years have zero active links under sta... | Do not use validity dates as proof of historic... |
| 2 | Speed limit | Speed_Limit rows: 0. | Use another source for speed-limit features if... |
| 3 | Smart motorway | Non-missing smartmotorway values: 0. | Do not create a smart_motorway_flag unless val... |
| 4 | Constant fields | ownership unique values: 1; operationalstate u... | Constant fields are useful QA signals but not ... |
|   | year | active_links_by_validity_dates | active_link_length_km |
|---|---|---|---|
| 0 | 2015 | 0 | 0.000000 |
| 1 | 2016 | 0 | 0.000000 |
| 2 | 2017 | 0 | 0.000000 |
| 3 | 2018 | 0 | 0.000000 |
| 4 | 2019 | 0 | 0.000000 |
| 5 | 2020 | 75 | 19.008889 |
| 6 | 2021 | 84 | 21.349183 |
| 7 | 2022 | 41725 | 23531.010355 |
| 8 | 2023 | 41880 | 23576.021013 |
| 9 | 2024 | 42796 | 23855.699972 |
Warning
The startdate and enddate fields do not provide reliable historical coverage for this project’s full 2015-2024 modelling window without additional source confirmation. Treat the GDB as a current SRN structural snapshot unless a separate historical validity method is established.
Important
The clean integration path is facility-family conditional: use Network Model features inside an SRN-specific model or SRN feature branch, and keep non-SRN links on the OS Open Roads / OSM feature set. Imputing these authoritative SRN fields across the full network would create the same kind of coverage bias as sparse OSM-derived features.
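One way to picture the facility-family-conditional path is a feature function that returns Network Model fields only for links flagged srn = Y and leaves them explicitly absent elsewhere. This is a sketch under assumptions: the branch labels, the function name, and the choice of fields are hypothetical, and the records are invented.

```python
def network_model_features(link: dict) -> dict:
    """Return Network Model features only for links flagged as SRN."""
    if link.get("srn") != "Y":
        # Not an SRN link: leave the features absent rather than imputing them.
        return {"feature_branch": "all_road", "numberoflanes": None, "carriageway": None}
    return {
        "feature_branch": "srn",
        "numberoflanes": link.get("numberoflanes"),
        "carriageway": link.get("carriageway"),
    }


# Hypothetical records: one SRN link and one link from the all-road backbone.
links = [
    {"linkid": "srn_link", "srn": "Y", "numberoflanes": 3, "carriageway": "A"},
    {"linkid": "local_link", "srn": None},
]
features = {link["linkid"]: network_model_features(link) for link in links}
```

Keeping the non-SRN values as missing, rather than filling them with an SRN average, makes the coverage gap visible to whichever model consumes the table.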
7 Source Snapshot
The inventory below is generated from the File Geodatabase at render time.
layer_roles = {
    "Node": "Network topology and junction proximity",
    "Link": "Main road segment geometry and attributes",
    "Access_Restriction": "Access restriction descriptions",
    "Vehicle_Restriction": "Vehicle restriction descriptions",
    "Turn_Restriction": "Turn movement restrictions",
    "Street": "Street-level metadata and surface",
    "Lane": "Lane-level widths and offsets",
    "Speed_Limit": "Speed limit records if populated",
    "Access_Restriction_Reference": "Links restrictions to links",
    "Access_Restriction_Inclusion": "Access inclusion details",
    "Access_Restriction_Exemption": "Access exemption details",
    "Junction": "Junction type and naming",
    "Junction_Reference": "Links junctions to nodes",
    "Road": "Road name and classification",
    "Road_Reference": "Links roads to links",
    "Street_Reference": "Links streets to links",
    "Turn_Restriction_Reference": "Links turn restrictions to ordered links",
    "Turn_Restriction_Inclusion": "Turn inclusion details if populated",
    "Turn_Restriction_Exemption": "Turn exemption details",
    "Vehicle_Restriction_Reference": "Links vehicle restrictions to links/nodes",
    "Vehicle_Node_Restriction_Reference": "Vehicle-node restriction references",
    "Vehicle_Restriction_Inclusion": "Vehicle restriction inclusion details",
    "Vehicle_Restriction_Exemption": "Vehicle restriction exemption details",
    "Street_Interest": "Street works interest metadata",
    "Street_Construction": "Street construction metadata",
    "Street_Special_Designation": "Street special designation metadata",
    "Street_Special_Designation_Points": "Point special designations",
    "Street_Special_Designation_Lines": "Line special designations",
    "Street_Special_Designation_Polygons": "Polygon special designations",
}

rows = []
for layer in fiona.listlayers(gdb_path):
    with fiona.open(gdb_path, layer=layer) as src:
        crs = src.crs.to_string() if src.crs else None
        rows.append(
            {
                "layer": layer,
                "rows": len(src),
                "crs": crs,
                "fields": len(src.schema["properties"]),
                "geometry": src.schema.get("geometry"),
                "main_use": layer_roles.get(layer, "Review before use"),
            }
        )

inventory = pd.DataFrame(rows).sort_values(["rows", "layer"], ascending=[False, True])
display(inventory)
|   | layer | rows | crs | fields | geometry | main_use |
|---|---|---|---|---|---|---|
| 9 | Lane | 104840 | None | 15 | None | Lane-level widths and offsets |
| 18 | Street_Reference | 86191 | None | 8 | None | Links streets to links |
| 17 | Road_Reference | 49345 | None | 9 | None | Links roads to links |
| 1 | Link | 42960 | EPSG:3857 | 28 | 3D MultiLineString | Main road segment geometry and attributes |
| 19 | Turn_Restriction_Reference | 39101 | None | 11 | None | Links turn restrictions to ordered links |
| 0 | Node | 37874 | EPSG:3857 | 12 | 3D Point | Network topology and junction proximity |
| 4 | Turn_Restriction | 37474 | EPSG:3857 | 9 | 3D MultiLineString | Turn movement restrictions |
| 5 | Street | 9274 | EPSG:3857 | 17 | 3D MultiLineString | Street-level metadata and surface |
| 15 | Junction_Reference | 6946 | None | 9 | None | Links junctions to nodes |
| 14 | Junction | 762 | None | 11 | None | Junction type and naming |
| 16 | Road | 617 | None | 9 | None | Road name and classification |
| 11 | Access_Restriction_Reference | 246 | None | 11 | None | Links restrictions to links |
| 2 | Access_Restriction | 108 | EPSG:3857 | 9 | 3D Point | Access restriction descriptions |
| 13 | Access_Restriction_Exemption | 40 | None | 9 | None | Access exemption details |
| 12 | Access_Restriction_Inclusion | 40 | None | 9 | None | Access inclusion details |
| 3 | Vehicle_Restriction | 18 | EPSG:3857 | 11 | 3D Point | Vehicle restriction descriptions |
| 22 | Vehicle_Restriction_Reference | 18 | None | 13 | None | Links vehicle restrictions to links/nodes |
| 21 | Turn_Restriction_Exemption | 1 | None | 9 | None | Turn exemption details |
| 10 | Speed_Limit | 0 | None | 17 | None | Speed limit records if populated |
| 27 | Street_Construction | 0 | None | 11 | None | Street construction metadata |
| 26 | Street_Interest | 0 | None | 13 | None | Street works interest metadata |
| 28 | Street_Special_Designation | 0 | None | 13 | None | Street special designation metadata |
| 7 | Street_Special_Designation_Lines | 0 | EPSG:3857 | 15 | 3D MultiLineString | Line special designations |
| 6 | Street_Special_Designation_Points | 0 | EPSG:3857 | 14 | 3D Point | Point special designations |
| 8 | Street_Special_Designation_Polygons | 0 | EPSG:3857 | 16 | 3D MultiPolygon | Polygon special designations |
| 20 | Turn_Restriction_Inclusion | 0 | None | 9 | None | Turn inclusion details if populated |
| 23 | Vehicle_Node_Restriction_Reference | 0 | None | 9 | None | Vehicle-node restriction references |
| 25 | Vehicle_Restriction_Exemption | 0 | None | 9 | None | Vehicle restriction exemption details |
| 24 | Vehicle_Restriction_Inclusion | 0 | None | 9 | None | Vehicle restriction inclusion details |
Note
Layers with zero rows should not be dropped from the documentation entirely. Their presence is useful because it shows that a concept exists in the schema.
empty_layers = inventory.loc[inventory["rows"].eq(0), ["layer", "geometry", "main_use"]]

if empty_layers.empty:
    print("No empty layers in this GDB extract.")
else:
    display(empty_layers)
|   | layer | geometry | main_use |
|---|---|---|---|
| 10 | Speed_Limit | None | Speed limit records if populated |
| 27 | Street_Construction | None | Street construction metadata |
| 26 | Street_Interest | None | Street works interest metadata |
| 28 | Street_Special_Designation | None | Street special designation metadata |
| 7 | Street_Special_Designation_Lines | 3D MultiLineString | Line special designations |
| 6 | Street_Special_Designation_Points | 3D Point | Point special designations |
| 8 | Street_Special_Designation_Polygons | 3D MultiPolygon | Polygon special designations |
| 20 | Turn_Restriction_Inclusion | None | Turn inclusion details if populated |
| 23 | Vehicle_Node_Restriction_Reference | None | Vehicle-node restriction references |
| 25 | Vehicle_Restriction_Exemption | None | Vehicle restriction exemption details |
| 24 | Vehicle_Restriction_Inclusion | None | Vehicle restriction inclusion details |
8 Expected Values
Expected values should be checked from the GDB rather than hard-coded into the model. The useful fields are mostly categorical domains and linkable keys.
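One way to check values from the GDB rather than assume them is to record the domain seen in one extract and compare each new extract against it, so that domain drift is reported instead of silently absorbed. The sketch below assumes that workflow. The helper name and the sample column values are hypothetical, and the expected set is the direction domain reported in the value counts on this page.

```python
def check_domain(observed, expected):
    """Compare observed category values with an expected set."""
    observed_set = {v for v in observed if v is not None}
    return {
        "unexpected": sorted(observed_set - set(expected)),
        "not_observed": sorted(set(expected) - observed_set),
        "missing_rows": sum(v is None for v in observed),
    }


# Expected set taken from the direction value counts on this page.
EXPECTED_DIRECTION = {"N", "S", "E", "W", "CW", "ACW"}

# Hypothetical column values, including one missing and one unexpected code.
direction_values = ["N", "S", "CW", None, "NE"]
report = check_domain(direction_values, EXPECTED_DIRECTION)
```

The report separates three cases that need different handling: new codes, codes that have disappeared, and missing values.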
8.1 Core Link Fields
The Link layer fields and their project roles are generated below from the source schema.
link_field_roles = {
    "linkid": ("Unique link key", "Primary feature table key"),
    "linkref": ("Human-readable link reference", "Diagnostics and matching"),
    "linkcategory": ("Road category", "Exposure and risk stratification"),
    "linkdesc": ("Road description", "Diagnostics"),
    "linkform": ("Physical or functional link form", "Structural risk feature"),
    "directionality": ("One-way / two-way status", "Routing and conflict proxy"),
    "direction": ("Direction description", "Routing and diagnostics"),
    "numberoflanes": ("Lane count", "Exposure and capacity proxy"),
    "smartmotorway": ("Smart motorway flag/category", "Availability check before modelling"),
    "carriageway": ("Carriageway type", "Risk and severity proxy"),
    "ownership": ("Owning authority/operator", "Governance and coverage"),
    "startgradeseparation": ("Start grade-separation level", "Junction and conflict proxy"),
    "endgradeseparation": ("End grade-separation level", "Junction and conflict proxy"),
    "parentlinkref": ("Parent link reference", "De-duplication / hierarchy"),
    "srn": ("Strategic Road Network flag/category", "Major-network segmentation"),
    "startnode": ("From-node key", "Topology"),
    "endnode": ("To-node key", "Topology"),
    "operationalstate": ("Operational status", "Filtering and coverage"),
    "roadname": ("Road name", "Reporting and corridor grouping"),
    "startdate": ("Valid-from date", "Yearly coverage"),
    "enddate": ("Valid-to date", "Yearly coverage"),
    "toid": ("Topographic object identifier", "Cross-dataset matching"),
    "SHAPE_Length": ("Link length in CRS units", "Length exposure"),
}

with fiona.open(gdb_path, layer="Link") as src:
    link_schema = pd.DataFrame(
        [
            {"field": field, "source_type": source_type}
            for field, source_type in src.schema["properties"].items()
        ]
    )

link_roles = pd.DataFrame(
    [
        {"field": field, "expected_role": role, "model_use": model_use}
        for field, (role, model_use) in link_field_roles.items()
    ]
)

display(link_schema.merge(link_roles, on="field", how="left"))
|   | field | source_type | expected_role | model_use |
|---|---|---|---|---|
| 0 | linkid | str | Unique link key | Primary feature table key |
| 1 | linkref | str:255 | Human-readable link reference | Diagnostics and matching |
| 2 | linkcategory | str:255 | Road category | Exposure and risk stratification |
| 3 | linkdesc | str:255 | Road description | Diagnostics |
| 4 | linkform | str:255 | Physical or functional link form | Structural risk feature |
| 5 | directionality | str:255 | One-way / two-way status | Routing and conflict proxy |
| 6 | direction | str:255 | Direction description | Routing and diagnostics |
| 7 | numberoflanes | int32 | Lane count | Exposure and capacity proxy |
| 8 | smartmotorway | str:255 | Smart motorway flag/category | Availability check before modelling |
| 9 | carriageway | str:255 | Carriageway type | Risk and severity proxy |
| 10 | ownership | str:255 | Owning authority/operator | Governance and coverage |
| 11 | startgradeseparation | int16 | Start grade-separation level | Junction and conflict proxy |
| 12 | endgradeseparation | int16 | End grade-separation level | Junction and conflict proxy |
| 13 | parentlinkref | str | Parent link reference | De-duplication / hierarchy |
| 14 | srn | str:255 | Strategic Road Network flag/category | Major-network segmentation |
| 15 | startnode | str | From-node key | Topology |
| 16 | endnode | str | To-node key | Topology |
| 17 | operationalstate | str:255 | Operational status | Filtering and coverage |
| 18 | roadname | str:255 | Road name | Reporting and corridor grouping |
| 19 | startdate | datetime | Valid-from date | Yearly coverage |
| 20 | enddate | datetime | Valid-to date | Yearly coverage |
| 21 | toid | str:255 | Topographic object identifier | Cross-dataset matching |
| 22 | globalid | str | NaN | NaN |
| 23 | created_user | str:255 | NaN | NaN |
| 24 | created_date | datetime | NaN | NaN |
| 25 | last_edited_user | str:255 | NaN | NaN |
| 26 | last_edited_date | datetime | NaN | NaN |
| 27 | SHAPE_Length | float | Link length in CRS units | Length exposure |
Key categorical domains to profile:
linkcategory
linkform
directionality
direction
smartmotorway
carriageway
ownership
srn
operationalstate
8.2 Related Table Fields
Useful related fields are checked against the source schema below.
categorical_fields = [
    "linkcategory",
    "linkform",
    "directionality",
    "direction",
    "smartmotorway",
    "carriageway",
    "ownership",
    "srn",
    "operationalstate",
]

for col in categorical_fields:
    print(f"\n{col}")
    display(
        links[col]
        .fillna("<missing>")
        .astype(str)
        .value_counts(dropna=False)
        .rename_axis(col)
        .reset_index(name="rows")
    )
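linkcategory

|   | linkcategory | rows |
|---|---|---|
| 0 | A | 28844 |
| 1 | M | 14081 |
| 2 | U | 22 |
| 3 | B | 13 |

linkform

|   | linkform | rows |
|---|---|---|
| 0 | DC | 23306 |
| 1 | SL | 7357 |
| 2 | SC | 6813 |
| 3 | R | 4893 |
| 4 | L | 507 |
| 5 | SR | 52 |
| 6 | DL | 30 |
| 7 | EA | 2 |

directionality

|   | directionality | rows |
|---|---|---|
| 0 | 1 | 37240 |
| 1 | 0 | 5720 |

direction

|   | direction | rows |
|---|---|---|
| 0 | N | 10680 |
| 1 | S | 8909 |
| 2 | E | 8681 |
| 3 | W | 7908 |
| 4 | CW | 5856 |
| 5 | ACW | 926 |

smartmotorway

|   | smartmotorway | rows |
|---|---|---|
| 0 | <missing> | 42960 |

carriageway

|   | carriageway | rows |
|---|---|---|
| 0 | A | 16756 |
| 1 | B | 11578 |
| 2 | X | 6770 |
| 3 | L | 2040 |
| 4 | J | 2028 |
| 5 | K | 1902 |
| 6 | M | 1886 |

ownership

|   | ownership | rows |
|---|---|---|
| 0 | NH | 42960 |

srn

|   | srn | rows |
|---|---|---|
| 0 | Y | 42936 |
| 1 | N | 24 |

operationalstate

|   | operationalstate | rows |
|---|---|---|
| 0 | O | 42960 |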
categorical_suitability = []
for col in categorical_fields:
    s = links[col]
    non_missing = s.notna().sum()
    unique_non_missing = s.nunique(dropna=True)
    categorical_suitability.append(
        {
            "field": col,
            "non_missing": non_missing,
            "non_missing_pct": round(100 * non_missing / len(links), 2),
            "unique_non_missing": unique_non_missing,
            "feature_guidance": (
                "drop: empty"
                if non_missing == 0
                else "drop: constant"
                if unique_non_missing <= 1
                else "usable after code meaning is resolved"
            ),
        }
    )

display(pd.DataFrame(categorical_suitability))
|   | field | non_missing | non_missing_pct | unique_non_missing | feature_guidance |
|---|---|---|---|---|---|
| 0 | linkcategory | 42960 | 100.0 | 4 | usable after code meaning is resolved |
| 1 | linkform | 42960 | 100.0 | 8 | usable after code meaning is resolved |
| 2 | directionality | 42960 | 100.0 | 2 | usable after code meaning is resolved |
| 3 | direction | 42960 | 100.0 | 6 | usable after code meaning is resolved |
| 4 | smartmotorway | 0 | 0.0 | 0 | drop: empty |
| 5 | carriageway | 42960 | 100.0 | 7 | usable after code meaning is resolved |
| 6 | ownership | 42960 | 100.0 | 1 | drop: constant |
| 7 | srn | 42960 | 100.0 | 2 | usable after code meaning is resolved |
| 8 | operationalstate | 42960 | 100.0 | 1 | drop: constant |
Warning
Several useful-looking fields are coded domains. Do not treat linkform or carriageway as ordinal or self-explanatory until the National Highways / OS Highways code meanings have been resolved from source documentation or metadata.
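A minimal sketch of treating a coded domain as nominal rather than ordinal: each code becomes its own indicator, so no ordering between codes is implied while their meanings remain unresolved. The code list is the linkform domain from the value counts on this page. The helper name and the `_other` bucket are illustrative assumptions.

```python
# Codes observed in the linkform value counts on this page; their meanings
# are deliberately left unresolved here.
LINKFORM_CODES = ["DC", "SL", "SC", "R", "L", "SR", "DL", "EA"]


def one_hot(code, known_codes=LINKFORM_CODES, prefix="linkform"):
    """Encode a nominal code as indicator columns, with a bucket for surprises."""
    row = {f"{prefix}_{c}": int(code == c) for c in known_codes}
    row[f"{prefix}_other"] = int(code not in known_codes)
    return row


encoded = one_hot("SL")
unexpected = one_hot("ZZ")  # hypothetical code not seen in this extract
```

An integer encoding such as DC = 0, SL = 1, SC = 2 would instead suggest to many models that SC is "more" than DC, which the source does not support.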
10 Yearly Coverage
The GDB has startdate and enddate fields on many layers. These should be interpreted as feature-validity dates, not traffic observation years.
For annual coverage, mark a feature as active in a year if:
startdate is missing or startdate <= 31 December of that year
and
enddate is missing or enddate >= 1 January of that year
This gives a structural-network coverage series by year. It does not replace AADF, WebTRIS, or STATS19 year fields.
coverage_diagnostics = {
    "first_year_with_active_links": (
        yearly_coverage.loc[yearly_coverage["active_links"].gt(0), "year"].min()
        if yearly_coverage["active_links"].gt(0).any()
        else None
    ),
    "max_active_links": yearly_coverage["active_links"].max(),
    "years_with_no_active_links": yearly_coverage.loc[
        yearly_coverage["active_links"].eq(0), "year"
    ].tolist(),
}
display(pd.Series(coverage_diagnostics, name="value").to_frame())

if coverage_diagnostics["years_with_no_active_links"]:
    print(
        "Some modelling years have no active links under the startdate/enddate "
        "validity test. Treat this as source validity-date coverage, not as "
        "proof that the physical road network was absent."
    )
|   | value |
|---|---|
| first_year_with_active_links | 2020 |
| max_active_links | 42796 |
| years_with_no_active_links | [2015, 2016, 2017, 2018, 2019] |
Some modelling years have no active links under the startdate/enddate validity test. Treat this as source validity-date coverage, not as proof that the physical road network was absent.
Additional yearly checks:
active links by operationalstate,
active link length by linkcategory,
links created or retired per year using startdate and enddate,
whether yearly coverage changes materially across the modelling window,
whether link identifiers are stable enough to join to annual exposure outputs.
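The "links created or retired per year" check can be sketched as a count of startdate and enddate values by calendar year. The records below are hypothetical and the helper name is illustrative. Missing dates are skipped rather than assigned to a year.

```python
from collections import Counter
from datetime import date

# Hypothetical validity dates; None means the field is missing.
links = [
    {"linkid": "L1", "startdate": date(2020, 5, 1), "enddate": None},
    {"linkid": "L2", "startdate": date(2022, 1, 10), "enddate": date(2023, 6, 30)},
    {"linkid": "L3", "startdate": date(2022, 3, 2), "enddate": None},
    {"linkid": "L4", "startdate": None, "enddate": None},
]


def events_per_year(rows, field):
    """Count rows by the calendar year of a date field, skipping missing values."""
    return Counter(row[field].year for row in rows if row[field] is not None)


created = events_per_year(links, "startdate")
retired = events_per_year(links, "enddate")

for year in range(2020, 2025):
    print(year, created.get(year, 0), retired.get(year, 0))
```

A sharp spike of "created" links in a single year is more likely to reflect when records were loaded into the source than when roads opened, which is consistent with the validity-date caution on this page.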
Warning
Do not use created_date or last_edited_date as road-network validity dates. Those fields usually describe database editing history rather than when the road was open to traffic.
11 Geographic Coverage
Geographic coverage should be reported using both geometry bounds and overlay against the project study area.
Useful coverage outputs:
total link length inside the study area,
percentage of links intersecting the study area,
link length by local authority, police force, region, or custom grid cell,
number of links with missing or invalid geometry,
comparison with OS Open Roads or MRDB length by geography,
map of links by linkcategory, carriageway, or operationalstate.
# Optional example: replace this with a real project boundary, local authority
# layer, police force boundary layer, or generated grid before using it.
boundary_path = _ROOT / "data/external/boundaries/study_area.gpkg"
boundary_label = "data/external/boundaries/study_area.gpkg"

if not boundary_path.exists():
    print(f"Boundary file not found, skipping area overlay: {boundary_label}")
else:
    areas = gpd.read_file(boundary_path).to_crs(links.crs)
    links_for_overlay = links[
        ["linkid", "linkcategory", "operationalstate", "geometry"]
    ].copy()
    overlay = gpd.overlay(links_for_overlay, areas, how="intersection")
    # EPSG:3857 (Web Mercator) inflates lengths at UK latitudes, so measure
    # in British National Grid (EPSG:27700), whose units are metres.
    overlay["length_km"] = overlay.to_crs(27700).length / 1000
    area_summary = (
        overlay.groupby("area_name", dropna=False)
        .agg(
            links=("linkid", "nunique"),
            length_km=("length_km", "sum"),
        )
        .reset_index()
        .sort_values("length_km", ascending=False)
    )
    display(area_summary)
Boundary file not found, skipping area overlay: data/external/boundaries/study_area.gpkg
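Length comparisons by geography depend on the length units. The spatial layers in this GDB are reported as EPSG:3857, and Web Mercator lengths are stretched by roughly 1 / cos(latitude). The sketch below shows the size of that effect with a spherical approximation. The function name and the sample latitude are assumptions, and reprojecting to a national grid is the more rigorous route.

```python
import math


def mercator_to_ground_length(mercator_length: float, latitude_deg: float) -> float:
    """Approximate ground length from a Web Mercator (EPSG:3857) length.

    Uses the spherical Mercator scale factor 1 / cos(latitude), so it is a
    rough correction for short links, not a substitute for reprojection.
    """
    return mercator_length * math.cos(math.radians(latitude_deg))


# Hypothetical link: 1,000 CRS units long at roughly 52.5 degrees north.
approx_m = mercator_to_ground_length(1000.0, 52.5)
```

At that latitude the factor is about 0.61, so kilometre totals taken straight from Web Mercator lengths should be checked before they are compared with OS Open Roads or MRDB lengths.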
For grid-based coverage:
1. create a regular grid over the study area,
2. intersect links with the grid,
3. sum link length and lane-km per cell,
4. flag cells with zero or very low coverage,
5. compare against OS Open Roads, AADF count points, and STATS19 collisions.
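Steps 1 to 4 can be sketched without a GIS library by assigning each link to the grid cell that contains its midpoint. That is a simplification of a true link-grid intersection, which would split long links across cells. The cell size, the threshold, the 2 x 2 extent, and the records are hypothetical, and coordinates are assumed to be in a projected CRS with metre units. Step 5 needs the external datasets and is not shown.

```python
import math
from collections import defaultdict

CELL_SIZE_M = 10_000  # hypothetical 10 km grid


def cell_of(x: float, y: float, size: float = CELL_SIZE_M) -> tuple:
    """Index of the square grid cell containing a point."""
    return (math.floor(x / size), math.floor(y / size))


# Hypothetical links: midpoint coordinates, length in metres, lane count.
links = [
    {"linkid": "L1", "mid_x": 2_500.0, "mid_y": 3_000.0, "length_m": 1_500.0, "lanes": 3},
    {"linkid": "L2", "mid_x": 7_000.0, "mid_y": 9_500.0, "length_m": 500.0, "lanes": 2},
    {"linkid": "L3", "mid_x": 12_000.0, "mid_y": 4_000.0, "length_m": 200.0, "lanes": 1},
]

totals = defaultdict(lambda: {"length_km": 0.0, "lane_km": 0.0})
for link in links:
    cell = cell_of(link["mid_x"], link["mid_y"])
    km = link["length_m"] / 1000.0
    totals[cell]["length_km"] += km
    totals[cell]["lane_km"] += km * link["lanes"]

# Enumerate every cell in the study-area extent so empty cells are visible too.
all_cells = [(i, j) for i in range(2) for j in range(2)]
LOW_COVERAGE_KM = 0.5  # hypothetical threshold
low_cells = [c for c in all_cells if totals[c]["length_km"] < LOW_COVERAGE_KM]
```

Enumerating the full grid, rather than only the cells that received a link, is what allows zero-coverage cells to be flagged.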
12 Link-Level Feature Build
The minimum useful output is a single feature table with one row per linkid.
Keep the exposure denominator explicit. A high total-risk segment may simply be long, multi-lane, or high-volume. A high risk-rate segment is a different question.
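The difference between total risk and risk rate can be shown with two invented segments. The collision counts and lane-km values below are hypothetical. Lane-km is used as a structural denominator here, and a traffic-based denominator from an observed exposure dataset would answer a different question.

```python
def rate_per_lane_km(collisions: int, lane_km: float) -> float:
    """Collisions per lane-km; returns NaN when the denominator is zero."""
    return collisions / lane_km if lane_km > 0 else float("nan")


# Hypothetical segments: counts and lane-km are invented for illustration.
segments = [
    {"linkid": "long_motorway", "collisions": 12, "lane_km": 24.0},
    {"linkid": "short_slip", "collisions": 3, "lane_km": 0.5},
]

for seg in segments:
    seg["rate"] = rate_per_lane_km(seg["collisions"], seg["lane_km"])

highest_total = max(segments, key=lambda s: s["collisions"])["linkid"]
highest_rate = max(segments, key=lambda s: s["rate"])["linkid"]
```

The long motorway segment has the most collisions, while the short slip road has the higher rate per lane-km, so the ranking depends on which question is being asked.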
15 Good For / Not Good For
Good uses:
SRN / National Highways motorway and trunk A-road structural features.
Authoritative lane count, lane width, carriageway, link form, and link geometry checks on the network subset where the data is populated.
Grade-separation, turn-restriction, access-restriction, and vehicle-restriction features that are not available in OS Open Roads or OSM in the same model-ready form.
SRN-specific model development or a facility-family feature branch.
Poor uses:
Full-network exposure on its own.
Local-authority A-roads, B-roads, C-roads, residential streets, or other minor roads.
Speed-limit, lighting, gradient, curvature, traffic-volume, collision, or temporal-flow modelling without external joins.
Pre-2022 historical network validity for a 2015-2024 model unless source validity dates are independently resolved.
Global model features imputed across non-SRN links.
16 Known Limitations
Use the availability checks below before depending on optional layers or fields.
Validity dates can support source-validity diagnostics, but not annual traffic exposure.
Several categorical fields use opaque codes and need source code-list resolution before modelling.
Non-spatial related tables may not carry a CRS.
Relationship tables are many-to-many in places, so joins need aggregation before merging into a one-row-per-link feature table.
availability_checks = inventory.assign(
    dependency_status=lambda df: df["rows"].gt(0).map(
        {True: "available in this GDB", False: "not populated in this GDB"}
    )
)[["layer", "rows", "geometry", "crs", "dependency_status"]]

display(availability_checks.sort_values(["rows", "layer"], ascending=[True, True]))