TRAIL Compute Status Dashboard

Auto-refresh every 2h - Last update just now

⚠ Outage detected — some cluster data is stale. Showing the last successful query for each affected cluster.

Member Usage

Member Usage Allocated Cluster - Narval, Trillium - GPU util only on Trillium - Inactive = cpus/gpu<1 & mem<10GB

Member | 14D GPU HRS | 14D GPU Util | 14D Inactive GPU Hrs | 14D CLUSTERS
lla | 1072 | 84% (38 GPU hrs) | 0 | Narval
szylzz | 283 | 78% (168 GPU hrs) | 0 | Narval
cwlee | 86 | 92% (86 GPU hrs) | 0 | Trillium
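The Inactive rule in the caption above (cpus/gpu<1 & mem<10GB) can be read as a simple predicate over a job's resources. The sketch below is one possible reading, not the dashboard's actual code: the function names, the job dictionary keys, and the assumption that "mem" means the job's total memory in GB are all hypothetical.

```python
def is_inactive(cpus: int, gpus: int, mem_gb: float) -> bool:
    """Return True when a GPU job matches the caption's Inactive rule.

    Assumption: mem_gb is the job's total memory in GB; the dashboard
    does not say whether the threshold is per job or per GPU.
    """
    if gpus <= 0:
        return False  # the rule only makes sense for jobs that hold GPUs
    return (cpus / gpus) < 1 and mem_gb < 10


def inactive_gpu_hours(jobs) -> float:
    """Sum GPU hours (gpus * hours) over the jobs flagged inactive.

    Each job is a dict with hypothetical keys: cpus, gpus, mem_gb, hours.
    """
    return sum(
        job["gpus"] * job["hours"]
        for job in jobs
        if is_inactive(job["cpus"], job["gpus"], job["mem_gb"])
    )
```

Under this reading, a job holding 2 GPUs with 1 CPU and 4 GB of memory for 3 hours would contribute 6 inactive GPU hours, while a job with 8 CPUs per GPU would contribute none.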

Member Usage Other Cluster - Killarney, TamIA, Fir, Rorqual - GPU util only on Fir - Inactive = cpus/gpu<1 & mem<10GB

Member | 14D GPU HRS | 14D GPU Util | 14D Inactive GPU Hrs | 14D CLUSTERS
No member usage in this window.

Member Usage UTIAS Server - DGX, Apollo, Turing, Lovelace, UMS, UM1, UM2, UM3

⚠ Temporarily showing only 4 days of data due to recent tracking changes.

Member | 14D GPU HRS | 14D Servers | 14D DAYS USED
galcohen | 328 | Lovelace | 5
atao | 148 | Turing, UM3 | 2
lock_gpu | 100 | DGX | 5
trail | 26 | UMS | 3
trailbot | 26 | UMS | 3
cfeng | 10 | UM3 | 3
johnl | 4 | Turing | 1
mlavoie | 4 | Turing | 1

Cluster Usage

Allocated Cluster Usage

Cluster | Node Type | GPU NODES | LEVELFS | 14D GPU HRS USED | 14D GPU HRS TARGET | 14D QUEUE TIME | 14D USERS
Narval | 4xA100-80GB | 174 | 0.756 | 1355 | 1331 | 5.9h | 2 (lla, szylzz)
Trillium (stale · 4d 23h ago) | 4xH100-80GB | 62 | 3.448 | 86 | 665 | 0.0h | 1 (cwlee)

UTIAS Server Usage

⚠ Temporarily showing only 4 days of data due to recent tracking changes.

Cluster | Node Type | GPU UTIL | Users | 14D GPU UTIL | 14D USERS
DGX | 4xA100-SXM4-40GB | 0/4 | — | 34% | 1 (lock_gpu)
Apollo (stale · 1d 9h ago) | 8xV100-32GB | 0/8 | 0 | 0% | 0
Turing | 4xRTX6000-Ada-48GB | 4/4 | 0 | 85% | 3 (atao, johnl, mlavoie)
Lovelace | 4xRTX6000-Ada-48GB | 4/4 | 1 (galcohen) | 83% | 3 (atao, galcohen, mlavoie)
UMS | 1xRTX4090-24GB | 0/1 | 1 (trailbot) | 54% | 2 (trail, trailbot)
UM1 (stale · 56d 20h ago) | 1xRTX4090-24GB | 0/1 | 1 (trail) | — | —
UM2 (stale · 56d 20h ago) | 1xRTX4090-24GB | 0/1 | 0 | — | —
UM3 | 1xRTX4090-24GB | 0/1 | 3 (atao +2) | 13% | 3 (atao, cfeng, dawnd)

Other Cluster Usage

Cluster | Node Type | GPU NODES | LEVELFS | 14D GPU HRS USED | 14D GPU HRS TARGET | 14D QUEUE TIME | Location
Killarney-L40S (stale · 18d 7h ago) | 4xL40S-48GB | 168 | 1.259 | 0 | — | — | Vector, Toronto
Killarney-H100 (stale · 18d 7h ago) | 8xH100-48GB | 10 | 1.259 | 0 | — | — | Vector, Toronto
TamIA | 4xH100-HGX-80GB | 53 | 16.011 | 0 | — | — | Mila, Montreal
Fir (stale · 2d 15h ago) | 4xH100-SXM5-80GB | 159 (59 MIG) | 0.005 | 0 | — | — | SFU, Vancouver
Rorqual | 4xH100-SXM5-80GB | 93 (27 MIG) | 0.016 | 0 | — | — | ETS, Montreal

UTIAS SERVERS historical data

GPU utilisation over time · last 14 days, % of GPUs in use, sampled each refresh

TRAIL GPU utilisation over time
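The "% of GPUs in use" figure for a single refresh can be derived from the used/total pairs shown in the GPU UTIL column. The sketch below is an illustration only: the function name is hypothetical, and whether the dashboard pools all servers into one percentage or includes stale servers is an assumption.

```python
def percent_in_use(snapshots) -> float:
    """Percentage of GPUs in use for one refresh.

    snapshots: iterable of "used/total" strings such as "4/4" or "0/1",
    one per server (assumed format, taken from the GPU UTIL column).
    """
    used = 0
    total = 0
    for entry in snapshots:
        used_str, total_str = entry.split("/")
        used += int(used_str)
        total += int(total_str)
    # Guard against an empty snapshot list to avoid dividing by zero.
    return 100.0 * used / total if total else 0.0
```

Calling `percent_in_use(["0/4", "4/4", "4/4", "0/1", "0/1"])` pools the five non-stale servers into a single value; plotting one such value per refresh would give a utilisation-over-time series.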
External clusters historical data

Queue time, last observed vs 14d median · external clusters · ↖ most recent job (or longest current pending) · ↘ median over last 14 days

Combined queue time heatmap for external clusters
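The comparison of last observed queue time against the 14-day median can be sketched as follows. This is a minimal illustration under stated assumptions: the function name and input shape are hypothetical, and it covers only the "most recent job" case, not the "longest current pending" alternative mentioned in the caption.

```python
from statistics import median


def queue_summary(wait_hours):
    """Summarise queue waits for one cluster.

    wait_hours: queue waits in hours for jobs in the 14-day window,
    ordered oldest first (assumed input shape).
    Returns None when no jobs were observed in the window.
    """
    if not wait_hours:
        return None
    return {
        "last_observed": wait_hours[-1],    # most recent job's wait
        "median_14d": median(wait_hours),   # median over the window
    }
```

A last observed value well above the median would suggest the queue is currently slower than usual for that cluster.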

Historical Trends - 14d, queue times only shown for top 5 ...

Historical trends and daily GPU hours

External clusters live data
Pending jobs heatmap, all cluster users

Last updated 2026-06-27 04:00:10