llm-bench dashboard

The scoring structure is fixed: capability = coding × comprehension, and fleet = capability × speed_term. The dials only re-rank the results; nothing is written back.
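As a rough illustration of the two formulas above, here is a minimal Python sketch. How the dials enter the formulas is not stated on this page; the sketch assumes each weight acts as an exponent on its factor, which is only one plausible reading. The names `ModelScores`, `capability`, `fleet`, and `rank`, and all scores, are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen: re-ranking can never write scores back
class ModelScores:
    name: str
    coding: float
    comprehension: float
    speed_term: float


def capability(m, w_coding=1.0, w_comprehension=1.0):
    # capability = coding × comprehension; weights as exponents is an ASSUMPTION
    return (m.coding ** w_coding) * (m.comprehension ** w_comprehension)


def fleet(m, w_speed=1.0, **capability_weights):
    # fleet = capability × speed_term; the speed exponent is also an ASSUMPTION
    return capability(m, **capability_weights) * (m.speed_term ** w_speed)


def rank(models, score):
    # Sort a copy, best first; the input list and its records are untouched
    return [m.name for m in sorted(models, key=score, reverse=True)]


# Illustrative numbers only, not measured results
models = [
    ModelScores("model-a", coding=0.8, comprehension=0.5, speed_term=0.9),
    ModelScores("model-b", coding=0.6, comprehension=0.7, speed_term=0.5),
]

by_capability = rank(models, capability)
by_fleet = rank(models, fleet)
by_capability_coding_heavy = rank(models, lambda m: capability(m, w_coding=2.0))
```

With these made-up scores, model-b leads on capability while model-a leads on fleet, and doubling the coding weight moves model-a to the top of the capability ranking. The stored scores stay the same throughout; only the ordering changes.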

Capability ranking

Adjust capability weights
comprehension · contribution
coding · contribution

Fleet suitability

Adjust fleet & speed weights
speed
fleet

Context size

Backend A/B — vulkan vs rocm

Δ is rocm relative to vulkan: green ⇒ vulkan faster, yellow ⇒ rocm faster. Vulkan (int-dot off) is the suite default. Runs use llama-bench -fa 1 -ngl 99 -ctk q8_0 -ctv q8_0.
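The Δ and colour convention can be sketched as follows. This assumes Δ is the percentage difference in throughput (tokens per second) with vulkan as the baseline; the page does not give the exact formula, and the "neutral" case for a tie is also an assumption. The function names are hypothetical.

```python
def backend_delta(vulkan_tps, rocm_tps):
    """Percent difference of rocm relative to vulkan (ASSUMED definition of Δ)."""
    if vulkan_tps <= 0:
        raise ValueError("vulkan throughput must be positive")
    return (rocm_tps - vulkan_tps) / vulkan_tps * 100.0


def delta_color(delta_pct):
    """Map Δ to the dashboard colours: green = vulkan faster, yellow = rocm faster."""
    if delta_pct < 0:
        return "green"
    if delta_pct > 0:
        return "yellow"
    return "neutral"  # exact tie; not specified on the page


# Illustrative throughputs only, not measured results
rocm_ahead = backend_delta(vulkan_tps=100.0, rocm_tps=110.0)
vulkan_ahead = backend_delta(vulkan_tps=100.0, rocm_tps=80.0)
```

Under this definition a positive Δ means rocm is faster and a negative Δ means vulkan is faster.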

Per-model breakdown

Data sources / required runs