ALBEDO-5DPqGH (challenger): 53.10
ALBEDO-XXVIII (king): 46.90
Δ margin: +6.2
margin: +6.30 pp
judges: 3 panel
turns: 127/128
vllm errors: 0c / 0k
finished: Jun 16, 22:46
Judge panel
GLM: win (chal 53.10, king 46.90)
QWEN: win (chal 54.00, king 46.00)
DEEPSEEK: tie (chal 50.00, king 50.00)
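As a rough illustration of how the per-judge figures relate, the sketch below recomputes each judge's challenger-minus-king margin from the scores listed in the panel. The function names, the `loss` label, and the rule that a verdict follows only from the sign of the margin are assumptions made for this example; the page does not state how verdicts or the headline margin are actually derived.

```python
# Scores copied from the judge panel: (challenger, king).
PANEL = {
    "GLM": (53.10, 46.90),
    "QWEN": (54.00, 46.00),
    "DEEPSEEK": (50.00, 50.00),
}


def margin(chal: float, king: float) -> float:
    """Challenger-minus-king margin in percentage points, rounded to 2 places."""
    return round(chal - king, 2)


def verdict(chal: float, king: float) -> str:
    """Label a judge's result from the challenger's point of view.

    Assumption: the label depends only on the sign of the margin.
    "loss" is a hypothetical label; only "win" and "tie" appear on the page.
    """
    m = margin(chal, king)
    if m > 0:
        return "win"
    if m < 0:
        return "loss"
    return "tie"


if __name__ == "__main__":
    for judge, (chal, king) in PANEL.items():
        print(f"{judge}: {margin(chal, king):+.2f} pp, {verdict(chal, king)}")
```

Under these assumptions the recomputed labels agree with the panel (two wins and one tie for the challenger).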
Metrics
progress: 52.10
protocol: 53.50
grounding: 51.70
efficiency: 51.80
correctness: 52.80
King it faced
era: ALBEDO-XXVIII
model: michael0616/albedo-qwen3-4b-mike3
uid: 121
Artifacts
- generated-samples.jsonl (jsonl)
- judge-results.jsonl (jsonl)
- progress.jsonl (jsonl)
- remote-logs.txt (txt)
- scoring-results.jsonl (jsonl)
- duel-transcript.jsonl (jsonl)
- verdict.json (json)