Hardware tiers
- Floor · 96 GB VRAM · 3 concurrent · e.g. 1× RTX 6000 Pro Blackwell. Comfortably hosts both GPT-OSS 120B and GPT-OSS 20B at the same time, routing tasks between them. The card's high memory bandwidth also helps keep interactive workloads responsive. Fits a single analyst or a small shared team.
- Team · 192–320 GB · 4–10 concurrent · e.g. 2× RTX 6000 Pro Blackwell, or 2–4× H100. The added memory and bandwidth give you more headroom for longer context windows under load, and let background verification run alongside interactive chat.
- Fleet · 640 GB+ · 20+ concurrent · single dense node, e.g. 8× H100 or H200. Sized for headless agent pipelines running alongside interactive analysts: long contexts, long histories, and high-throughput batch triage.
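
As a rough planning aid, the tiers above can be turned into a small lookup from expected concurrency to a tier. This is a minimal sketch, not a sizing tool: the `Tier` class and `pick_tier` function are hypothetical names, the VRAM and concurrency figures are copied from the list, and the handling of 11–19 concurrent users (which the list does not cover) is an assumption that rounds up to Fleet.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Tier:
    name: str
    min_vram_gb: int               # lower bound of the tier's stated VRAM
    max_concurrent: Optional[int]  # None means no stated upper bound


# Figures taken from the tier list; ordered smallest to largest.
TIERS = [
    Tier("Floor", 96, 3),
    Tier("Team", 192, 10),
    Tier("Fleet", 640, None),
]


def pick_tier(concurrent_users: int) -> Tier:
    """Return the smallest tier whose stated concurrency covers the load.

    Assumption: 11-19 users fall between Team (4-10) and Fleet (20+),
    so this sketch rounds them up to Fleet.
    """
    if concurrent_users < 1:
        raise ValueError("concurrent_users must be at least 1")
    for tier in TIERS:
        if tier.max_concurrent is None or concurrent_users <= tier.max_concurrent:
            return tier
    return TIERS[-1]  # unreachable while the last tier is unbounded


if __name__ == "__main__":
    for users in (1, 6, 25):
        tier = pick_tier(users)
        print(f"{users} concurrent -> {tier.name} ({tier.min_vram_gb} GB+ VRAM)")
```

Concurrency is only one input. Context length and the share of background work also affect which tier fits, so treat the result as a starting point.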