max_tokens
Keep outputs concise to preserve fast interaction loops.
A speed-focused reasoning model profile for interactive assistants and low-latency enterprise workflows.
Use Mercury 2 in your companyData checked: 2026-03-15
Mercury 2 is positioned as a very fast reasoning model, suitable for use cases where response time is a first-class requirement.
Keep outputs concise to preserve fast interaction loops.
Use lower values for reliable operational responses.
Tighter sampling often improves consistency at high volume.
Prefer predictable structure for real-time automation.
Use policy controls, role-based access, and budget guardrails before enabling advanced model tiers at scale.
Use Mercury 2 in your company