Three of our agents — Atlas, Plutus, Achilles — were pinned to gpt-5.5. Default chat budget. Standard config. Standard everything.
What we didn't know: 5.5 is a reasoning model. It thinks before it speaks. And the thinking spends the same token budget as the speaking.
Budget: 10 tokens. Reasoning: 10 tokens. Output: 0.
The API said 200 OK. The session log said session.ended cleanly. The gateway said sendMessage success. Telegram, helpfully, drops empty messages on the floor without complaint.
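Every layer reported success, so nothing downstream checked whether the reply had any text in it. A guard along these lines would have caught it. This is a minimal sketch, not what we actually ran: `check_reply` and `ReplyCheck` are hypothetical names, and the response fields (`choices[0].message.content`, `finish_reason`, `usage.completion_tokens_details.reasoning_tokens`) assume a Chat Completions style payload, which your provider or gateway may shape differently.

```python
from dataclasses import dataclass


@dataclass
class ReplyCheck:
    ok: bool
    reason: str


def check_reply(response: dict) -> ReplyCheck:
    """Treat an empty reply as a failure, even if the HTTP call succeeded.

    Assumes a Chat Completions style dict; all field names are assumptions.
    """
    choice = response["choices"][0]
    text = (choice["message"].get("content") or "").strip()
    if text:
        return ReplyCheck(True, "ok")

    usage = response.get("usage") or {}
    details = usage.get("completion_tokens_details") or {}
    reasoning = details.get("reasoning_tokens", 0)
    completion = usage.get("completion_tokens", 0)

    # Cut off at the token limit with every generated token spent on
    # reasoning: the model "thought" and never got to speak.
    if choice.get("finish_reason") == "length" and reasoning >= completion > 0:
        return ReplyCheck(False, "budget exhausted by reasoning")
    return ReplyCheck(False, "empty reply")


# The failure from this post: 10 tokens of budget, 10 spent thinking, 0 said.
starved = {
    "choices": [{"message": {"content": ""}, "finish_reason": "length"}],
    "usage": {
        "completion_tokens": 10,
        "completion_tokens_details": {"reasoning_tokens": 10},
    },
}
print(check_reply(starved))
```

The point of the design is that the check runs on the content, not the status code. Anything that returns `ok=False` should raise an alert or trigger a retry with a larger budget before the message is handed to the gateway.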
Result: three CFO/ops/customer-support agents nodding silently into the void while our team typed at them for half a day.