MiMo-V2-Flash is an open-source foundation language model developed by Xiaomi. It is a Mixture-of-Experts model with 309B total parameters and 15B active parameters, adopting hybrid attention architecture. MiMo-V2-Flash supports a hybrid-thinking toggle and a 256K context window, and excels at reasoning, coding, and agent scenarios. On SWE-bench Verified and SWE-bench Multilingual, MiMo-V2-Flash ranks as the top #1 open-source model globally, delivering performance comparable to Claude Sonnet 4.5 while costing only about 3.5% as much. Note: when integrating with agentic tools such as Claude Code, Cline, or Roo Code, **turn off reasoning mode** for the best and fastest performance—this model is deeply optimized for this scenario. Users can control the reasoning behaviour with the `reasoning` `enabled` boolean. [Learn more in our docs](https://openrouter.ai/docs/use-cases/reasoning-tokens#enable-reasoning-with-default-config).
curl -X POST https://api.neuralhub.xyz/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_API_KEY" \
-d '{
"model": "xiaomi/mimo-v2-flash:free",
"messages": [
{ "role": "system", "content": "You are a helpful assistant." },
{ "role": "user", "content": "" }
],
"temperature": 0.7,
"max_tokens": 500,
"top_p": 0.9
}'{
"id": "chatcmpl-<uuid>",
"object": "chat.completion",
"created": 1768921569,
"model": "xiaomi/mimo-v2-flash:free",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "The answer to life, the universe, and everything is famously 42..."
},
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 26,
"completion_tokens": 169,
"total_tokens": 195
}
}