Sarvam 105B is optimized for agentic workloads involving tool use, long-horizon reasoning, and environment interaction. This is reflected in strong results on benchmarks designed to approximate real-world workflows. On BrowseComp, the model scores 49.5, outperforming several competitors on web-search-driven tasks. On Tau2 (avg.), a benchmark measuring long-horizon agentic reasoning and task completion, it scores 68.3, the highest among the compared models. These results indicate that the model can plan effectively, retrieve information, and maintain coherent reasoning across extended multi-step interactions.