Publication: Scaling Agentic Capabilities, Not Context: Efficient Reinforcement Finetuning for Large Toolspaces | Karan Gupta, Pranav Vajreshwari, Yash Pandya, Raghav Magazine, Akshay Nambi, Ahmed Awadallah | ICLR Agents in the Wild | March 2026
Publication: Learning When to Act or Refuse: Guarding Agentic Reasoning Models for Safe Multi-Step Tool Use | Aradhye Agarwal, Gurdit Siyan, Yash Pandya, Joykirat Singh, Akshay Nambi, Ahmed Awadallah | ICLR Agents in the Wild: Safety, Security, and Beyond | March 2026
Microsoft Research Blog: Phi-4-reasoning-vision and the lessons of training a multimodal reasoning model | March 4, 2026 | Jyoti Aneja, Michael Harrison, Neel Joshi, Tyler LaBonte, John Langford, Eduardo Salinas
Publication: Phi-4-reasoning-vision-15B Technical Report | Jyoti Aneja, Michael Harrison, Neel Joshi, Tyler LaBonte, John Langford, Eduardo Salinas | MSR-TR-2026-10 | March 2026 | Posted by Microsoft Research
Publication: Wavelet Predictive Representations for Non-Stationary Reinforcement Learning | Min Wang, Xin Li, Ye He, Yao-Hui Li, Hasnaa Bennis, Riashat Islam, Mingzhong Wang | ICLR 2026 | October 2025
Career opportunity: Research Intern – Foundations of GenAI | Posted: January 12, 2026 | Location: New York, NY, US | Research area: Artificial intelligence