SocialReasoning-Bench: Measuring whether AI agents act in users’ best interests
Using SocialReasoning-Bench, we observed a stable pattern across models: agents execute competently but fail to consistently improve the user’s position, even when explicitly instructed to optimize for the user’s interest.
Applied Data Scientist II
Our team builds the intelligence layer that powers Microsoft’s next-generation threat detection ecosystem, spanning Vortex, Threat Graph, Verdict Net, and campaign-correlation workflows. We combine deep applied science, graph-theoretic reasoning, large-scale machine learning, and multi-modal security analytics to…
Language-Agnostic Detection of Bugs in Zero-Knowledge Proof Programs
Host: Greg Zaverucha, Microsoft Research. Speaker(s): Arman Kolozyan, Max Planck Institute for Security and Privacy. Zero-knowledge proofs (ZKPs) allow a prover to convince a verifier of a statement’s truth without revealing any other information. In recent…
Principal Data and Applied Scientist – Engineering Operations
Are you a customer-obsessed, AI-curious problem-solver who thrives in an inclusive, collaborative global team? Join Engineering Operations (EngOps) – the organization driving operational excellence across the Microsoft Cloud to strengthen quality, reliability, security, and customer…
Red-teaming a network of agents: Understanding what breaks when AI agents interact at scale
Safe agents don’t guarantee a safe ecosystem of interconnected agents. Microsoft Research examines what breaks when AI agents interact and why network-level risks require new approaches.