Blake is a researcher on the AI Red Team at Microsoft, where he studies safety and security vulnerabilities in generative AI systems. His work focuses on understanding and stress-testing model behavior under adversarial conditions, with the goal of mitigating near-term risks while informing the design of more robust AI systems in the long term.
We’re releasing new research on detecting backdoors in open-weight language models and highlighting a practical scanner that can identify backdoored models at scale, helping to improve overall trust in AI systems.