Fazle Faisal

Senior Research Software Engineer

About

I am a Research Software Engineer in the Systems and Networking research group at Microsoft Research. My research centers on Action Engines—an agentic AI stack for executing real-world tasks reliably and efficiently. The Action Engine frames agentic AI as a full system, integrating execution, learning, and training rather than treating agents as isolated LLM wrappers.

My work focuses on designing, training, and validating agentic systems that move beyond information access toward robust action execution, especially in complex environments such as the web.

Action Engine: My Core Research Agenda

I design and develop the Action Engine as an umbrella framework that unifies multiple agentic projects into a single stack. In my work, Action Engine comprises three tightly integrated layers:

  • Agentic Execution Layer: The execution layer translates user intent into concrete, multi-step actions over real systems, especially the web. My focus is on structured action execution: enforcing correct action sequencing, grounding actions in system state, and validating outcomes during execution. This work expands the notion of web agents into general-purpose action engines capable of reliably carrying out tasks instead of merely generating responses.
  • Learning Layer: This layer learns an existing system before an agent acts on it. It extracts structured knowledge about large action spaces through systematic exploration, deduplication, and cross-state knowledge propagation. I develop agentic crawlers that uncover hidden functionality and build reusable representations that significantly improve agent efficiency and correctness.
  • Training Layer: I focus on training domain-specific LLMs optimized for action execution rather than general text generation. Instead of relying on model scale, I generate high-quality, trajectory-grounded supervision that captures realistic workflows and long-horizon dependencies. This enables smaller models to perform competitively when embedded within the Action Engine.
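As an illustrative sketch only (the class and method names here are hypothetical, not the actual Action Engine API), the three layers can be viewed as composable interfaces: the learning layer produces system knowledge, the training layer consumes trajectories, and the execution layer grounds each action against that knowledge.

```python
from dataclasses import dataclass, field
from typing import Protocol

@dataclass
class Action:
    """A single grounded step, e.g. clicking an element or filling a form."""
    name: str
    target: str

@dataclass
class Task:
    """A user goal plus the action trajectory that accomplishes it."""
    goal: str
    steps: list[Action] = field(default_factory=list)

class LearningLayer(Protocol):
    def explore(self, system: str) -> dict:
        """Extract structured knowledge (e.g. an action graph) from a system."""
        ...

class TrainingLayer(Protocol):
    def fit(self, trajectories: list[Task]) -> None:
        """Train a domain-specific action model on trajectory-grounded data."""
        ...

class ExecutionLayer(Protocol):
    def run(self, task: Task, knowledge: dict) -> bool:
        """Execute a task step by step, validating outcomes against state."""
        ...
```

The point of the sketch is the separation of concerns: exploration, training, and execution exchange structured artifacts (knowledge, trajectories) rather than being folded into one LLM prompt loop.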

Projects within the Action Engine

  • Action Graph Crawler: I developed the Action Graph Crawler, a core project in the learning layer of the Action Engine. The Action Graph Crawler systematically explores websites and produces learned web interaction knowledge in the form of a state machine (action graph). Nodes represent interaction states, while edges encode available actions and transitions. This learned action graph captures valid workflows, branching structure, and reachable capabilities, and serves as a reusable knowledge substrate for downstream agents.
  • AutoSurfer: I developed AutoSurfer as a core component of the Action Engine learning and training stack. AutoSurfer performs systematic, breadth-first, human-like exploration of websites, explicitly handling hierarchical menus, dropdowns, and fixed versus dynamic GUI elements. It tightly couples exploration with task synthesis and trajectory refinement, producing low-hallucination task–trajectory pairs suitable for training website-specific action models.
  • Web Agents: I developed Web Agents as execution-layer systems built on top of Action Engine knowledge. They rely on Action Graph Crawler–generated state machines to guide execution efficiently and reliably. Learned action graphs constrain and steer agent behavior, reducing search, improving correctness, and enabling execution even with generic or externally provided planners.
  • WABER/WAREX: As supporting infrastructure, I took part in developing WABER and WAREX, which evaluate the efficiency and reliability of web agents using proxy-based fault injection on existing benchmarks. These projects serve as diagnostic tools that expose agent failures under unexpected environmental conditions.
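To make the action-graph idea concrete, here is a minimal sketch of a state machine learned by breadth-first exploration. The data structure and crawl loop are illustrative assumptions for exposition, not the Action Graph Crawler's actual implementation; `enumerate_actions` and `apply_action` stand in for real browser instrumentation.

```python
from collections import deque

class ActionGraph:
    """State machine: nodes are interaction states, edges are actions."""

    def __init__(self):
        self.edges = {}  # state -> {action: next_state}

    def add_transition(self, state, action, next_state):
        self.edges.setdefault(state, {})[action] = next_state

    def actions(self, state):
        """Available actions from a given state."""
        return list(self.edges.get(state, {}))

def crawl(start, enumerate_actions, apply_action):
    """Breadth-first exploration of an action space.

    enumerate_actions(state) lists the actions available in a state;
    apply_action(state, action) returns the resulting state.
    """
    graph = ActionGraph()
    seen = {start}
    frontier = deque([start])
    while frontier:
        state = frontier.popleft()
        for action in enumerate_actions(state):
            nxt = apply_action(state, action)
            graph.add_transition(state, action, nxt)
            if nxt not in seen:  # deduplicate revisited states
                seen.add(nxt)
                frontier.append(nxt)
    return graph
```

A downstream agent can then restrict its planner to `graph.actions(state)` at each step, which is one way a learned graph can constrain search and steer execution toward valid workflows.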

Perspective and Background

My research advances a system-level view of agentic AI: learning how systems work, training models to act within them, and executing actions through explicit control and validation. This approach bridges LLM research with traditional systems concerns such as efficiency and reliability.

I received my Ph.D. from the University of Notre Dame in 2016, where my dissertation focused on algorithmic and machine learning methods for inferring knowledge from large real-world networks. I previously received a master’s degree from the University of Memphis. My current work extends this foundation toward building practical, trustworthy agentic AI systems.