Research Focus: Week of September 25, 2023
Chunked prefills & decode-maximal batching boost LLM inference; DragNUWA combines text, image, and trajectory for fine-grained video content control; reconstructing images from human brain signals; structural inequalities in creator-audience relationships.
Agent AI
Agent-based multimodal AI systems are becoming a ubiquitous presence in our everyday lives. A promising direction for making these systems more interactive is to embody them as agents within specific environments. The grounding of large…
HAMS-ALT
In the Harnessing AutoMobiles for Safety, or HAMS, project, we use low-cost sensing devices to construct a virtual harness for vehicles. The goal is to monitor the state of the driver and how the vehicle…
Frontiers of multimodal learning: A responsible AI approach
New evaluation methods and a commitment to continual improvement are musts if we’re to build multimodal AI systems that advance human goals. Learn about cutting-edge research into the responsible development and use of multimodal AI…