Research Focus: Week of October 23, 2023
In this issue: Kosmos-2.5: A Multimodal Literate Model; Can vine copulas explain complex relationships of weather variables; New system accelerates the adaptive training process; Structural inequalities and relational labor in the influencer industry.
CCEdit
Creative and Controllable AI Video Editing. CCEdit is a comprehensive generative video editing framework meticulously designed to strike a harmonious balance between controllability and creativity…
LLaVA: Large Language and Vision Assistant
LLaVA is an open-source project that collaborates with the research community to advance the state of the art in AI. LLaVA represents the first end-to-end trained large multimodal model (LMM) that achieves impressive chat capabilities, mimicking the spirit of the multimodal…
HoloAssist: A multimodal dataset for next-gen AI copilots for the physical world
HoloAssist is a new multimodal dataset consisting of 166 hours of interactive task executions with 222 participants. Discover how it offers invaluable data to advance the capabilities of next-gen AI copilots for real-world tasks.