Insights into the Challenges and Opportunities of Large Multi-Modal Models for Blind and Low Vision Users: CLIP
PARIKSHA: A Scalable, Democratic, Transparent Evaluation Platform for Assessing Indic Large Language Models
Papers and publications
Evaluation Validity in Information Retrieval. Paul Thomas, Nick Craswell, Mark Sanderson, Seth Spielman, Robert Sim, Ryen W. White. 2026 International ACM SIGIR Conference on Research and Development in Information Retrieval | July 2026
Pushing the Limits of On-Device Streaming ASR: A Compact, High-Accuracy English Model for Low-Latency Inference. Nenad Banfic, D. Fan, Kunal Vaishnavi, S. Kemp, Sunghoon Choi, Ruifeng Ren, Sun Shaw, Meng Tang. April 2026
Shuffle the Context: RoPE-Perturbed Self-Distillation for Long-Context Adaptation. Zichong Li, Chen Liang, Liliang Ren, Tuo Zhao, Yelong Shen, Weizhu Chen. April 2026
Do Transformers Use their Depth Adaptively? Evidence from a Relational Reasoning Task. A. Curth, Rachel Lawrence, Sushrut Karmalkar, Niranjani Prasad. April 2026
Discourse Diversity in Multi-Turn Empathic Dialogue. Hongli Zhan, Emma S. Gueorguieva, Javier Hernandez, Jina Suh, Desmond C. Ong, Junyi Jessy Li. April 2026
Evaluating Cooperation in LLM Social Groups through Elected Leadership. Ryan Faulkner, Anushka Deshpande, David Guzman Piedrahita, Joel Z. Leibo, Zhijing Jin. April 2026
Litmus (Re)Agent: A Benchmark and Agentic System for Predictive Evaluation of Multilingual Models. Avni Mittal, Shanu Kumar, Sandipan Dandapat, Monojit Choudhury. April 2026
Do LLMs Follow Their Own Rules? A Reflexive Audit of Self-Stated Safety Policies. Avni Mittal. April 2026
Confident in a Confidence Score: Investigating the Sensitivity of Confidence Scores to Supervised Fine-Tuning. Lorenzo Jaime Flores, Cesare Spinoso di-Piano, Jackie Cheung. April 2026
LLM Reasoning as Trajectories: Step-Specific Representation Geometry and Correctness Signals. Lihao Sun, Hang Dong, Bo Qiao, Qingwei Lin, Dongmei Zhang, Saravan Rajmohan. April 2026