Insights into the Challenges and Opportunities of Large Multi-Modal Models for Blind and Low Vision Users: CLIP
PARIKSHA: A Scalable, Democratic, Transparent Evaluation Platform for Assessing Indic Large Language Models
Publication: Why does Self-Supervised Learning for Speech Recognition Benefit Speaker Recognition. Sanyuan Chen, Yu Wu, Chengyi Wang, Shujie Liu, Zhuo Chen, Gang Liu, Jinyu Li, Jian Wu, Xiangzhan Yu, Furu Wei. Interspeech | September 2022
Publication: Pre-Training Transformer Decoder for End-to-End ASR Model with Unpaired Speech Data. Junyi Ao, Ziqiang Zhang, Long Zhou, Shujie Liu, Haizhou Li, Tom Ko, Lirong Dai, Jinyu Li, Yao Qian, Furu Wei. Interspeech | September 2022
Publication: Internal Language Model Adaptation with Text-Only Data for End-to-End Speech Recognition. Zhong Meng, Yashesh Gaur, Naoyuki Kanda, Jinyu Li, Xie Chen, Yu Wu, Yifan Gong. Interspeech 2022 | September 2022
Publication: VarArray Meets t-SOT: Advancing the State of the Art of Streaming Distant Conversational Speech Recognition. Naoyuki Kanda, Jian Wu, Xiaofei Wang, Zhuo Chen, Jinyu Li, Takuya Yoshioka. arXiv:2209.04974 | September 2022
Publication: Towards Contextual Spelling Correction for Customization of End-to-End Speech Recognition Systems. Xiaoqiang Wang, Yanqing Liu, Jinyu Li, Veljko Miljanic, Sheng Zhao, Hosam Khalil. IEEE/ACM Transactions on Audio, Speech, and Language Processing | September 2022, Vol 30: pp. 3089-3097
Publication: Leveraging Real Conversational Data for Multi-Channel Continuous Speech Separation. Xiaofei Wang, Dongmei Wang, Naoyuki Kanda, Sefik Emre Eskimez, Takuya Yoshioka. INTERSPEECH 2022 | September 2022
Publication: What is it like to program with artificial intelligence? Advait Sarkar, Andy Gordon, Carina Negreanu, Christian Poelitz, Sruti Srinivasa Ragavan, Ben Zorn. Proceedings of the 33rd Annual Conference of the Psychology of Programming Interest Group (PPIG 2022) | September 2022
Publication: Adapting Task-Oriented Dialogue Models for Email Conversations. Soham Deshmukh, Charles Lee. arXiv preprint arXiv:2208.09439 | August 2022
Publication: Fast Vocabulary Projection Method via Clustering for Multilingual Machine Translation on GPU. Hossam Amer, Young Jin Kim, Mohamed Afify, Hitokazu Matsushita, Hany Hassan Awadalla. AMTA | August 2022
Publication: Language Tokens: A Frustratingly Simple Approach Improves Zero-Shot Performance of Multilingual Translation. Muhammad ElNokrashy, Amr Hendy, Mohamed Maher, Mohamed Afify, Hany Hassan Awadalla. AMTA | August 2022