News and in-depth articles
Investigating vulnerabilities in LLMs; A novel total-duration-aware (TDA) duration model for text-to-speech (TTS); Generative expert metric system through iterative prompt priming; Integrity protection in 5G fronthaul networks:
| Zinan Lin, Jinyu Li, Bhaskar Mitra, Siân Lindley, Liang Wang, Nan Yang, and Furu Wei
Mixture-of-linear-experts for long-term time series forecasting; Weakly-supervised streaming multilingual speech model with truly zero-shot capability; KBFormer: Diffusion model for structured entity completion; Identifying risks of AI-mediated data access:
Speech is a signal that can enable natural interaction between humans and machines. To facilitate this exchange, machines have to be able to recognize what a human has said, both the words and the context in which those…
Speech recognition is something we humans do remarkably well, including understanding speech in noisy multi-talker environments. While we take this natural sophistication for granted, speech recognition researchers continue to pursue refinements…