Publication Optimizing Large Language Model Training Using FP4 Quantization Ruizhe Wang, Yeyun Gong, Xiao Liu, Guoshuai Zhao, Ziyue Yang, Baining Guo, Zhengjun Zha, Peng Cheng ICML 2025 | January 2025
Publication Mitigating Hallucinations in Large Vision-Language Models via DPO: On-Policy Data Hold the Key Zhihe Yang, Xufang Luo, Dongqi Han, Yunjian Xu, Dongsheng Li 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) | January 2025, pp. 10610-10620
Publication rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking Xinyu Guan, Li Lyna Zhang, Yifei Liu, Ning Shang, Youran Sun, Yi Zhu, Fan Yang, Mao Yang ICML 2025 (oral) | January 2025
Publication EpiCoder: Encompassing Diversity and Complexity in Code Generation Yaoxiang Wang, Haoling Li, Xin Zhang, Jie Wu, Xiao Liu, Wenxiang Hu, Zhongxin Guo, Yangyu Huang, Ying Xin, Yujiu Yang, Jinsong Su, Qi Chen, Scarlett Li ICML 2025 | January 2025
Publication TimeDP: Learning to Generate Multi-Domain Time Series with Domain Prompts Yu-Hao Huang, Chang Xu, Yueying Wu, Jiang Bian January 2025
Publication TimeRAF: Retrieval-Augmented Foundation model for Zero-shot Time Series Forecasting Huanyu Zhang, Chang Xu, Yi-Fan Zhang, Zhang Zhang, Liang Wang, Tien-Ping Tan, Jiang Bian ArXiv | December 2024, Vol abs/2412.20810
Publication Bootstrap Your Own Context Length Liang Wang, Nan Yang, Xingxing Zhang, Xiaolong Huang, Furu Wei ArXiv | December 2024, Vol abs/2412.18860
Publication SCBench: A KV Cache-Centric Analysis of Long-Context Methods Yucheng Li, Huiqiang Jiang, Qianhui Wu, Xufang Luo, Surin Ahn, Chengruidong Zhang, Amir H. Abdi, Dongsheng Li, Jianfeng Gao, Yuqing Yang, Lili Qiu ICLR 2025 | December 2024
Publication InvDiff: Invariant Guidance for Bias Mitigation in Diffusion Models Min Hou, Yueying Wu, Chang Xu, Yu-Hao Huang, Chenxi Bai, Le Wu, Jiang Bian Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining V.1 | December 2024
Publication ElasTST: Towards Robust Varied-Horizon Forecasting with Elastic Time-Series Transformer Shun Zheng, Xumeng Wen, Jiang Bian 2024 Neural Information Processing Systems | December 2024