I am currently pursuing a combined B.S./M.S./Ph.D. program in Computer Science at Northeastern University, presently in the master’s stage. My research focuses on reasoning and retrieval for large language models in legal scenarios, with broader interests spanning NLP, Information Retrieval, and LLM applications. I am studying at NEUIR Lab under the guidance of Associate Professor Zhenghao Liu, as well as at Tsinghua University, supervised by Chaojun Xiao.

🔥 News

2025.10: Our paper has been awarded the Best Paper Award at the 2025 International Conference on Advanced Data Mining and Applications (ADMA 2025)! 🏆

📝 Publications

* indicates equal contribution, and † indicates corresponding author.

ACL2026

Revealing the Attention Floating Mechanism in Masked Diffusion Models

Xin Dai, Pengcheng Huang, Zhenghao Liu†, Shuo Wang, Yukun Yan, Chaojun Xiao, Yu Gu, Ge Yu, Maosong Sun

📃Paper

💻Code

CCF-A

This work uncovers the attention floating mechanism in masked diffusion models, showing how attention anchors move dynamically across denoising steps and layers. The analysis provides a mechanistic explanation for their strong in-context learning behavior and strong performance on knowledge-intensive tasks.

ACL2026

SAFE-QAQ: End-to-End Slow-Thinking Audio-Text Fraud Detection via Reinforcement Learning

Peidong Wang*, Zhiming Ma*, Xin Dai*, Yongkang Liu, Shi Feng, Xiaocui Yang, Wenxing Hu, Zhihao Wang, Mingjun Pan, Li Yuan, Daling Wang

📃Paper

💻Code

CCF-A

This work presents an end-to-end audio-text fraud detection framework that directly models both speech signals and textual content with slow-thinking reinforcement learning. It captures fine-grained acoustic cues missed by text-only methods and supports more robust and real-time fraud detection.

ICASSP2026

LegalΔ: Enhancing Legal Reasoning in LLMs via Reinforcement Learning with Chain-of-Thought Guided Information Gain

Xin Dai*, Buqiang Xu*, Zhenghao Liu†, Yukun Yan, Huiyuan Xie, Xiaoyuan Yi, Shuo Wang, Ge Yu

📃Paper

💻Code

CCF-B

This work introduces a reinforcement learning framework for legal LLMs that explicitly encourages high-quality chain-of-thought reasoning through information gain. It improves both prediction accuracy and interpretability, making legal judgments more reliable and better grounded in multi-step reasoning.

ADMA2025

LegalDuet: Learning Fine-Grained Representations for Legal Judgment Prediction via a Dual-View Contrastive Learning

Buqiang Xu*, Xin Dai*, Zhenghao Liu†, Huiyuan Xie, Xiaoyuan Yi, Shuo Wang, Yukun Yan, Liner Yang, Yu Gu, Ge Yu

📃Paper

💻Code

CCF-C

This work proposes a dual-view contrastive learning framework to learn fine-grained representations for legal cases and judgments. By jointly modeling case similarity and decision-level distinctions, it improves legal judgment prediction on challenging and easily confusable cases.

🎖 Honors and Awards

2025.11 President Scholarship, First-Class Excellent Student Scholarship.
2024.11 National Scholarship, First-Class Excellent Student Scholarship, and Outstanding Student Award.
2023.11 National Scholarship, First-Class Excellent Student Scholarship, Outstanding Student Role Model.
2022.11 Second-Class Excellent Student Scholarship and Outstanding Student Award.

📖 Educations

2025.09 - now, M.S. School of Computer Science and Engineering, Northeastern University.
2021.09 - 2025.06, B.S. School of Computer Science and Engineering, Northeastern University.

💻 Internships

2025.10 - now, ModelBest, China.