Selected Publications
Selected papers. * denotes equal contribution / co-first author. See Google Scholar for the full list.
Model Tells You What to Discard: Adaptive KV Cache Compression for LLMs
Suyu Ge*, Yunan Zhang*, Liyuan Liu*, Minjia Zhang, Jiawei Han, Jianfeng Gao
ICLR, 2024 (Outstanding Paper Honorable Mention)
Paper
A Little Goes a Long Way: Efficient Long Context Training and Inference with Partial Contexts
Suyu Ge*, Xihui Lin*, Yunan Zhang*, Jiawei Han, Hao Peng
arXiv preprint, 2024
Paper
S2-Attention: Hardware-Aware Context Sharding among Attention Heads
Xihui Lin*, Yunan Zhang*, Suyu Ge*, Liliang Ren, Barun Patra, Vishrav Chaudhary, Hao Peng, Xia Song
arXiv preprint, 2024
Paper
MART: Improving LLM Safety with Multi-round Automatic Red-Teaming
Suyu Ge, Chunting Zhou, Rui Hou, Madian Khabsa, Yi-Chia Wang, Qifan Wang, Jiawei Han, Yuning Mao
NAACL, 2024
Paper

Experience
Research Scientist,
May 2025 - Oct 2025
Research Intern,
Jan 2024 - Dec 2024
Research Intern,
May 2023 - Dec 2023
Research Intern,
May 2022 - Aug 2022

Education
University of Illinois Urbana-Champaign
Ph.D. in Computer Science, 2021 - 2025
Tsinghua University
B.E. in Electronic Engineering, 2017 - 2021