Smaller reasoning models tend to generate longer, more redundant trajectories than larger ones, inflating token usage and inference cost. We proposed Mixed-Policy Distillation (MPD), where a teacher rewrites student-sampled trajectories into concise traces for KL-based alignment; on Qwen3-1.7B it cuts token usage by up to 27.1% while improving accuracy across multiple reasoning benchmarks.
@article{yang2026reasoning,title={Reasoning Compression with Mixed-Policy Distillation},author={Yang, Han and Wu, Mingyan and He, Bailan and Cao, Zeyu and Yan, Sikuan and Lin, Kevin Qinghong and Ding, Zifeng},journal={Preprint},year={2026},}
2025
Preprint
Deep Research Brings Deeper Harm
Shuo Chen, Zonggen Li, Zhen Han, and 7 more authors
Deep Research agents built on LLMs can decompose tasks, retrieve online information, and synthesize detailed reports — but this power amplifies misuse risk in high-stakes domains like biosecurity. We proposed two jailbreak strategies, Plan Injection and Intent Hijack, showing that alignment often fails in DR agents: multi-step planning weakens safeguards and yields more coherent, professional, and dangerous content than standalone LLMs.
@article{chen2025deepresearch,title={Deep Research Brings Deeper Harm},author={Chen, Shuo and Li, Zonggen and Han, Zhen and He, Bailan and Liu, Tong and Chen, Haokun and Groh, Georg and Torr, Philip and Tresp, Volker and Gu, Jindong},journal={Preprint},year={2025},}
Preprint
Distilling Tool Knowledge into Language Models via Back-Translated Traces
Xingyue Huang, Xianglong Hu, Zifeng Ding, and 9 more authors
Tool-integrated reasoning ensures correctness on math problems but adds inference-time dependencies that hinder deployment. We built a Solver Agent and a back-translation pipeline that converts interleaved tool-call traces into natural-language reasoning traces; fine-tuning a small open-source model on them internalizes tool knowledge and improves competition-level math performance without tool access at inference.
@article{huang2025distilling,title={Distilling Tool Knowledge into Language Models via Back-Translated Traces},author={Huang, Xingyue and Hu, Xianglong and Ding, Zifeng and He, Yuan and Rishabh and Alzarooni, Waleed and Ye, Ziyu and Fan, Wendong and He, Bailan and Bo, Haige and Hu, Changran and Li, Guohao},journal={Preprint},year={2025},}
Under Review
Bag of Tricks for Subverting Reasoning-based Safety Guardrails
Shuo Chen, Zhen Han, Bailan He, and 5 more authors
Deliberative Alignment improves LLM robustness to jailbreak attacks — but does it introduce new vulnerabilities? We designed five novel attacks targeting models with Deliberative Alignment: three bypassing reasoning and two exploiting reasoning itself, achieving over 80% attack success rate.
@article{chen2025bag,title={Bag of Tricks for Subverting Reasoning-based Safety Guardrails},author={Chen, Shuo and Han, Zhen and He, Bailan and Si, Shengyu and Wu, Jingpei and Torr, Philip and Tresp, Volker and Gu, Jindong},journal={Under Review},year={2025},}
COLM
Supposedly Equivalent Facts That Aren’t? Entity Frequency in Pre-training Induces Asymmetry in LLMs
Yuan He, Bailan He, Zifeng Ding, and 8 more authors
LLMs often produce asymmetric predictions for logically equivalent facts, leading to factual inconsistency and hallucination-like errors. We conducted a large-scale empirical study showing that the entity frequency distribution in pre-training data induces systematic bias in model predictions, which points to a root cause of factual inconsistency.
@article{he2025asymmetry,title={Supposedly Equivalent Facts That Aren't? Entity Frequency in Pre-training Induces Asymmetry in LLMs},author={He, Yuan and He, Bailan and Ding, Zifeng and Lupidi, Alisia and Zhu, Yuqicheng and Chen, Shuo and Zhang, Caiqi and Chen, Jiaoyan and Ma, Yunpu and Tresp, Volker and others},year={2025},journal={Conference on Language Modeling}}
WACV
Can Multimodal Large Language Models Truly Perform Multimodal In-Context Learning?
Shuo Chen, Zhen Han, Bailan He, and 4 more authors
IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2025
MLLMs’ ability to leverage visual context for few-shot adaptation was not well understood. We conducted a large-scale analysis of modality contributions and co-developed MMICES for mixed-modality demonstration selection, improving model accuracy on downstream tasks with more efficient demonstration selection strategies.
@article{chen2024multimodal,title={Can Multimodal Large Language Models Truly Perform Multimodal In-Context Learning?},author={Chen, Shuo and Han, Zhen and He, Bailan and Buckley, Mark and Torr, Philip and Tresp, Volker and Gu, Jindong},journal={IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},year={2025},}
2024
ICLR
Red Teaming GPT-4V: Are GPT-4V Safe Against Uni/Multi-Modal Jailbreak Attacks?
Shuo Chen, Zhen Han, Bailan He, and 5 more authors
International Conference on Learning Representations (ICLR), 2024
MLLMs like GPT-4V lacked systematic evaluation against multimodal jailbreak attacks, especially those combining text and images. We developed a red-teaming benchmark with 1,445 harmful prompts across 11 safety categories covering uni- and multimodal attacks, benchmarking 11 proprietary and open-source models.
@article{chen2024red,title={Red Teaming GPT-4V: Are GPT-4V Safe Against Uni/Multi-Modal Jailbreak Attacks?},author={Chen, Shuo and Han, Zhen and He, Bailan and Ding, Zifeng and Yu, Wenqian and Torr, Philip and Tresp, Volker and Gu, Jindong},journal={International Conference on Learning Representations (ICLR)},year={2024},}
2023
ISWC
ForecastTKGQuestions: A Benchmark for Temporal Question Answering and Forecasting over Temporal Knowledge Graphs
Zifeng Ding, Zongyue Li, Ruoxia Qi, and 8 more authors
Existing TKGQA benchmarks assume access to the full temporal KG, including future facts, so they cannot evaluate forecasting capabilities. We proposed a new task, "forecasting TKGQA", and created a large-scale benchmark with entity prediction, yes/unknown, and fact reasoning questions.
@article{ding2023forecast,title={ForecastTKGQuestions: A Benchmark for Temporal Question Answering and Forecasting over Temporal Knowledge Graphs},author={Ding, Zifeng and Li, Zongyue and Qi, Ruoxia and Wu, Jingpei and He, Bailan and Ma, Yunpu and Meng, Zhao and Chen, Shuo and Liao, Ruotong and Han, Zhen and others},journal={International Semantic Web Conference},pages={541--560},year={2023},}
IJCNN
Learning Meta-Representations of One-Shot Relations for Temporal Knowledge Graph Link Prediction
Zifeng Ding, Bailan He, Jingpei Wu, and 3 more authors
Dynamic knowledge graphs require robust reasoning under temporal evolution, few-shot relations, and forward-looking tasks. We co-developed a lightweight temporal graph encoder and proposed concept-aware few-shot inductive methods and meta-representations for one-shot relations.
@inproceedings{ding2023meta,title={Learning meta-representations of one-shot relations for temporal knowledge graph link prediction},author={Ding, Zifeng and He, Bailan and Wu, Jingpei and Ma, Yunpu and Han, Zhen and Tresp, Volker},booktitle={2023 International Joint Conference on Neural Networks (IJCNN)},pages={1--10},year={2023},organization={IEEE},}