Bailan He

I am a third-year PhD student in Computer Science at Ludwig Maximilian University of Munich (LMU Munich) and Siemens AG, jointly supervised by Prof. Dr. Volker Tresp and Dr. Yushan Liu. My research centers on developing trustworthy, interpretable, and knowledge-aware generative foundation models, with particular emphasis on vision-language modeling, model safety & red-teaming, and hallucination. Prior to my PhD, I earned an M.Sc. in Statistics from LMU Munich and a B.Sc. in Statistics from the Southwestern University of Finance and Economics, China. I am affiliated with TRESP Lab, MCML, and relAI.


🔥 Always Hiring!
I am always looking for motivated students for research and master's thesis projects. If you're interested in working on LLMs, MLLMs, agents, safety, or diffusion models, feel free to email me your CV and transcript. Previously supervised projects have led to high-quality publications at top venues such as ICLR, COLM, and ACL.
Contact: bailan.he.de@gmail.com

news

09 / 2025 🏆 We are the winners of the Red‑Teaming Challenge hosted by and (Top 0.3% among a total of 5911 international participants)! Stay tuned for our detailed reports!

07 / 2025 🎉 One paper was accepted at COLM 2025! The topic is fact asymmetry in LLMs. Congrats to all co-authors!

selected publications

Under Review
Bag of Tricks for Subverting Reasoning-based Safety Guardrails

Shuo Chen, Zhen Han, Bailan He, and 5 more authors

Under Review, 2025


Deliberative Alignment improves LLM robustness to jailbreak attacks — but does it introduce new vulnerabilities? We designed five novel attacks targeting models with Deliberative Alignment: three bypassing reasoning and two exploiting reasoning itself, achieving over 80% attack success rate.
@article{chen2025bag,
  title = {Bag of Tricks for Subverting Reasoning-based Safety Guardrails},
  author = {Chen, Shuo and Han, Zhen and He, Bailan and Si, Shengyu and Wu, Jingpei and Torr, Philip and Tresp, Volker and Gu, Jindong},
  journal = {Under Review},
  year = {2025}
}
COLM
Supposedly Equivalent Facts That Aren’t? Entity Frequency in Pre-training Induces Asymmetry in LLMs

Yuan He, Bailan He, Zifeng Ding, and 8 more authors

Conference on Language Modeling, 2025


LLMs often produce asymmetric predictions for logically equivalent facts, leading to factual inconsistency and hallucination-like errors. We conducted a large-scale empirical study revealing that pre-training entity frequency distribution induces systematic bias in model predictions, identifying a root cause of factual inconsistency.
@article{he2025asymmetry,
  title = {Supposedly Equivalent Facts That Aren't? Entity Frequency in Pre-training Induces Asymmetry in LLMs},
  author = {He, Yuan and He, Bailan and Ding, Zifeng and Lupidi, Alisia and Zhu, Yuqicheng and Chen, Shuo and Zhang, Caiqi and Chen, Jiaoyan and Ma, Yunpu and Tresp, Volker and others},
  journal = {Conference on Language Modeling},
  year = {2025}
}
WACV
Can Multimodal Large Language Models Truly Perform Multimodal In-Context Learning?

Shuo Chen, Zhen Han, Bailan He, and 4 more authors

IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2025


MLLMs’ ability to leverage visual context for few-shot adaptation was not well understood. We conducted large-scale analysis of modality contributions and co-developed MMICES for mixed-modality demo selection, improving model accuracy on downstream tasks with more efficient demo selection strategies.
@article{chen2024multimodal,
  title = {Can Multimodal Large Language Models Truly Perform Multimodal In-Context Learning?},
  author = {Chen, Shuo and Han, Zhen and He, Bailan and Buckley, Mark and Torr, Philip and Tresp, Volker and Gu, Jindong},
  journal = {IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
  year = {2025}
}
ICLR
Red Teaming GPT-4V: Are GPT-4V Safe Against Uni/Multi-Modal Jailbreak Attacks?

Shuo Chen, Zhen Han, Bailan He, and 5 more authors

International Conference on Learning Representations (ICLR), 2024


MLLMs like GPT-4V lacked systematic evaluation against multimodal jailbreak attacks, especially those combining text and images. We developed a red-teaming benchmark with 1,445 harmful prompts across 11 safety categories covering uni- and multimodal attacks, benchmarking 11 proprietary and open-source models.
@article{chen2024red,
  title = {Red Teaming GPT-4V: Are GPT-4V Safe Against Uni/Multi-Modal Jailbreak Attacks?},
  author = {Chen, Shuo and Han, Zhen and He, Bailan and Ding, Zifeng and Yu, Wenqian and Torr, Philip and Tresp, Volker and Gu, Jindong},
  journal = {International Conference on Learning Representations (ICLR)},
  year = {2024}
}
ISWC
ForecastTKGQuestions: A Benchmark for Temporal Question Answering and Forecasting over Temporal Knowledge Graphs

Zifeng Ding, Zongyue Li, Ruoxia Qi, and 8 more authors

International Semantic Web Conference, 2023


Existing TKGQA benchmarks assume access to the full temporal KG, including future facts, so they cannot evaluate forecasting capabilities. We proposed a new task, forecasting TKGQA, and created a large-scale benchmark with entity prediction, yes/unknown, and fact reasoning questions.
@article{ding2023forecast,
  title = {{ForecastTKGQuestions}: A benchmark for temporal question answering and forecasting over temporal knowledge graphs},
  author = {Ding, Zifeng and Li, Zongyue and Qi, Ruoxia and Wu, Jingpei and He, Bailan and Ma, Yunpu and Meng, Zhao and Chen, Shuo and Liao, Ruotong and Han, Zhen and others},
  journal = {International Semantic Web Conference},
  pages = {541--560},
  year = {2023}
}
IJCNN
Learning meta-representations of one-shot relations for temporal knowledge graph link prediction

Zifeng Ding, Bailan He, Jingpei Wu, and 3 more authors

International Joint Conference on Neural Networks (IJCNN), 2023


Dynamic knowledge graphs require robust reasoning under temporal evolution, few-shot relations, and forward-looking tasks. We co-developed a lightweight temporal graph encoder and proposed concept-aware few-shot inductive methods and meta-representations for one-shot relations.
@inproceedings{ding2023meta,
  title = {Learning meta-representations of one-shot relations for temporal knowledge graph link prediction},
  author = {Ding, Zifeng and He, Bailan and Wu, Jingpei and Ma, Yunpu and Han, Zhen and Tresp, Volker},
  booktitle = {2023 International Joint Conference on Neural Networks (IJCNN)},
  pages = {1--10},
  year = {2023},
  organization = {IEEE}
}