Recent Updates

On leave from NYU to participate in the MATS Research Fellowship, advised by Shi Feng and Jacob Pfau.
Paper on poisoning jailbreak detection models accepted to ICLR 2026 AIWILD Workshop.

I do research on AI safety and security. Reach out via email if you’d like to discuss research or collaborate. I’m always open to talking about research ideas.

I’m broadly interested in the security and robustness of deep learning models, as well as in understanding how well they generalize in out-of-distribution settings. Right now, I’m particularly focused on the ability of LLMs to encode and interpret information that is subliminal to humans.

At NYU, I’m advised by Rico Angell and He He, working on jailbreak defenses and model interpretability. I was also extremely fortunate to be mentored by Chawin Sitawarin during my undergraduate studies at UC Berkeley, as part of David Wagner’s group.

I’m a fan of landscape photography. Here’s a randomly sampled photo from my portfolio, most of which was shot on my Canon EOS R50:

Publications

Stronger Universal and Transfer Attacks by Suppressing Refusals
Huang, D., Shah, A., Araujo, A., Wagner, D., & Sitawarin, C.
NAACL 2025 (abridged version accepted to NeurIPS SafeGenAI 2024), 2025
A novel algorithm that leverages a model’s refusal representation to automatically generate jailbreak suffixes for LLMs

Efficient Mitigation of Bus Bunching through Setter-Based Curriculum Learning
Shah, A., Tran, D., & Tang, Y.
arXiv, 2023
Explores efficient solutions for transportation optimization via model-based curriculum learning

Avidan Shah