Yizheng's Homepage

I am an Assistant Professor in the Department of Computer Science at the University of Maryland, College Park. Previously, I was a postdoc at UC Berkeley and Columbia University. I received my PhD in Computer Science from the Georgia Institute of Technology.

My research focuses on Large Language Models for Code Generation and AI for Security. My research group works on: enabling AI Coding Assistants to generate secure and correct code, benchmarking and building LLM agents to detect and patch security vulnerabilities, and the security and privacy issues of AI Agents. I’ve been honored with several awards, including the ACM CCS Best Paper Award Runner-up, the Google ASPIRE Award, the NSF CAREER Award, and the Anita Borg Memorial Scholarship.

News

I am recruiting PhD students. Prospective candidates: please fill out the questionnaire and email me at yzchen [at] umd [dot] edu

Papers

SecRepoBench: Benchmarking Code Agents for Secure Code Completion in Real-World Repositories [ preprint | code | huggingface | website ]
Chihao Shen, Connor Dilgren, Purva Chiniya, Luke Griffith, Yu Ding, and Yizheng Chen.
In LLM4Code Workshop Co-located with ICSE (LLM4Code 2026)

Benchmarking Correctness and Security in Multi-Turn Code Generation [ preprint | dataset | website ]
Ruchit Rawal, Jeffrey Yang Fan Chiang, Chihao Shen, Jeffery Siyuan Tian, Aastha Mahajan, Tom Goldstein, and Yizheng Chen.
In NeurIPS 2025 Workshop on Multi-Turn Interactions in Large Language Models

Locus: Agentic Predicate Synthesis for Directed Fuzzing [ pdf ]
Jie Zhu, Chihao Shen, Ziyang Li, Jiahao Yu, Yizheng Chen, and Kexin Pei.
In proceedings of the IEEE/ACM International Conference on Software Engineering (ICSE 2026)

Vulnerability Detection with Code Language Models: How Far Are We? [ pdf | PrimeVul dataset ]
Yangruibo Ding, Yanjun Fu, Omniyyah Ibrahim, Chawin Sitawarin, Xinyun Chen, Basel Alomair, David Wagner, Baishakhi Ray, and Yizheng Chen.
In proceedings of the IEEE/ACM International Conference on Software Engineering (ICSE 2025)
* PrimeVul was used by Google DeepMind's Gemini 1.5 Pro for vulnerability detection evaluation

Why Are Web AI Agents More Vulnerable Than Standalone LLMs? A Security Analysis [ pdf | website ]
Jeffrey Yang Fan Chiang*, Seungjae Lee*, Jia-Bin Huang, Furong Huang, and Yizheng Chen. (*Equal contribution)
In ICLR 2025 Workshop on Building Trust in LLMs and LLM Applications

ML-Based Behavioral Malware Detection Is Far From a Solved Problem [ pdf ]
Yigitcan Kaya, Yizheng Chen, Marcus Botacin, Shoumik Saha, Fabio Pierazzi, Lorenzo Cavallaro, David Wagner, and Tudor Dumitras.
In proceedings of the IEEE Conference on Secure and Trustworthy Machine Learning (SaTML 2025)

Continuous Learning for Android Malware Detection. [ pdf | code ]
Yizheng Chen, Zhoujie Ding, and David Wagner.
In proceedings of the 32nd USENIX Security Symposium (USENIX Security 2023)
* Top 10 Finalist of the CSAW'23 Applied Research Competition

DiverseVul: A New Vulnerable Source Code Dataset for Deep Learning Based Vulnerability Detection. [ pdf | dataset | metadata | label noise analysis ]
Yizheng Chen, Zhoujie Ding, Lamya Alowain, Xinyun Chen, and David Wagner.
In proceedings of the 26th International Symposium on Research in Attacks, Intrusions and Defenses (RAID 2023)

Part-Based Models Improve Adversarial Robustness. [ pdf ]
Chawin Sitawarin, Kornrapat Pongmala, Yizheng Chen, Nicholas Carlini, and David Wagner.
In the Eleventh International Conference on Learning Representations (ICLR 2023)

SEAT: Similarity Encoder by Adversarial Training for Detecting Model Extraction Attack Queries. [ pdf ]
Zhanyuan Zhang, Yizheng Chen, and David Wagner.
In proceedings of the 14th ACM Workshop on Artificial Intelligence and Security (AISec 2021).

Learning Security Classifiers with Verified Global Robustness Properties. [ pdf | code | errata ]
Yizheng Chen, Shiqi Wang, Yue Qin, Xiaojing Liao, Suman Jana, and David Wagner.
In proceedings of the 28th ACM Conference on Computer and Communications Security (CCS 2021)
* Best Paper Award Runner-Up

Cost-Aware Robust Tree Ensembles for Security Applications. [ pdf | code | appendix ]
Yizheng Chen, Shiqi Wang, Weifan Jiang, Asaf Cidon, and Suman Jana.
In proceedings of the 30th USENIX Security Symposium (USENIX Security 2021)
Blog post: Robust Trees for Security (4 min read).

On Training Robust PDF Malware Classifiers. [ pdf | code ]
Yizheng Chen, Shiqi Wang, Dongdong She, and Suman Jana.
In proceedings of the 29th USENIX Security Symposium (USENIX Security 2020)
Blog posts: Monotonic malware classifiers (5 min read), Gmail's malicious document classifier can still be trivially evaded (3 min read), How XGBoost enforces global monotonicity (2 min read).
More: MalGAN attack evaluation on robust PDF malware classifiers.

Neutaint: Efficient Dynamic Taint Analysis with Neural Networks. [ pdf ]
Dongdong She, Yizheng Chen, Abhishek Shah, Baishakhi Ray, and Suman Jana.
In proceedings of the 41st IEEE Symposium on Security and Privacy (S&P/Oakland 2020)

Enhancing Gradient-based Attacks with Symbolic Intervals. [ pdf | code ]
Shiqi Wang, Yizheng Chen, Ahmed Abdou, and Suman Jana.
In ICML Workshop on Security and Privacy of Machine Learning (SPML 2019).
Oral Presentation. Interval attacks appear on MadryLab MNIST Challenge Leaderboard.

FeatNet: Large-scale Fraud Device Detection by Network Representation Learning with Rich Features. [ pdf ]
Chao Xu, Zhentan Feng, Yizheng Chen, Minghua Wang, and Tao Wei.
In proceedings of the 11th ACM Workshop on Artificial Intelligence and Security (AISec 2018).

Practical Attacks Against Graph-based Clustering. [ pdf ]
Yizheng Chen, Yacin Nadji, Athanasios Kountouras, Fabian Monrose, Roberto Perdisci, Manos Antonakakis, and Nikolaos Vasiloglou.
In proceedings of the 24th ACM Conference on Computer and Communications Security (CCS 2017)
* Top 10 Finalist of the CSAW'17 Applied Research Competition

Hiding in Plain Sight: A Longitudinal Study of Combosquatting Abuse. [ pdf ]
Panagiotis Kintis, Najmeh Miramirkhani, Charles Lever, Yizheng Chen, Rosa Romero-Gómez, Nikolaos Pitropakis, Nick Nikiforakis, and Manos Antonakakis.
In proceedings of the 24th ACM Conference on Computer and Communications Security (CCS 2017)
News: Domain Name Wire, Georgia Tech, EurekAlert!, ZDNet, Domain Pulse, World Trademark Review, GIGALAW
Visualization: Combosquatting Clusters

Measuring Network Reputation in the Ad-Bidding Process. [ pdf ]
Yizheng Chen, Yacin Nadji, Rosa Romero-Gómez, Manos Antonakakis, and David Dagon.
In proceedings of the 14th Conference on Detection of Intrusions and Malware & Vulnerability Assessment (DIMVA 2017)

Enabling Network Security Through Active DNS Datasets. [ pdf | data ]
Athanasios Kountouras, Panagiotis Kintis, Chaz Lever, Yizheng Chen, Yacin Nadji, David Dagon, Manos Antonakakis, and Rodney Joffe.
In proceedings of the 19th International Symposium on Research in Attacks, Intrusions and Defenses (RAID 2016)
Dataset Contribution: Active DNS Dataset

Financial Lower Bounds of Online Advertising Abuse. [ pdf | TDSS-TDL4 Domains ]
Yizheng Chen, Panagiotis Kintis, Manos Antonakakis, Yacin Nadji, David Dagon, Wenke Lee, and Michael Farrell.
In proceedings of the 13th Conference on Detection of Intrusions and Malware & Vulnerability Assessment (DIMVA 2016)

On the Feasibility of Large-Scale Infections of iOS Devices. [ pdf ]
Tielei Wang, Yeongjin Jang, Yizheng Chen, Pak-Ho Chung, Billy Lau, and Wenke Lee.
In proceedings of the 23rd USENIX Security Symposium (USENIX Security 2014)
News: The Register, Wired, Tom's Guide, ComputerWorld, PCWorld

DNS Noise: Measuring the Pervasiveness of Disposable Domains in Modern DNS Traffic. [ pdf ]
Yizheng Chen, Manos Antonakakis, Roberto Perdisci, Yacin Nadji, David Dagon, and Wenke Lee.
In proceedings of the 44th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN 2014)