09/02 |
Introduction |
Syllabus |
09/04 |
Benchmarking Secure Code Generation |
SecRepoBench: Benchmarking LLMs for Secure Code Generation in Real-World Repositories |
09/09 |
Prompt Injection |
Universal and Transferable Adversarial Attacks on Aligned Language Models
[Slides]
|
09/11 |
Prompt Injection Defense |
Defeating Prompt Injections by Design [Slides]
|
09/16 |
LLM for Vulnerability Detection |
IRIS: LLM-Assisted Static Analysis for Detecting Security Vulnerabilities
[Slides]
|
09/18 |
LLM for Vulnerability Detection |
SV-TrustEval-C: Evaluating Structure and Semantic Reasoning in Large Language Models for Source Code Vulnerability Analysis
[Slides]
|
09/23 |
Data Leakage in LLM Evaluation |
LessLeak-Bench: A First Investigation of Data Leakage in LLMs Across 83 Software Engineering Benchmarks
[Slides]
|
09/25 |
Patch Memorization in LLM-Based Program Repair |
Demystifying Memorization in LLM-Based Program Repair via a General Hypothesis Testing Framework
[Slides]
|
09/30 |
LLM for PoC Generation |
CyberGym: Evaluating AI Agents' Cybersecurity Capabilities with Real-World Vulnerabilities at Scale
[Slides]
|
10/02 |
Reinforcement Learning for Code Generation |
Training Language Models to Generate Quality Code with Program Analysis Feedback
[Slides]
|
10/07 |
Poisoning AI Coding Assistants |
XOXO: Stealthy Cross-Origin Context Poisoning Attacks against AI Coding Assistants
[Slides]
|
10/09 |
Mid-term Exam |
|
10/14 |
Fall Break |
|
10/16 |
Project Day |
|
10/21 |
Mid-term Project Report Due |
|
10/23 |
Secure Code Generation with Reasoning Models |
SCGAgent: Recreating the Benefits of Reasoning Models for Secure Code Generation with Agentic Workflows
[Slides]
|
10/28 |
LLM for Program Analysis |
The Hitchhiker's Guide to Program Analysis, Part II: Deep Thoughts by LLMs
[Slides]
|
10/30 |
Reinforcement Learning for Vulnerability Reasoning |
R2Vul: Learning to Reason about Software Vulnerabilities with Reinforcement Learning and Structured Reasoning Distillation
[Slides]
|
11/04 |
Security Evaluation of LLM Agents |
AgentDojo: A Dynamic Environment to Evaluate Prompt Injection Attacks and Defenses for LLM Agents
[Slides]
|
11/06 |
Security Policy for LLM Agents |
Progent: Programmable Privilege Control for LLM Agents
[Slides]
|
11/11 |
Multi-Turn Code Generation against Malicious Prompts |
MOCHA: Are Code Language Models Robust Against Multi-Turn Malicious Coding Prompts?
[Slides]
|
11/13 |
Privacy Leakage of AI Agents |
AgentDAM: Privacy Leakage Evaluation for Autonomous Web Agents
[Slides]
|
11/18 |
Pen Testing |
PentestGPT: An LLM-empowered Automatic Penetration Testing Tool
[Slides]
|
11/20 |
LLM Tailored Attacks |
LLMs unlock new paths to monetizing exploits
[Slides]
|
11/25 |
Thanksgiving Break |
|
11/27 |
Thanksgiving Break |
|
12/02 |
LLM for Network Attacks |
On the Feasibility of Using LLMs to Autonomously Execute Multi-host Network Attacks
[Slides]
|
12/04 |
LLM Code Reasoning |
CORE: Benchmarking LLMs Code Reasoning Capabilities through Static Analysis Tasks
[Slides]
|
12/09 |
Project Lightning Talks |
|
12/11 |
Project Lightning Talks |
|
12/16 |
Final Project Report Due |
|