08/29 |
Introduction |
Syllabus
|
08/31 |
Large Language Models for Vulnerability Detection |
DiverseVul: A New Vulnerable Source Code Dataset for Deep Learning Based Vulnerability Detection
[ Slides ]
CodeT5: Identifier-aware Unified Pre-trained Encoder-Decoder Models for Code Understanding and Generation
NatGen: Generative pre-training by “Naturalizing” source code
|
09/05 |
Prompt Injection Attacks 1 |
Black Box Adversarial Prompting for Foundation Models
[ Slides ]
|
09/07 |
Prompt Injection Attacks 2 |
Not what you've signed up for: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection
[ Slides ]
Universal and Transferable Adversarial Attacks on Aligned Language Models
[ Website | Slides ]
|
09/12 |
Robustness Evaluation of Large Language Models 1 |
Adversarial GLUE: A Multi-Task Benchmark for Robustness Evaluation of Language Models
[ Website | Slides ]
Robustness Over Time: Understanding Adversarial Examples' Effectiveness on Longitudinal Versions of Large Language Models
[ Slides ]
PromptBench: Towards Evaluating the Robustness of Large Language Models on Adversarial Prompts
[ Optional ]
How Is ChatGPT’s Behavior Changing over Time?
[ Optional ]
|
09/14 |
Robustness Evaluation of Large Language Models 2 |
ReCode: Robustness Evaluation of Code Generation Models
[ Slides ]
Certifying LLM Safety against Adversarial Prompting
[ Slides ]
Baseline Defenses for Adversarial Attacks Against Aligned Language Models
[ Optional ]
|
09/19 |
Security of Code Generation Models |
The Base-Rate Fallacy and the Difficulty of Intrusion Detection
[ Slides ]
Asleep at the Keyboard? Assessing the Security of GitHub Copilot’s Code Contributions
[ Slides ]
|
09/21 |
Watermarking 1 |
A Watermark for Large Language Models
[ Slides ]
Can AI-Generated Text be Reliably Detected?
[ Slides ]
On the Possibilities of AI-Generated Text Detection
|
09/26 |
Secure Code Generation |
Large Language Models for Code: Security Hardening and Adversarial Testing
[ Slides ]
SecurityEval dataset: mining vulnerability examples to evaluate machine learning-based code generation techniques
[ Slides ]
|
09/28 |
Adversarial Robustness of Pre-trained Models |
How Should Pre-Trained Language Models Be Fine-Tuned Towards Adversarial Robustness?
[ Slides ]
Fine-Tuning can Distort Pretrained Features and Underperform Out-of-Distribution
[ Slides ]
|
10/03 |
LLM for Binary Analysis |
PalmTree: Learning an Assembly Language Model for Instruction Embedding
Trex: Learning Execution Semantics from Micro-Traces for Binary Similarity
[ Slides ]
|
10/05 |
Poisoning |
TrojanPuzzle: Covertly Poisoning Code-Suggestion Models
[ Slides ]
On the Exploitability of Instruction Tuning
[ Slides ]
|
10/10 |
Training Data Extraction |
Extracting Training Data from Large Language Models
[ Slides ]
Quantifying Memorization Across Neural Language Models
[ Slides ]
CodexLeaks: Privacy Leaks from Code Generation Language Models in GitHub Copilot
|
10/12 |
Privacy of Large Language Models |
Provably Confidential Language Modeling
[ Slides | TA's ML Privacy Slides ]
Just fine-tune twice: Selective differential privacy for large language models
[ Slides ]
Deep Learning with Differential Privacy
[ Optional ]
|
10/17 |
Take-home Exam |
|
10/19 |
Coding Assistants and Users |
Do Users Write More Insecure Code with AI Assistants?
[ Slides ]
Lost at C: A User Study on the Security Implications of Large Language Model Code Assistants
|
10/24 |
Zero-Shot Capabilities (TA) |
Pop Quiz! Can a Large Language Model Help With Reverse Engineering?
[ Slides ]
Examining Zero-Shot Vulnerability Repair with Large Language Models
[ Slides ]
|
10/26 |
Project Day (TA) |
|
10/31 |
Project Day (TA) |
|
11/02 |
Watermarking 2 |
MGTBench: Benchmarking Machine-Generated Text Detection
[ Slides ]
Evading Watermark based Detection of AI Generated Content
[ Slides ]
|
11/07 |
Security Risks of Generative AI |
Identifying and Mitigating the Security Risks of Generative AI
[ Slides ]
LLM Platform Security: Applying a Systematic Evaluation Framework to OpenAI’s ChatGPT Plugins
[ Slides ]
Mid-term project report due
|
11/09 |
LLM for Software Security |
Augmenting Greybox Fuzzing with Generative AI
[ Slides ]
Fuzzing deep-learning libraries via large language models
[ Slides ]
The FormAI Dataset: Generative AI in Software Security Through the Lens of Formal Verification
[ Slides ]
|
11/14 |
Backdoor and Copyright |
Backdooring Neural Code Search
[ Slides ]
Glaze: Protecting Artists from Style Mimicry by Text-to-Image Models
[ Class Slides | Author's Slides ]
|
11/16 |
Defense against Backdoors |
IMBERT: Making BERT Immune to Insertion-based Backdoor Attacks
[ Slides ]
ParaFuzz: An Interpretability-Driven Technique for Detecting Poisoned Samples in NLP
[ Slides ]
GPTs Don’t Keep Secrets: Searching for Backdoor Watermark Triggers in Autoregressive Language Models
|
11/21 |
Thanksgiving Break |
|
11/23 |
Thanksgiving Break |
|
11/28 |
Toxic Content Detection |
You Only Prompt Once: On the Capabilities of Prompt Learning on Large Language Models to Tackle Toxic Content
[ Slides ]
|
11/30 |
LLM for Vulnerability Repair |
VulRepair: A T5-Based Automated Software Vulnerability Repair
[ Slides ]
Fixing Hardware Security Bugs with Large Language Models
|
12/05 |
Cybercrime |
“Do Anything Now”: Characterizing and Evaluating In-The-Wild Jailbreak Prompts on Large Language Models
[ Slides ]
Devising and Detecting Phishing: Large Language Models vs. Smaller Human Models
[ Slides ]
|
12/07 |
LLM for Network Security |
Can Language Models Help in System Security? Investigating Log Anomaly Detection using BERT
[ Slides ]
ET-BERT: A Contextualized Datagram Representation with Pre-training Transformers for Encrypted Traffic Classification
[ Slides ]
|
12/12 |
Reading Day. No Class. |
|
12/14 |
No Class. Final project report due. |
|