Towards Secure Large Language Models: From Model to System
We are witnessing a paradigm shift in AI, transitioning from deep learning models to the era of Large Language Models (LLMs). This shift marks a transformative advancement, enabling AI to be deployed in diverse, safety-critical real-world applications. Despite these impressive achievements, a fundamental question remains: are LLMs truly ready for safe and secure use?
In this talk, I will show how my research embeds a computer security mindset to answer this question. To understand and build safe and secure LLMs, the talk adopts two core system perspectives: (1) investigating the lifecycle of LLMs and (2) analyzing the information flow of LLMs within agentic systems. I will discuss how to develop principled red-teaming frameworks that systematically evaluate LLM safety. I will then highlight why model-level defenses alone are insufficient for securing LLMs, introduce security vulnerabilities that arise at the system level, and present principled defense solutions toward secure LLMs.