A Comprehensive Review of Prompt Injection Attacks and Defense Mechanisms in Large Language Models
DOI: https://doi.org/10.61173/390f5h97

Keywords: Large Language Models, Prompt Injection Attacks, Defense Mechanisms, GCG Algorithm, Semantic Manipulation, Resource Exploitation, Adaptive Defense, Cybersecurity

Abstract
This review analyzes prompt injection attacks on large language models (LLMs) from 2019 to 2025, addressing critical security challenges as models such as ChatGPT proliferate across sectors. We synthesize advances in detection, classification, and mitigation, proposing a tripartite framework that categorizes attacks by vector (text, image, speech), mechanism (semantic manipulation, resource exploitation), and impact (data breaches, privacy violations). Key attack techniques include the GCG algorithm, DAN jailbreaks, and resource-exhaustion tactics (e.g., Engorgio). We evaluate current defenses for efficacy, highlighting scalability gaps and trade-offs between security and model utility. Future priorities include adaptive defense systems leveraging reinforcement learning, interdisciplinary collaboration at the intersection of ethics and engineering, and open threat-intelligence networks for proactive vulnerability management. This work equips researchers and practitioners with actionable strategies to secure LLM ecosystems against evolving adversarial threats.