New Technical Defenses and Data Leakage Risks
Key Questions
What is Hirundo Gemma 4 and how does it improve AI security?
Hirundo Gemma 4 is a 4B-parameter model using weight unlearning to resist prompt injection. It outperforms much larger models according to DeepMind evaluations on security benchmarks.
Why is prompt injection considered inherent to LLMs?
Prompt injection exploits the fundamental way language models process instructions and context. This means architectural and runtime defenses are required rather than simple patching.
How prevalent is sensitive data leakage in ChatGPT prompts?
Research found that 4.37% of prompts to ChatGPT contain sensitive financial data. This highlights user behavior risks when interacting with AI tools for personal tasks.
What does the TELUS study reveal about AI attack surfaces?
The TELUS study subjected 34 AI models to 620,000 adversarial attacks to quantify exposure. Results provide data-driven guidance on closing common AI vulnerabilities.
How does explainable AI help address cybersecurity trust gaps?
Explainable AI techniques clarify model decisions, helping organizations bridge the gap between AI capabilities and verifiable security outcomes. Nearly half of organizations report related trust concerns.
Hirundo Gemma 4 (4B) with weight unlearning resists prompt injection, beats larger models per DeepMind. Prompt injection inherent to LLMs, requiring architectural/runtime defenses. 4.37% ChatGPT prompts leak sensitive financial data. TELUS 620k-attack study quantifies AI surfaces.