
Detect

Building a Hybrid Rule-Based and Machine Learning Framework to Detect and Defend Against Jailbreak Prompts in LLM Systems
In this tutorial, we introduce a jailbreak defense that we build step by step to detect and safely handle policy-evasion prompts. We generate realistic attack and benign examples, craft rule-based signals, and combine those with TF-IDF features into a compact, interpretable classifier so we can catch evasive prompts without blocking legitimate requests. We demonstrate evaluation metrics, explain…

Meet ShadowLeak: ‘Impossible to detect’ data theft using AI
For years, threat actors have used social engineering to trick employees into helping them steal corporate data. Now a cybersecurity firm has found a way to trick an AI agent or chatbot into bypassing its security protections. What’s new is that the exfiltration of the stolen data evades detection by going through the agent’s cloud…

LlamaFirewall: Open-source framework to detect and mitigate AI centric security risks – Help Net Security
LlamaFirewall is a system-level security framework for LLM-powered applications, built with a modular design to support layered, adaptive defense. It is designed to mitigate a wide spectrum of AI agent security risks, including jailbreaking, indirect prompt injection, goal hijacking, and insecure code outputs. Why Meta created LlamaFirewall: LLMs are moving far beyond simple chatbot…

3 Ways to Detect Fake AI Generated Videos Online
Similar to Deepware, you need to upload the video to the portal, after which the tool takes a while to analyse it. The tool also lets you submit the URL of any video on the internet to scan it. Deepware AI is a free and powerful tool that instantly scans if a…