Building a Hybrid Rule-Based and Machine Learning Framework to Detect and Defend Against Jailbreak Prompts in LLM Systems

In this tutorial, we build a jailbreak defense step by step to detect and safely handle policy-evasion prompts. We generate realistic attack and benign examples, craft rule-based signals, and combine them with TF-IDF features in a compact, interpretable classifier that catches evasive prompts without blocking legitimate requests. We demonstrate evaluation metrics, explain…
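The hybrid approach the tutorial describes can be sketched in a few lines. This is a minimal, self-contained illustration, not the tutorial's actual code: the regex patterns, token weights, and threshold are hypothetical stand-ins (a simple keyword score stands in for the TF-IDF features, which in practice would be learned from the generated examples).

```python
import re

# Illustrative rule patterns (hypothetical); real signals would be
# distilled from red-team analysis of actual jailbreak prompts.
RULE_PATTERNS = [
    (re.compile(r"ignore (all )?(previous|prior) instructions", re.I), 2.0),
    (re.compile(r"pretend (you are|to be)", re.I), 1.0),
    (re.compile(r"no (ethical|safety) (guidelines|restrictions)", re.I), 2.0),
    (re.compile(r"\bjailbreak\b", re.I), 1.5),
]

# Light lexical weights standing in for learned TF-IDF features.
SUSPICIOUS_TOKENS = {"bypass": 0.5, "unfiltered": 0.5, "hypothetically": 0.3}

def rule_score(prompt: str) -> float:
    """Sum the weights of every rule pattern that fires on the prompt."""
    return sum(weight for pattern, weight in RULE_PATTERNS if pattern.search(prompt))

def token_score(prompt: str) -> float:
    """Crude lexical score: total weight of suspicious tokens present."""
    words = set(re.findall(r"[a-z']+", prompt.lower()))
    return sum(w for tok, w in SUSPICIOUS_TOKENS.items() if tok in words)

def is_jailbreak(prompt: str, threshold: float = 1.5) -> bool:
    """Flag the prompt when combined rule + lexical evidence crosses the threshold."""
    return rule_score(prompt) + token_score(prompt) >= threshold
```

Because both signal sources are inspectable, a flagged prompt can be explained by listing which patterns fired, which is the interpretability benefit the tutorial emphasizes.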

Read More
LlamaFirewall: Open-source framework to detect and mitigate AI centric security risks – Help Net Security

LlamaFirewall is a system-level security framework for LLM-powered applications, built with a modular design to support layered, adaptive defense. It is designed to mitigate a wide spectrum of AI agent security risks, including jailbreaking, indirect prompt injection, goal hijacking, and insecure code outputs.

Why Meta created LlamaFirewall

LLMs are moving far beyond simple chatbot…
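To make the "modular, layered defense" idea concrete, here is a generic sketch of a scanner pipeline. This is an illustration of the architectural pattern only; the scanner names, checks, and interfaces below are invented for this example and are not LlamaFirewall's actual API.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class ScanResult:
    blocked: bool
    reason: str = ""

# Each layer is an independent scanner; any one of them can block the text.
# The checks here are deliberately trivial placeholders.
def prompt_injection_scanner(text: str) -> ScanResult:
    if "ignore previous instructions" in text.lower():
        return ScanResult(True, "possible prompt injection")
    return ScanResult(False)

def insecure_code_scanner(text: str) -> ScanResult:
    if "os.system(" in text:
        return ScanResult(True, "shell execution in generated code")
    return ScanResult(False)

def run_pipeline(text: str,
                 scanners: List[Callable[[str], ScanResult]]) -> ScanResult:
    """Apply scanners in order; the first layer that blocks stops the pipeline."""
    for scan in scanners:
        result = scan(text)
        if result.blocked:
            return result
    return ScanResult(False)
```

The modularity matters because new risk categories (goal hijacking, tool misuse) can be addressed by appending a scanner without touching the existing layers.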

Read More