Introduction to Sentinel
Safety for AI that Acts: From Chatbots to RobotsSentinel is an AI safety framework that protects across three surfaces: LLMs (text safety), Agents (action safety), and Robots (physical safety). One framework, three attack surfaces.
The Problem
AI systems are increasingly autonomous, making decisions that affect the real world:
- LLMs: Chatbots, assistants, customer service
- Agents: Autonomous code execution, tool-use, trading
- Robots: LLM-powered robots, industrial systems, drones
Without proper safety measures, these systems are vulnerable to prompt injection, jailbreaking, data exfiltration, and unintended harmful actions.
The Solution: THSP Protocol
Sentinel implements the THSP protocol, a four-gate validation system:
| Gate | Question | Failure Condition |
|---|---|---|
| Truth | Does this involve deception? | Creating/spreading false information |
| Harm | Could this cause damage? | Physical, psychological, financial harm |
| Scope | Is this within appropriate limits? | Exceeding authority, bypassing consent |
| Purpose | Does this serve legitimate benefit? | No genuine value to anyone |
Every input and output must pass all four gates. The absence of harm is not sufficient; there must be genuine purpose.
Validated Results
Tested across 4 benchmarks on 6 models with 97.6% average safety rate:
| Benchmark | Attack Surface | Safety Rate |
|---|---|---|
| HarmBench | LLM (Text) | 96.7% |
| SafeAgentBench | Agent (Digital) | 97.3% |
| BadRobot | Robot (Physical) | 99.3% |
| JailbreakBench | All surfaces | 97% |
Core Components
- SentinelValidator v3.0: Unified 4-layer validation (L1 Input, L2 Seed, L3 Output, L4 Observer)
- THSP Protocol: Four-gate validation (Truth, Harm, Scope, Purpose)
- Alignment Seeds: System prompts that shape LLM behavior
- Input/Output Validators: Pattern detection with 20+ detector types
- Memory Integrity: HMAC-based protection against memory injection
- Database Guard: SQL injection and data exfiltration prevention
- Fiduciary AI: Ensures AI acts in user's best interest
- EU AI Act Compliance: Regulation 2024/1689 Article 5 checker
- OWASP Agentic AI: 65% coverage (5 full, 3 partial)
Framework Support
Native integrations for 23+ frameworks:
- Agent Frameworks: LangChain, LangGraph, CrewAI, DSPy, Letta, AutoGPT
- LLM Providers: OpenAI, Anthropic, Google ADK
- Blockchain: Solana Agent Kit, Coinbase AgentKit, Virtuals
- Robotics: ROS2, NVIDIA Isaac Lab
- Security: Garak, PyRIT, OpenGuardrails
Getting Started
pip install sentinelseed
from sentinelseed import Sentinel
sentinel = Sentinel(seed_level="standard")
is_safe, violations = sentinel.validate("Your content here")
See the Quick Start guide to get running in minutes.