# Agent Protection Guide

A complete guide to protecting AI agents with Sentinel.
## The Challenge
AI agents make autonomous decisions that affect the real world:

- Execute code
- Make API calls
- Transfer funds
- Control robots

Without protection, they're vulnerable to:

- Prompt injection
- Goal hijacking
- Memory poisoning
- Data exfiltration
## Solution: Multi-Layer Protection
### Layer 1: Input Validation

```python
from sentinelseed.detection import InputValidator

validator = InputValidator()
result = validator.validate(user_input)
if not result.safe:
    reject_input(result.issues)
```
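Whatever validator you use, the surrounding code should fail closed: a validator error must block the input, not let it through. A minimal sketch of that wrapper pattern, assuming only the result shape shown above (`.safe` and `.issues`) — `ValidationResult` and `gate_input` are illustrative names, not part of sentinelseed:

```python
from dataclasses import dataclass, field
from typing import List


# Stub mirroring the result shape shown above (.safe, .issues);
# the real object comes from sentinelseed's InputValidator.
@dataclass
class ValidationResult:
    safe: bool
    issues: List[str] = field(default_factory=list)


def gate_input(validator, user_input: str) -> str:
    """Return the input only if validation passes; raise otherwise (fail closed)."""
    try:
        result = validator.validate(user_input)
    except Exception as exc:
        # A crashing validator is treated as a block, not a pass.
        raise ValueError(f"validation error: {exc}") from exc
    if not result.safe:
        raise ValueError(f"unsafe input: {result.issues}")
    return user_input
```

The `try/except` is the important part: without it, a bug or timeout in the validation layer silently becomes an open gate.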
### Layer 2: Seed Injection

```python
from sentinelseed import Sentinel

sentinel = Sentinel(seed_level="standard")
seed = sentinel.get_seed()

messages = [
    {"role": "system", "content": seed},
    {"role": "user", "content": safe_input},
]
```
### Layer 3: Output Validation

```python
from sentinelseed.detection import OutputValidator

validator = OutputValidator()
result = validator.validate(agent_response)
if not result.safe:
    block_response(result.issues)
```
### Layer 4: Action Validation

```python
# `sentinel` is the instance created in Layer 2.
is_safe, concerns = sentinel.validate_action(
    "Transfer $500 to external wallet"
)
if not is_safe:
    require_human_approval(concerns)
```
## Framework Integration

### LangChain

```python
from langchain_openai import ChatOpenAI

from sentinelseed.integrations.langchain import SentinelCallback, SentinelGuard

callback = SentinelCallback(on_violation="log")
llm = ChatOpenAI(callbacks=[callback])

guard = SentinelGuard(agent, block_unsafe=True)
result = guard.invoke({"input": "Your task"})
```
### CrewAI

```python
from sentinelseed.integrations.crewai import SentinelCrew, safe_agent

crew = SentinelCrew(
    agents=[safe_agent(researcher), safe_agent(writer)],
    tasks=tasks,
    block_unsafe=True,
)
```
## Memory Protection

```python
from sentinelseed.memory import MemoryIntegrityChecker

checker = MemoryIntegrityChecker(
    secret_key="your-secret",
    validate_content=True,
)

signed_entry = checker.sign_entry(memory_entry)
result = checker.verify_entry(signed_entry)
```
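The idea behind sign/verify is that every memory entry carries a keyed signature, so an entry modified by a poisoning attack fails verification before the agent trusts it. How `MemoryIntegrityChecker` computes its signatures internally isn't shown here; a minimal sketch of the underlying technique using the Python standard library's HMAC support (the `sign_entry`/`verify_entry` names are borrowed from the API above, but this implementation is illustrative, not sentinelseed's):

```python
import hashlib
import hmac
import json

SECRET_KEY = b"your-secret"  # in practice, load from a secrets manager


def sign_entry(entry: dict) -> dict:
    """Attach an HMAC-SHA256 signature computed over the entry's canonical JSON."""
    payload = json.dumps(entry, sort_keys=True).encode()
    sig = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    return {"entry": entry, "sig": sig}


def verify_entry(signed: dict) -> bool:
    """Recompute the HMAC and compare in constant time; False means tampering."""
    payload = json.dumps(signed["entry"], sort_keys=True).encode()
    expected = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signed["sig"])
```

Any change to the stored entry after signing — even a single character — produces a different HMAC, so `verify_entry` returns `False` and the poisoned memory can be quarantined instead of replayed into the agent's context.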
## Best Practices

1. Defense in depth - use all four layers together.
2. Fail closed - block when validation is uncertain or errors out.
3. Log everything - keep an audit trail of violations.
4. Limit authority - grant agents the minimum permissions they need.
5. Human oversight - require approval for high-risk actions.
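These practices come together in a single guarded request path. The sketch below assumes only the interfaces shown earlier in this guide (`validate()` returning an object with `.safe`/`.issues`, `get_seed()`, and `validate_action()` returning `(bool, concerns)`); the function name, the `llm` callable, and the `audit_log` list are hypothetical glue, not sentinelseed APIs:

```python
def guarded_turn(sentinel, input_validator, output_validator, llm, user_input, audit_log):
    """Run one agent turn through all four layers, failing closed at each step."""
    # Layer 1: validate the raw input before it reaches the model.
    checked = input_validator.validate(user_input)
    if not checked.safe:
        audit_log.append(("input_blocked", checked.issues))
        return None

    # Layer 2: prepend the safety seed as the system prompt.
    messages = [
        {"role": "system", "content": sentinel.get_seed()},
        {"role": "user", "content": user_input},
    ]
    response = llm(messages)

    # Layer 3: validate the model's output before acting on it.
    checked = output_validator.validate(response)
    if not checked.safe:
        audit_log.append(("output_blocked", checked.issues))
        return None

    # Layer 4: validate the implied action; anything flagged goes to a human.
    is_safe, concerns = sentinel.validate_action(response)
    if not is_safe:
        audit_log.append(("needs_approval", concerns))
        return None

    audit_log.append(("allowed", []))
    return response
```

Every branch either returns a vetted response or returns `None` with an audit record — there is no path where a failed check still reaches the tool-execution layer.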