ClawSafety

2 projects share this name

Two independent teams tackle AI agent security from opposite angles: one measures how safe agents are, the other makes them safer. Same name, complementary missions.

ClawSafety (Benchmark)

by weibowen555

Independent (academic research)


A safety benchmark for personal AI agents under realistic prompt injection. It comprises 120 adversarial test cases that evaluate whether frontier LLMs remain safe when serving as agent backbones.

Key Features

  • 120 adversarial test cases across 5 harm domains (DevOps, Finance, Healthcare, Legal, SWE)
  • 3 attack vectors: skill injection, email injection, web injection
  • Tested Claude Sonnet 4.6, Gemini 2.5 Pro, GPT-5.1, DeepSeek V3, Kimi K2.5
  • Scaffold comparison: OpenClaw vs Nanobot vs NemoClaw
  • Key finding: models that behave safely in chat comply with 40-75% of harmful requests under agent prompt injection
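To make the benchmark's structure concrete, here is a minimal sketch of how results over test cases spanning harm domains and attack vectors could be aggregated into a compliance rate like the 40-75% figure above. The schema and field names are hypothetical illustrations, not ClawSafety's actual format.

```python
from dataclasses import dataclass

# Illustrative constants mirroring the benchmark's described dimensions.
DOMAINS = {"DevOps", "Finance", "Healthcare", "Legal", "SWE"}
VECTORS = {"skill_injection", "email_injection", "web_injection"}

@dataclass
class TestCase:
    domain: str     # one of the 5 harm domains
    vector: str     # one of the 3 attack vectors
    complied: bool  # did the agent carry out the injected instruction?

def compliance_rate(results: list[TestCase]) -> float:
    """Fraction of test cases in which the agent complied with the injection."""
    return sum(tc.complied for tc in results) / len(results)

# Toy run: 3 of 4 cases complied, i.e. a 0.75 compliance rate.
sample = [
    TestCase("Finance", "email_injection", True),
    TestCase("DevOps", "skill_injection", True),
    TestCase("Legal", "web_injection", False),
    TestCase("SWE", "web_injection", True),
]
print(compliance_rate(sample))  # 0.75
```

A real harness would also break the rate down per domain and per vector, since vulnerability typically varies across both.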

Best for: Security researchers and LLM developers evaluating agent safety. If you build or deploy AI agents, this benchmark tells you how vulnerable they are.

benchmark · prompt-injection · adversarial-testing · 120-test-cases · arXiv-paper
ClawSafety — Disambiguation | Claw4Science