TECHNICAL DEEP DIVE

How scandar-guard works

A complete walkthrough of what happens on every LLM call when Guard is active.

DATA FLOW — ONE LLM CALL

Your Application

agent.py / agent.ts

▶

Guard — Pre-Call

inspect before sending

▼

🔑

System prompt

Hash and compare to baseline — detect drift

💬

User messages

Run PROMPT_INJECTION patterns (10 rules)

📄

Tool result blocks

Run PROMPT_INJECTION + SHELL_INJECTION on content returned by tools in the conversation history

If mode="block" and a critical finding is detected, ScandarBlockedError is raised here and the API call never happens.
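The blocking rule above can be sketched as follows. This is a minimal illustration, not Guard's actual code: the Finding dataclass, the enforce() and guarded_call() helpers, and this stand-in ScandarBlockedError class are all hypothetical and only mirror the behaviour described on this page.

```python
from dataclasses import dataclass


class ScandarBlockedError(Exception):
    """Stand-in for the error Guard raises; the real class may differ."""


@dataclass
class Finding:
    category: str
    severity: str  # e.g. "critical", "high", "medium"


def enforce(findings: list[Finding], mode: str) -> None:
    """Raise if blocking is enabled and any finding is critical."""
    if mode == "block" and any(f.severity == "critical" for f in findings):
        raise ScandarBlockedError("critical finding in pre-call inspection")


def guarded_call(send, findings: list[Finding], mode: str):
    """Run the enforcement check, then make the API call via send()."""
    enforce(findings, mode)  # raises here, so send() is never reached
    return send()
```

Because the check runs before send(), a blocked request produces no network traffic at all.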

LLM API

Anthropic / OpenAI

▶

Guard — Post-Call

inspect response

▶

Response Returned

to your application

▶

DETECTION ENGINE

45 patterns across 8 detection layers

Every pattern uses bounded quantifiers (ReDoS-safe). Inspection runs in microseconds per call.
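To illustrate what "bounded quantifiers" means in practice, here is a small sketch. The three patterns are hypothetical and written for this example only; they are not Guard's real rule set. Each repetition has an explicit upper bound such as {1,5} instead of * or +, which limits how much the regex engine can backtrack on adversarial input.

```python
import re

# Hypothetical rules for illustration only.
# Whitespace runs are capped at 5 characters rather than using \s+ or \s*.
PROMPT_INJECTION = [
    re.compile(r"ignore\s{1,5}(?:all\s{1,5})?previous\s{1,5}instructions", re.IGNORECASE),
    re.compile(r"you\s{1,5}are\s{1,5}now\s{1,5}in\s{1,5}developer\s{1,5}mode", re.IGNORECASE),
    re.compile(r"disregard\s{1,5}your\s{1,5}rules", re.IGNORECASE),
]


def matches_any(text: str) -> bool:
    """Return True if any of the compiled patterns matches the text."""
    return any(p.search(text) for p in PROMPT_INJECTION)
```

The patterns are compiled once at module import, so each call only pays for the search itself.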

PROMPT_INJECTION · critical · 10 rules

Examples: Ignore all previous instructions · You are now in developer mode · <|im_start|>system · Disregard your rules

Runs on: User messages · Tool results · Tool result blocks in conversation history

COMPLETION_ATTACK · critical · 4 rules

Examples: [END OF ASSISTANT RESPONSE] · </assistant> · Human: ignore safety

Runs on: Assistant responses

PII / SECRETS · critical/high · 7 rules

Examples: SSN (123-45-6789) · Credit cards · AWS keys (AKIA...) · GitHub tokens (ghp_...) · Slack tokens · JWTs

Runs on: Tool call arguments · Assistant responses
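A rough sketch of how a few of these secret shapes could be matched. The patterns below are assumptions based on commonly documented token formats (a 9-digit SSN layout, the AKIA prefix followed by 16 characters, the ghp_ prefix followed by 36 characters); they are not Guard's actual rules. The helper returns only rule names, never the matched text.

```python
import re

# Hypothetical patterns for illustration; real rules may be stricter.
SECRET_PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "AWS access key ID": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "GitHub token": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
}


def find_secret_types(text: str) -> list[str]:
    """Return the names of the patterns that match, not the matched values."""
    return [name for name, pattern in SECRET_PATTERNS.items() if pattern.search(text)]
```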

SHELL_INJECTION · critical/high · 6 rules

Examples: ; rm -rf / · | bash · ../../etc/passwd · /etc/shadow

Runs on: Tool call arguments · Tool results

EXFILTRATION · high/medium · 2 rules

Examples: Sensitive data in URL params · ngrok/requestbin/pipedream URLs

Runs on: Tool call arguments

BEHAVIORAL · severity varies · 3 rules

Examples: read_file → http_request · bash → http_request · New tool mid-session · Volume spike

Runs on: Session-level (anomaly tracker)

ARCHITECTURE

Everything runs in your process

Your Application

agent.py / agent.ts

scandar-guard (in-process)

MessageInspector · ToolCallInspector · AnomalyTracker · SessionManager · AuditLogger

Anthropic / OpenAI API

api.anthropic.com / api.openai.com

What enters Guard

Messages array

System prompt

Tool result blocks

What Guard does

Pattern match (45 rules)

Decode obfuscation (6 encodings)

Multi-turn tracking (12-msg window)

Anomaly + profile deviation

Composite threat score (0-100)

What leaves Guard

Findings → JSONL log

Zero raw content logged

Response → your app

THE DATA PROMISE

Never leaves your environment:

Raw prompts and messages

Raw model responses

Tool call argument content

Tool result content

System prompt text

File contents

What gets logged locally (JSONL):

Finding category + severity

Finding title (not matched content)

Session ID + timestamp

Event type (pre/post inspection)

Findings count per event

Duration in milliseconds

IMPLEMENTATION

Under the hood

guard()

Returns a Proxy (TS) or a wrapper class (Python) that intercepts .messages.create() and .chat.completions.create(). It detects sync vs. async clients by class name. If stream=True is passed, it is popped from the arguments and the call is made synchronously so the full response can be inspected.

guard.py / guard.ts
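The wrapper approach might look roughly like the sketch below. It is an assumption-heavy illustration: GuardedClient, _GuardedMessages, and the inspect_pre / inspect_post callbacks are hypothetical names, the "Async" substring check is only one plausible way to detect a client by class name, and only the sync .messages.create() path is shown.

```python
class _GuardedMessages:
    """Wraps client.messages and adds inspection around create()."""

    def __init__(self, inner, inspect_pre, inspect_post):
        self._inner = inner
        self._pre = inspect_pre
        self._post = inspect_post

    def create(self, **kwargs):
        kwargs.pop("stream", None)   # drop streaming so a full response comes back
        self._pre(kwargs)            # pre-call inspection (may raise to block)
        response = self._inner.create(**kwargs)
        self._post(response)         # post-call inspection
        return response


class GuardedClient:
    """Hypothetical wrapper; everything not intercepted is delegated."""

    def __init__(self, client, inspect_pre, inspect_post):
        self._client = client
        # Assumed heuristic: async SDK clients have "Async" in their class name.
        self.is_async = "Async" in type(client).__name__
        self.messages = _GuardedMessages(client.messages, inspect_pre, inspect_post)

    def __getattr__(self, name):
        # Called only for attributes not found on the wrapper itself.
        return getattr(self._client, name)
```

Application code keeps calling client.messages.create(...) as before; the wrapper is the only thing that changes.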

MessageInspector

Runs pattern arrays against text content. Caps input at 20K chars. Returns findings with truncated 160-char context windows — never the full text. Pattern arrays are compiled once at import time.

inspector.py / inspector.ts
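A minimal sketch of the cap-and-truncate behaviour described above, assuming a single hypothetical rule and a hypothetical Finding shape. The 20K input cap and the 160-character context window come from the description; how the window is centred on the match is an assumption.

```python
import re
from dataclasses import dataclass

MAX_INPUT_CHARS = 20_000  # input cap from the description
CONTEXT_CHARS = 160       # maximum context kept per finding

# Compiled once at import time; one hypothetical rule for illustration.
RULES = [
    ("Instruction override",
     re.compile(r"ignore\s{1,5}all\s{1,5}previous\s{1,5}instructions", re.IGNORECASE)),
]


@dataclass
class Finding:
    title: str
    context: str  # short window around the match, never the full text


def inspect(text: str) -> list[Finding]:
    text = text[:MAX_INPUT_CHARS]  # anything beyond the cap is not scanned
    findings = []
    for title, pattern in RULES:
        match = pattern.search(text)
        if match:
            middle = (match.start() + match.end()) // 2
            start = max(0, middle - CONTEXT_CHARS // 2)
            findings.append(Finding(title, text[start:start + CONTEXT_CHARS]))
    return findings
```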

ToolCallInspector

Two methods. inspectToolCall() scans argument values (flattened, truncated to 500 chars each). inspectToolResult() is the critical path: it scans tool output for injected instructions before the model sees them.

tool_inspector.py / tool-inspector.ts
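Flattening and truncating argument values could be done along these lines. The flatten_values() helper is hypothetical; only the 500-character limit per value is taken from the description above.

```python
from typing import Any, Iterator

MAX_VALUE_CHARS = 500  # per-value limit from the description


def flatten_values(value: Any) -> Iterator[str]:
    """Yield every leaf of nested dicts/lists as a truncated string."""
    if isinstance(value, dict):
        for item in value.values():
            yield from flatten_values(item)
    elif isinstance(value, (list, tuple)):
        for item in value:
            yield from flatten_values(item)
    else:
        yield str(value)[:MAX_VALUE_CHARS]
```

Each yielded string can then be run through the same pattern arrays used for message text.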

SessionManager

SHA-256 hashes the system prompt on first call. Compares on every subsequent call. If the hash changes, returns a SYSTEM_PROMPT_DRIFT finding (severity: high). Never stores the actual prompt text.

session.py / session.ts
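The drift check can be sketched with the standard hashlib module. The class and method names here are illustrative and the returned dict shape is an assumption; only the SHA-256 baseline comparison and the SYSTEM_PROMPT_DRIFT / high severity labels come from the description above.

```python
import hashlib
from typing import Optional


class SessionManager:
    def __init__(self) -> None:
        self._baseline: Optional[str] = None  # hex digest only, never the prompt

    def check_system_prompt(self, prompt: str) -> Optional[dict]:
        digest = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
        if self._baseline is None:
            self._baseline = digest  # first call establishes the baseline
            return None
        if digest != self._baseline:
            return {"category": "SYSTEM_PROMPT_DRIFT", "severity": "high"}
        return None
```

Storing only the digest means the prompt text itself is never retained by the session.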

AnomalyTracker

Maintains a rolling window of tool call history (last 15). Checks against 6 known-bad sequences. Tracks unique tools seen — flags new tools after 3+ established. Monitors calls-per-30s for volume spikes.

anomaly.py / anomaly.ts
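A simplified sketch of the three checks described above. The window size of 15, the 3-tool threshold, and the 30-second interval come from the description; the two sequences are the examples shown earlier on this page, and SPIKE_LIMIT, the flag strings, and the injectable clock are assumptions made for illustration.

```python
import time
from collections import deque

WINDOW = 15                 # rolling tool-call history
ESTABLISHED_MIN = 3         # tools seen before a new one is flagged
SPIKE_WINDOW_S = 30.0       # interval for volume counting
SPIKE_LIMIT = 20            # hypothetical calls-per-window threshold
BAD_SEQUENCES = [("read_file", "http_request"), ("bash", "http_request")]


class AnomalyTracker:
    def __init__(self, clock=time.monotonic):
        self._history = deque(maxlen=WINDOW)
        self._seen = set()
        self._times = deque()
        self._clock = clock

    def record(self, tool: str) -> list[str]:
        flags = []
        now = self._clock()
        # Known-bad sequence: an earlier tool in the window followed by this one.
        for first, second in BAD_SEQUENCES:
            if tool == second and first in self._history:
                flags.append(f"{first} -> {second}")
        # New tool appearing after a baseline of tools is established.
        if tool not in self._seen and len(self._seen) >= ESTABLISHED_MIN:
            flags.append("NEW_TOOL")
        # Volume spike: count calls within the last SPIKE_WINDOW_S seconds.
        self._times.append(now)
        while self._times and now - self._times[0] > SPIKE_WINDOW_S:
            self._times.popleft()
        if len(self._times) > SPIKE_LIMIT:
            flags.append("VOLUME_SPIKE")
        self._history.append(tool)
        self._seen.add(tool)
        return flags
```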

AuditLogger

Appends one JSONL line per event. Silently fails on write errors (never crashes the agent). Never includes raw content — only finding metadata, session IDs, event types, and timestamps.

audit.py / audit.ts
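A sketch of a fail-silent JSONL logger in this spirit. The field names and the log() signature are assumptions; what it mirrors from the description is one JSON line per event, metadata only, and write errors that never propagate to the agent.

```python
import json
import time
from pathlib import Path


class AuditLogger:
    def __init__(self, path: str) -> None:
        self._path = Path(path)

    def log(self, session_id: str, event_type: str,
            findings: list[dict], duration_ms: float) -> None:
        record = {
            "session_id": session_id,
            "timestamp": time.time(),
            "event_type": event_type,  # e.g. "pre" or "post"
            "findings_count": len(findings),
            # Copy only metadata fields; any raw context is left out.
            "findings": [
                {"category": f["category"], "severity": f["severity"], "title": f["title"]}
                for f in findings
            ],
            "duration_ms": duration_ms,
        }
        try:
            with self._path.open("a", encoding="utf-8") as fh:
                fh.write(json.dumps(record) + "\n")
        except OSError:
            pass  # a failed write must never crash the agent
```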