VEROSEK SHIELD
12 checks. One layer. Every request scanned.
Input. Output. Tool output. Session drift. Deterministic where possible. Offline ML everywhere.
Where each check runs
Pre-LLM. Post-tool. Post-LLM. Session.
Pre-LLM: CHK-013..016, CHK-024
Post-Tool: CHK-020, CHK-021
Post-LLM: CHK-017..019, CHK-023
Session: CHK-022
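For readers who want the mapping in machine-readable form, here is a minimal sketch. The check IDs and scan points are taken from this page; the `SCAN_POINTS` dictionary and the `checks_at` helper are hypothetical names for illustration, not part of any Shield API.

```python
# Check IDs and scan points copied from this page.
# SCAN_POINTS and checks_at are illustrative names, not a Shield API.
SCAN_POINTS = {
    "CHK-013": "Pre-LLM",
    "CHK-014": "Pre-LLM",
    "CHK-015": "Pre-LLM",
    "CHK-016": "Pre-LLM",
    "CHK-017": "Post-LLM",
    "CHK-018": "Post-LLM",
    "CHK-019": "Post-LLM",
    "CHK-020": "Post-Tool",
    "CHK-021": "Post-Tool",
    "CHK-022": "Session",
    "CHK-023": "Post-LLM",
    "CHK-024": "Pre-LLM",
}


def checks_at(scan_point: str) -> list[str]:
    """Return the check IDs that run at the given scan point, sorted by ID."""
    return sorted(cid for cid, point in SCAN_POINTS.items() if point == scan_point)
```

Calling `checks_at("Post-Tool")` returns the two MCP tool-output checks, CHK-020 and CHK-021.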
Full check catalog
All 12 Shield checks. Every score. Every scan point.
| ID | What it detects | Scan point | Phase |
|---|---|---|---|
| CHK-013 | Prompt injection in user input. Offline multilingual classifier. | Pre-LLM | S1 |
| CHK-014 | Jailbreak attempt in user input. Same pass: one forward pass produces both scores. | Pre-LLM | S1 |
| CHK-015 | PII in user input. Six-language PII engine, four redaction modes: tag / fake / mask / hash. | Pre-LLM | S1 |
| CHK-016 | Secrets in user input. Seventeen provider-specific regex patterns (AWS, GitHub, Stripe, PEM, JWT, …). | Pre-LLM | S1 |
| CHK-017 | Toxicity in model response. Offline multilingual toxicity classifier. | Post-LLM | S1 |
| CHK-018 | PII in model response. Same PII engine as CHK-015, applied to the response. | Post-LLM | S1 |
| CHK-019 | Secrets in model response. Same regex catalog as CHK-016, applied to the response. | Post-LLM | S1 |
| CHK-020 | Indirect prompt injection in MCP tool output. Stricter threshold + chunked scan over every tool result. | Post-Tool | S2 |
| CHK-021 | PII in MCP tool output. Per-connection redaction mode on tool responses. | Post-Tool | S2 |
| CHK-022 | Session-level exfiltration drift. Cumulative PII + URL + byte counters per session with warn / block thresholds. | Session | S2 |
| CHK-023 | Grounding / hallucination. Offline verdict model scores the response against its retrieved context. | Post-LLM | S3 |
| CHK-024 | Off-topic / scope creep. Per-key topic centroids with margin-based decision bands and small-talk short-circuit. | Pre-LLM | S3 |
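The "stricter threshold + chunked scan" idea behind CHK-020 can be sketched roughly as follows. Everything here is an assumption for illustration: the chunk size, the overlap, the threshold value, and the keyword-based `toy_score` function that stands in for the real offline classifier. The sketch treats "stricter" as a lower score needed to flag a chunk.

```python
from typing import Callable, Iterator

# Hypothetical value: tool output is flagged at a lower score than user input would be.
TOOL_OUTPUT_THRESHOLD = 0.5


def chunks(text: str, size: int = 512, overlap: int = 64) -> Iterator[str]:
    """Yield overlapping windows so text split across a boundary is still seen whole."""
    if size <= overlap:
        raise ValueError("size must exceed overlap")
    step = size - overlap
    for start in range(0, max(len(text), 1), step):
        yield text[start:start + size]
        if start + size >= len(text):
            break


def scan_tool_output(
    text: str,
    score: Callable[[str], float],
    threshold: float = TOOL_OUTPUT_THRESHOLD,
) -> bool:
    """Return True if any chunk scores at or above the threshold."""
    return any(score(chunk) >= threshold for chunk in chunks(text))


def toy_score(chunk: str) -> float:
    """Stand-in scorer for the sketch; a real deployment would call a classifier."""
    return 1.0 if "ignore previous instructions" in chunk.lower() else 0.0
```

Scanning every chunk means an injected instruction buried deep in a long tool result is still scored on its own window, rather than being diluted by the surrounding text.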
Profiles
Start in shadow mode. Graduate to strict when the false-positive rate is zero.
profile: none
Trusted internal services
All checks off. Zero overhead.
profile: baseline
Most production keys (default)
PII and secrets checks enforce. Everything else runs log_only: verdicts appear in the trace but never block.
profile: strict
Regulated workloads
Everything enforces except CHK-023 grounding, which stays async log_only: a verdict that arrives after the response cannot block it retroactively.
profile: custom
Fine-grained tuning
Per-check toggle in the admin. Export as YAML for git.
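The three fixed profiles can be read as a per-check mode table. The sketch below is one possible reading, not the product's configuration format. The mode names `off` and `enforce` are assumptions (only `log_only` appears on this page), and so is the choice of which checks count as "PII + secrets" (here CHK-015, CHK-016, CHK-018, CHK-019, and CHK-021).

```python
# CHK-013 through CHK-024, as listed in the catalog.
ALL_CHECKS = [f"CHK-{n:03d}" for n in range(13, 25)]

# Assumption: every PII and secrets check, including PII in MCP tool output.
PII_AND_SECRETS = {"CHK-015", "CHK-016", "CHK-018", "CHK-019", "CHK-021"}


def resolve_modes(profile: str) -> dict[str, str]:
    """Map a profile name to a per-check mode ('off', 'log_only', or 'enforce')."""
    if profile == "none":
        return {check: "off" for check in ALL_CHECKS}
    if profile == "baseline":
        return {
            check: "enforce" if check in PII_AND_SECRETS else "log_only"
            for check in ALL_CHECKS
        }
    if profile == "strict":
        # Grounding stays log_only because its verdict arrives after the response.
        return {
            check: "log_only" if check == "CHK-023" else "enforce"
            for check in ALL_CHECKS
        }
    # 'custom' is tuned per check in the admin, so it has no fixed table here.
    raise ValueError(f"no fixed mode table for profile {profile!r}")
```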
Offline scanning
Every signal is local. Every image is air-gap clean.
Heavy scanning runs in an optional verosek-shield-ml container. The gateway never imports torch. If the ML service is unreachable, each check falls back to its documented fail_behavior — fail_closed for prompt injection by default.
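A minimal sketch of the fallback logic, assuming the ML service is reached through some callable that raises `ConnectionError` or `TimeoutError` when it is unreachable. The `run_check` function, the `fail_open` value, and the 0.5 threshold are hypothetical; only `fail_behavior` and `fail_closed` come from the text above.

```python
from typing import Callable


def run_check(
    text: str,
    classify: Callable[[str], float],
    threshold: float = 0.5,
    fail_behavior: str = "fail_closed",
) -> str:
    """Return 'block' or 'allow', applying fail_behavior if the classifier is unreachable."""
    if fail_behavior not in ("fail_closed", "fail_open"):
        raise ValueError(f"unknown fail_behavior: {fail_behavior!r}")
    try:
        score = classify(text)
    except (ConnectionError, TimeoutError):
        # The ML service did not answer: decide from configuration, not from a score.
        return "block" if fail_behavior == "fail_closed" else "allow"
    return "block" if score >= threshold else "allow"
```

With `fail_closed`, an outage of the ML container blocks the request instead of silently letting unscanned input through.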
Offline classifiers
Prompt injection, jailbreak, and toxicity detection run entirely on-premises. Nothing about your prompts leaves your network.
Multilingual PII engine
Six languages out of the box, four redaction modes, and a custom-recogniser slot for domain-specific entity types.
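To make the four redaction modes concrete, here is a rough sketch applied to a single entity type. The email regex is a simplified stand-in for the PII engine, and the exact replacement strings (`<EMAIL>`, `user@example.com`, a 12-character SHA-256 prefix) are assumptions chosen for illustration.

```python
import hashlib
import re

# Simplified email pattern; a stand-in for the real multilingual PII engine.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

MODES = ("tag", "fake", "mask", "hash")


def redact(text: str, mode: str) -> str:
    """Replace every email address in text according to the chosen redaction mode."""
    if mode not in MODES:
        raise ValueError(f"unknown redaction mode: {mode!r}")

    def replace(match: re.Match) -> str:
        value = match.group(0)
        if mode == "tag":
            return "<EMAIL>"            # entity-type placeholder
        if mode == "fake":
            return "user@example.com"   # plausible but fixed substitute
        if mode == "mask":
            return "*" * len(value)     # same length, no content
        # hash: stable token, so equal values map to equal tokens
        return hashlib.sha256(value.encode("utf-8")).hexdigest()[:12]

    return EMAIL.sub(replace, text)
```

In this sketch, `tag` keeps the entity type visible to the model, `mask` preserves length, and `hash` lets downstream systems correlate repeated values without seeing them.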
Session-level drift
Per-session cumulative counters on PII, URLs, and data volume. Slow exfiltration attempts fail the session, not just the call.
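One way to picture the cumulative counters is the sketch below. The `SessionDrift` class, its method names, and every threshold value are hypothetical; the page states only that PII, URL, and byte counters accumulate per session against warn and block thresholds.

```python
from dataclasses import dataclass, field


@dataclass
class SessionDrift:
    """Cumulative per-session counters with (warn, block) thresholds per counter."""

    # Hypothetical limits: counter name -> (warn threshold, block threshold).
    limits: dict[str, tuple[int, int]] = field(default_factory=lambda: {
        "pii": (5, 10),
        "urls": (10, 20),
        "bytes": (50_000, 100_000),
    })
    totals: dict[str, int] = field(
        default_factory=lambda: {"pii": 0, "urls": 0, "bytes": 0}
    )

    def record(self, pii: int = 0, urls: int = 0, nbytes: int = 0) -> str:
        """Add one call's counts to the session totals and return 'ok', 'warn', or 'block'."""
        for key, delta in (("pii", pii), ("urls", urls), ("bytes", nbytes)):
            self.totals[key] += delta
        verdict = "ok"
        for key, (warn, block) in self.limits.items():
            if self.totals[key] >= block:
                return "block"
            if self.totals[key] >= warn:
                verdict = "warn"
        return verdict
```

Three calls carrying 3, 3, and 4 PII entities each look modest on their own, but under these assumed limits the session moves from ok to warn to block as the total reaches 10.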
Grounding verdict
A local scoring model checks every response against its retrieved context — async, so it never adds to the hot-path budget.
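The async pattern can be sketched with `asyncio`: the response is returned immediately and the grounding verdict is written to the trace when it completes. The word-overlap scorer is a deliberately crude stand-in for the local scoring model, and the function names and trace record shape are assumptions.

```python
import asyncio


def overlap_score(response: str, context: str) -> float:
    """Stand-in scorer: fraction of response words that also appear in the context."""
    words = response.lower().split()
    if not words:
        return 1.0
    context_words = set(context.lower().split())
    return sum(word in context_words for word in words) / len(words)


async def ground(response: str, context: str, trace: list) -> None:
    """Score the response off the event loop and log the verdict; never blocks delivery."""
    score = await asyncio.to_thread(overlap_score, response, context)
    trace.append({"check": "CHK-023", "score": score, "mode": "log_only"})


async def respond(response: str, context: str, trace: list):
    """Return the response right away, with the grounding check running in the background."""
    task = asyncio.create_task(ground(response, context, trace))
    return response, task


async def main():
    trace: list = []
    response, task = await respond(
        "paris is the capital", "the capital of france is paris", trace
    )
    # The response is already available here; the verdict may not be yet.
    await task
    return response, trace
```

Because the verdict lands after the response has been handed back, it can only be logged, which matches why CHK-023 stays log_only even under the strict profile.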
See Shield on your own traffic.
Baseline profile. One hour to stand up. Zero traffic blocked until you graduate.