Data loss prevention (DLP)

The guardrails engine — PII detection, hallucination check, regex, and JSON validation as a configurable workflow gate.

The DLP layer runs whenever a workflow needs to validate input or output content against a defined policy: PII detection and masking, hallucination scoring, JSON validation, or regex matching.

This page documents the Guardrails block and the four validation types it supports.

When to use it

Typical placements:

Input gate: scan a user message for PII before it reaches an LLM.
Output gate: scan an LLM response for PII before returning it to the user.
Hallucination gate: score an answer against a knowledge base before exposing it.
Format gate: assert that an LLM response is valid JSON or matches a regex before piping it downstream.
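
As a rough illustration of a format gate, the two checks can be sketched in a few lines of Python. This is a minimal sketch, not the Guardrails implementation: the function names are hypothetical, and treating a regex match as a whole-string match is an assumption, since this page does not say whether the block matches fully or partially.

```python
import json
import re


def is_valid_json(text: str) -> bool:
    """Return True if the text parses as JSON."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


def matches_pattern(text: str, pattern: str) -> bool:
    """Return True if the whole text matches the pattern (assumed semantics)."""
    return re.fullmatch(pattern, text) is not None


print(is_valid_json('{"status": "ok"}'))                      # True
print(is_valid_json("{status: ok}"))                          # False
print(matches_pattern("ORD-2024-0042", r"ORD-\d{4}-\d{4}"))   # True
```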

The Guardrails block is part of the scrydon:guardrails product — see Vendors → Scrydon.

Four validation types

Type	Speed	Cost	Best for
PII detection	varies (see below)	varies	Structured + unstructured PII detection and masking
Hallucination	LLM latency	Per-call	Grounding outputs against a knowledge base
JSON validation	~1ms	None	Strict schema compliance
Regex match	~1ms	None	Custom pattern enforcement

PII detection itself comes in three flavours — picked per block:

Method	Speed	Cost	Best for
Regex only	~1ms	None	Structured PII (SSN, credit card, IBAN), fast pattern matching
LLM only	~500ms	Per-call	Unstructured PII (names, locations), nuanced content judgements
Hybrid	~500ms	Per-call	Both structured + unstructured PII in one pass

PII detection

The PII detector recognises 30+ entity types across 10+ regions.

Supported entities

Region	Entities
Global	`EMAIL_ADDRESS`, `PHONE_NUMBER`, `CREDIT_CARD`, `IP_ADDRESS`, `URL`, `IBAN_CODE`, `CRYPTO`, `DATE_TIME`
USA	`US_SSN`, `US_PASSPORT`, `US_DRIVER_LICENSE`, `US_ITIN`, `US_BANK_NUMBER`
UK	`UK_NHS`, `UK_NINO`
EU	`ES_NIF`, `IT_FISCAL_CODE`, `PL_PESEL`, `FI_PERSONAL_IDENTITY_CODE`
APAC	`IN_AADHAAR`, `IN_PAN`, `AU_TFN`, `AU_ABN`, `SG_NRIC_FIN`
LLM-only	`PERSON`, `LOCATION`, `NRP`, `MEDICAL_LICENSE`

Two action modes

Mode	Behaviour
Block	The validation fails. The workflow follows the failure branch.
Mask	PII is replaced with `<ENTITY_TYPE>` placeholders. The workflow continues with the masked content.

Example — regex detection

Input: Contact john@acme.com, SSN 123-45-6789, card 4111111111111111

Output:

[
  { "type": "EMAIL_ADDRESS", "text": "john@acme.com",    "start": 8,  "end": 21, "score": 0.95 },
  { "type": "US_SSN",        "text": "123-45-6789",      "start": 27, "end": 38, "score": 0.90 },
  { "type": "CREDIT_CARD",   "text": "4111111111111111", "start": 45, "end": 61, "score": 0.95 }
]
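
The regex method can be pictured as a table of named patterns scanned over the input. The sketch below is illustrative only: the three patterns are deliberately simplified assumptions rather than the detector's actual rules, the `Entity` and `detect` names are hypothetical, and confidence scores are left out because this page does not describe how they are assigned.

```python
import re
from dataclasses import dataclass

# Simplified, illustrative patterns (assumptions, not the engine's real rules).
PATTERNS = {
    "EMAIL_ADDRESS": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "US_SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "CREDIT_CARD": re.compile(r"\b\d{16}\b"),
}


@dataclass
class Entity:
    type: str
    text: str
    start: int  # inclusive character offset
    end: int    # exclusive character offset


def detect(text: str) -> list[Entity]:
    """Scan the text with every pattern and return matches ordered by position."""
    found = [
        Entity(name, match.group(), match.start(), match.end())
        for name, pattern in PATTERNS.items()
        for match in pattern.finditer(text)
    ]
    return sorted(found, key=lambda entity: entity.start)


if __name__ == "__main__":
    sample = "Contact john@acme.com, SSN 123-45-6789, card 4111111111111111"
    for entity in detect(sample):
        print(entity)
```

Run on the example input, this yields the same three entity types and character offsets as the output above.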

Example — mask mode output

"Contact <EMAIL_ADDRESS>, SSN <US_SSN>, card <CREDIT_CARD>"

Example — block mode output

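{
  "passed": false,
  "error": "PII detected: EMAIL_ADDRESS, US_SSN, CREDIT_CARD",
  "detectedEntities": [
    { "type": "EMAIL_ADDRESS", "text": "john@acme.com", "start": 8, "end": 21, "score": 0.95 },
    { "type": "US_SSN", "text": "123-45-6789", "start": 27, "end": 38, "score": 0.90 },
    { "type": "CREDIT_CARD", "text": "4111111111111111", "start": 45, "end": 61, "score": 0.95 }
  ]
}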
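
How the two action modes diverge once entities are known can be sketched as below. The function name, the `content` key, and the passing result shape are assumptions made for illustration; only the failing shape and the `<ENTITY_TYPE>` placeholder format come from the examples above.

```python
def apply_pii_action(text: str, entities: list[dict], action: str) -> dict:
    """Apply a 'block' or 'mask' action to already-detected entities."""
    if not entities:
        return {"passed": True, "content": text}

    if action == "block":
        # dict.fromkeys de-duplicates type names while keeping first-seen order.
        types = ", ".join(dict.fromkeys(e["type"] for e in entities))
        return {
            "passed": False,
            "error": f"PII detected: {types}",
            "detectedEntities": entities,
        }

    if action == "mask":
        masked = text
        # Replace from right to left so earlier offsets stay valid.
        for e in sorted(entities, key=lambda e: e["start"], reverse=True):
            masked = masked[: e["start"]] + f"<{e['type']}>" + masked[e["end"]:]
        return {"passed": True, "content": masked}

    raise ValueError(f"unknown action: {action!r}")
```

A caller would follow the failure branch when `passed` is false and otherwise continue with `content`.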

Hybrid mode

Hybrid mode runs both engines on the same input and merges their results:

Input text is sent to both the regex engine and the LLM engine in parallel.

Regex engine detects structured entities (EMAIL_ADDRESS, US_SSN, CREDIT_CARD, IBAN_CODE, …).

LLM engine detects unstructured entities (PERSON, LOCATION, NRP, …).

Deduplication merges results — higher-confidence matches win when detections overlap.
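
One way to realise the merge step is a greedy pass over the combined detections, highest score first, dropping anything that overlaps a span already kept. This is a sketch under assumptions: the engines are passed in as plain callables, a thread pool stands in for whatever concurrency the platform uses, and the handling of equal scores is not specified by this page.

```python
from concurrent.futures import ThreadPoolExecutor


def overlaps(a: dict, b: dict) -> bool:
    """Two half-open spans [start, end) overlap if each starts before the other ends."""
    return a["start"] < b["end"] and b["start"] < a["end"]


def deduplicate(detections: list[dict]) -> list[dict]:
    """Keep the highest-scoring detection wherever spans overlap."""
    kept: list[dict] = []
    # Visit high-confidence detections first so they claim their span.
    for d in sorted(detections, key=lambda d: d["score"], reverse=True):
        if not any(overlaps(d, k) for k in kept):
            kept.append(d)
    return sorted(kept, key=lambda d: d["start"])


def hybrid_detect(text: str, regex_engine, llm_engine) -> list[dict]:
    """Run both engines concurrently, then merge their detections."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        regex_future = pool.submit(regex_engine, text)
        llm_future = pool.submit(llm_engine, text)
        return deduplicate(regex_future.result() + llm_future.result())
```

In this sketch, if the LLM engine tags part of an email address as a `PERSON` with a lower score than the regex engine's `EMAIL_ADDRESS` match, only the `EMAIL_ADDRESS` detection survives.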

Hallucination detection

Validates an LLM output against a knowledge base using RAG plus LLM scoring:

The LLM output is received.

The knowledge base is queried for top-K relevant chunks via embedding search.

An LLM scores grounding on a 0–10 scale against the retrieved context.

The block passes if the score is at or above the threshold and fails otherwise.

Confidence scale:

Score	Meaning
0–2	Full hallucination — contradicts or is unsupported by context
3–4	Low confidence — significant claims not in context
5–6	Medium confidence — partially supported
7–8	High confidence — mostly supported, minor gaps
9–10	Fully grounded — all claims verified against context

Example output

{
  "passed": false,
  "score": 2,
  "reasoning": "The claim about Q4 revenue is not mentioned in any knowledge base document",
  "error": "Low confidence: score 2/10 is below threshold 3"
}
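
The four steps and the threshold check can be sketched as a small function. Everything here is an assumption for illustration: `retrieve` and `score_grounding` are hypothetical stand-ins for the embedding search and the scoring LLM, and the default `threshold` and `top_k` values are placeholders rather than documented defaults.

```python
from typing import Callable


def hallucination_gate(
    answer: str,
    retrieve: Callable[[str, int], list[str]],
    score_grounding: Callable[[str, list[str]], tuple[int, str]],
    threshold: int = 3,
    top_k: int = 5,
) -> dict:
    """Score an answer against retrieved context and compare with the threshold."""
    chunks = retrieve(answer, top_k)                      # top-K knowledge base chunks
    score, reasoning = score_grounding(answer, chunks)    # 0-10 grounding score
    result = {"passed": score >= threshold, "score": score, "reasoning": reasoning}
    if not result["passed"]:
        result["error"] = (
            f"Low confidence: score {score}/10 is below threshold {threshold}"
        )
    return result
```

With a score of 2 and a threshold of 3, this produces the same `passed` and `error` values as the example output above.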

Configuration

The Guardrails block exposes these fields in the workflow editor:

Field	Type	Options
Content to validate	Long text input	Free text or wired from upstream block
Validation type	Dropdown	Valid JSON, Regex match, Hallucination check, PII detection
PII types to detect	Text input	Comma-separated entity types (`PERSON`, `EMAIL_ADDRESS`, `US_SSN`, …)
PII action	Dropdown	Block request, Mask PII
Detection method	Dropdown	Regex only (fast), LLM only (AI-powered), Hybrid (regex + LLM)
Language	Dropdown	English, Spanish, Italian, Polish, Finnish

Pairing with the Evaluator block

The Guardrails block is fail-closed: if validation fails, the workflow takes the failure branch. The Evaluator block only scores: it returns a score the workflow can branch on however it wants. Use Guardrails for hard gates and Evaluator for quality checks that drive retries or fallback paths.

Guardrails can run on any text input or output — it's not LLM-specific. Validate a user-uploaded document for PII before ingesting it into a knowledge base, for instance.

How we benchmark the platform DLP engine

Beyond the Guardrails block, Scrydon runs a platform-level DLP engine on every platform capability call (LLM completions, transcription, OCR, …). It scans for PII entities and for clearance-egress violations, meaning classified content leaving toward a lower-cleared destination. Detection-quality claims are easy to make and hard to verify, so we measure this engine the same way we measure retrieval quality: against a public, third-party benchmark, with the harness in CI as a release gate. A release that regresses below the floors does not ship.

Measured on a 2,661-prompt English slice of ai4privacy/pii-masking-300k (pinned dataset revision, evaluated at build time — the dataset itself is never redistributed) plus an in-repo clearance-escalation bank:

Check	Result	Release floor
PII recall (graded entity types)	95.7%	≥ 90%
PII precision	79.8%	≥ 75%
Clearance egress — confidential tier recall	100%	≥ 95%
Clearance egress — secret tier recall	100%	≥ 99%

Per graded entity type: email 99.6%, IP address 99.5%, phone number 86.1% recall.
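
For reference, precision is TP / (TP + FP) and recall is TP / (TP + FN), where TP, FP, and FN count true positives, false positives, and false negatives. A release gate over the floors in the table could look like the sketch below; the metric keys and function names are hypothetical, and the real CI harness is not shown on this page.

```python
def precision(tp: int, fp: int) -> float:
    """Share of reported detections that were correct."""
    return tp / (tp + fp) if tp + fp else 0.0


def recall(tp: int, fn: int) -> float:
    """Share of true entities that were detected."""
    return tp / (tp + fn) if tp + fn else 0.0


# Floors taken from the table above; the key names are illustrative.
FLOORS = {
    "pii_recall": 0.90,
    "pii_precision": 0.75,
    "confidential_egress_recall": 0.95,
    "secret_egress_recall": 0.99,
}


def failing_checks(measured: dict[str, float]) -> list[str]:
    """Return every check that falls below its floor; a missing metric counts as failing."""
    return [name for name, floor in FLOORS.items() if measured.get(name, 0.0) < floor]


measured = {
    "pii_recall": 0.957,
    "pii_precision": 0.798,
    "confidential_egress_recall": 1.0,
    "secret_egress_recall": 1.0,
}
print(failing_checks(measured))  # []
```

An empty list means the release may ship; any listed check would block it.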

What we don't claim. The report carries a mandatory out-of-scope section listing detection categories the engine does not cover, with measured evidence rather than silence: free-text person names (NER territory — the regex heuristic measured 14% recall, so we do not grade it), free-form date-of-birth formats (3% — the engine claims ISO 8601 only), and format-generic national identifiers (27% — the engine claims specific jurisdictional formats such as US_SSN or BE_NRN, not arbitrary digit runs). Each is documented in the signed report artifact so auditors see the boundary, not a marketing number.

Related pages

Architecture → Cortex: where Cortex can apply guardrails on every LLM call as a workflow-independent layer.
Knowledge base clearance: clearance is a separate axis from DLP, not a replacement.
Compliance: how DLP maps to framework controls (GDPR, EU AI Act, ISO 42001).
