Data loss prevention (DLP)

The guardrails engine — PII detection, hallucination check, regex, and JSON validation as a configurable workflow gate.

The DLP layer is a configurable workflow gate. It runs wherever a workflow needs to validate input or output content against a defined policy: PII detection and masking, hallucination scoring, JSON validation, or regex matching.

This page documents the Guardrails block and the four validation types it supports.

When to use it

Typical placements:

  • Input gate — scan a user message for PII before it reaches an LLM.
  • Output gate — scan an LLM response for PII before returning it to the user.
  • Hallucination gate — score an answer against a knowledge base before exposing it.
  • Format gate — assert that an LLM produced valid JSON / matches a regex before piping it downstream.

The Guardrails block is part of the scrydon:guardrails product — see Vendors → Scrydon.

Four validation types

| Type | Speed | Cost | Best for |
|---|---|---|---|
| PII detection | Varies (see below) | Varies | Structured + unstructured PII detection and masking |
| Hallucination | LLM cost | Per-call | Grounding outputs against a knowledge base |
| JSON validation | ~1ms | None | Strict schema compliance |
| Regex match | ~1ms | None | Custom pattern enforcement |

PII detection itself comes in three flavours — picked per block:

| Method | Speed | Cost | Best for |
|---|---|---|---|
| Regex only | ~1ms | None | Structured PII (SSN, credit card, IBAN), fast pattern matching |
| LLM only | ~500ms | Per-call | Unstructured PII (names, locations), nuanced content judgements |
| Hybrid | ~500ms | Per-call | Both structured + unstructured PII in one pass |
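
The two zero-cost validation types (JSON validation and regex match) are simple enough to sketch. The Python below is an illustration only, not Scrydon's implementation: the function names `validate_json` and `validate_regex` are hypothetical, and the `passed` / `error` result shape is borrowed from the block-mode example later on this page. The JSON check covers syntax only; schema enforcement is not shown.

```python
import json
import re


def validate_json(content: str) -> dict:
    """Pass only if the content parses as JSON (hypothetical helper)."""
    try:
        json.loads(content)
    except json.JSONDecodeError as exc:
        return {"passed": False, "error": f"Invalid JSON: {exc.msg}"}
    return {"passed": True}


def validate_regex(content: str, pattern: str) -> dict:
    """Pass only if the pattern matches somewhere in the content (hypothetical helper)."""
    if re.search(pattern, content) is None:
        return {"passed": False, "error": f"No match for pattern: {pattern}"}
    return {"passed": True}
```

Both checks are pure string operations with no model call, which is consistent with the ~1ms speed and zero cost listed for them.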

PII detection

The PII detector recognises 30+ entity types across 10+ regions.

Supported entities

| Region | Entities |
|---|---|
| Global | EMAIL_ADDRESS, PHONE_NUMBER, CREDIT_CARD, IP_ADDRESS, URL, IBAN_CODE, CRYPTO, DATE_TIME |
| USA | US_SSN, US_PASSPORT, US_DRIVER_LICENSE, US_ITIN, US_BANK_NUMBER |
| UK | UK_NHS, UK_NINO |
| EU | ES_NIF, IT_FISCAL_CODE, PL_PESEL, FI_PERSONAL_IDENTITY_CODE |
| APAC | IN_AADHAAR, IN_PAN, AU_TFN, AU_ABN, SG_NRIC_FIN |
| LLM-only | PERSON, LOCATION, NRP, MEDICAL_LICENSE |

Two action modes

| Mode | Behaviour |
|---|---|
| Block | The validation fails. The workflow follows the failure branch. |
| Mask | PII is replaced with <ENTITY_TYPE> placeholders. The workflow continues with the masked content. |

Example — regex detection

Input: Contact john@acme.com, SSN 123-45-6789, card 4111111111111111

Output:

[
  { "type": "EMAIL_ADDRESS", "text": "john@acme.com",    "start": 8,  "end": 21, "score": 0.95 },
  { "type": "US_SSN",        "text": "123-45-6789",      "start": 27, "end": 38, "score": 0.90 },
  { "type": "CREDIT_CARD",   "text": "4111111111111111", "start": 45, "end": 61, "score": 0.95 }
]

Example — mask mode output

"Contact <EMAIL_ADDRESS>, SSN <US_SSN>, card <CREDIT_CARD>"

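The regex detection and mask examples above can be reproduced with a few lines of Python. This is a minimal sketch, not the engine's actual rule set: the three patterns are simplified assumptions, the names `PATTERNS`, `detect`, and `mask` are hypothetical, and confidence scores are left out because the source does not say how they are computed.

```python
import re

# Illustrative patterns only; a production detector needs far stricter rules.
PATTERNS = {
    "EMAIL_ADDRESS": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "US_SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "CREDIT_CARD": re.compile(r"\b\d{13,16}\b"),
}


def detect(text: str) -> list[dict]:
    """Return detections sorted by start offset (end offset is exclusive)."""
    found = []
    for entity_type, pattern in PATTERNS.items():
        for m in pattern.finditer(text):
            found.append(
                {"type": entity_type, "text": m.group(), "start": m.start(), "end": m.end()}
            )
    return sorted(found, key=lambda d: d["start"])


def mask(text: str, detections: list[dict]) -> str:
    """Replace each detection with a <ENTITY_TYPE> placeholder."""
    # Work right to left so earlier offsets stay valid after each replacement.
    for d in sorted(detections, key=lambda d: d["start"], reverse=True):
        text = text[: d["start"]] + f"<{d['type']}>" + text[d["end"]:]
    return text


sample = "Contact john@acme.com, SSN 123-45-6789, card 4111111111111111"
entities = detect(sample)
masked = mask(sample, entities)
```

Replacing from the end of the string backwards avoids recomputing offsets after each placeholder changes the text length.
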
Example — block mode output

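{
  "passed": false,
  "error": "PII detected: EMAIL_ADDRESS, US_SSN, CREDIT_CARD",
  "detectedEntities": [
    { "type": "EMAIL_ADDRESS", "text": "john@acme.com", "start": 8, "end": 21, "score": 0.95 },
    { "type": "US_SSN", "text": "123-45-6789", "start": 27, "end": 38, "score": 0.90 },
    { "type": "CREDIT_CARD", "text": "4111111111111111", "start": 45, "end": 61, "score": 0.95 }
  ]
}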
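
A block-mode result like the one above could be assembled from a list of detections roughly as follows. This is a sketch under assumptions: `block_result` is a hypothetical name, and the shape of a passing result (`{"passed": True}` with no other fields) is a guess, since the source only shows the failing case.

```python
def block_result(detections: list[dict]) -> dict:
    """Build a block-mode result: fail as soon as any entity is present."""
    if not detections:
        return {"passed": True}  # assumed shape for the passing case
    # Preserve first-seen order while dropping repeated entity types.
    types = list(dict.fromkeys(d["type"] for d in detections))
    return {
        "passed": False,
        "error": "PII detected: " + ", ".join(types),
        "detectedEntities": detections,
    }


detections = [
    {"type": "EMAIL_ADDRESS", "text": "john@acme.com", "start": 8, "end": 21, "score": 0.95},
    {"type": "US_SSN", "text": "123-45-6789", "start": 27, "end": 38, "score": 0.90},
    {"type": "CREDIT_CARD", "text": "4111111111111111", "start": 45, "end": 61, "score": 0.95},
]
result = block_result(detections)
```

A workflow would then follow the failure branch whenever `result["passed"]` is false.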

Hybrid mode

Hybrid runs both engines in parallel and deduplicates overlapping detections by preferring higher-confidence matches:

1. Input text is sent to both the regex engine and the LLM engine in parallel.
2. Regex engine detects structured entities (EMAIL_ADDRESS, US_SSN, CREDIT_CARD, IBAN_CODE, …).
3. LLM engine detects unstructured entities (PERSON, LOCATION, NRP, …).
4. Deduplication merges results: higher-confidence matches win when detections overlap.
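
The deduplication step can be sketched as a greedy pass over both result lists. The code below is one plausible reading of "higher-confidence matches win", not the engine's confirmed algorithm; `overlaps` and `deduplicate` are hypothetical names, the scores and the LOCATION span are made up for illustration, and the parallel dispatch to the two engines is not shown.

```python
def overlaps(a: dict, b: dict) -> bool:
    """True when two half-open [start, end) spans share at least one character."""
    return a["start"] < b["end"] and b["start"] < a["end"]


def deduplicate(regex_hits: list[dict], llm_hits: list[dict]) -> list[dict]:
    """Merge both result lists, keeping the higher-scoring hit on overlap."""
    kept: list[dict] = []
    # Visit hits from most to least confident; a hit survives only if it
    # does not overlap one that was already kept.
    for hit in sorted(regex_hits + llm_hits, key=lambda h: h["score"], reverse=True):
        if not any(overlaps(hit, k) for k in kept):
            kept.append(hit)
    return sorted(kept, key=lambda h: h["start"])


regex_hits = [{"type": "EMAIL_ADDRESS", "start": 8, "end": 21, "score": 0.95}]
llm_hits = [
    {"type": "PERSON", "start": 8, "end": 12, "score": 0.60},  # "john" inside the email
    {"type": "LOCATION", "start": 30, "end": 36, "score": 0.80},  # illustrative span
]
merged = deduplicate(regex_hits, llm_hits)
```

Here the PERSON hit is dropped because it overlaps the higher-scoring EMAIL_ADDRESS span, while the non-overlapping LOCATION hit is kept.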

Hallucination detection

Validates an LLM output against a knowledge base using RAG plus LLM scoring:

1. The LLM output is received.
2. The knowledge base is queried for top-K relevant chunks via embedding search.
3. An LLM scores grounding on a 0–10 scale against the retrieved context.
4. The block passes if the score meets the threshold, fails otherwise.

Confidence scale:

| Score | Meaning |
|---|---|
| 0–2 | Full hallucination: contradicts or is unsupported by context |
| 3–4 | Low confidence: significant claims not in context |
| 5–6 | Medium confidence: partially supported |
| 7–8 | High confidence: mostly supported, minor gaps |
| 9–10 | Fully grounded: all claims verified against context |

Example output

{
  "passed": false,
  "score": 2,
  "reasoning": "The claim about Q4 revenue is not mentioned in any knowledge base document",
  "error": "Low confidence: score 2/10 is below threshold 3"
}
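
The retrieve, score, and threshold flow can be sketched with the retrieval and LLM-judge steps passed in as plain callables. Everything named here is hypothetical (`hallucination_gate`, `retrieve`, `score_grounding`, the `top_k` default of 5); only the pass rule (score meets the threshold) and the output fields come from this page.

```python
from typing import Callable


def hallucination_gate(
    answer: str,
    retrieve: Callable[[str, int], list[str]],
    score_grounding: Callable[[str, list[str]], tuple[int, str]],
    threshold: int,
    top_k: int = 5,
) -> dict:
    """Score an answer against retrieved context and gate on a threshold."""
    chunks = retrieve(answer, top_k)  # stands in for the embedding search
    score, reasoning = score_grounding(answer, chunks)  # stands in for the LLM judge
    result = {"passed": score >= threshold, "score": score, "reasoning": reasoning}
    if not result["passed"]:
        result["error"] = f"Low confidence: score {score}/10 is below threshold {threshold}"
    return result


# Stubs so the sketch runs without a knowledge base or a model.
def fake_retrieve(query: str, k: int) -> list[str]:
    return ["Q3 revenue was reported in the October update."][:k]


def fake_judge(answer: str, chunks: list[str]) -> tuple[int, str]:
    return 2, "The claim about Q4 revenue is not mentioned in any knowledge base document"


outcome = hallucination_gate("Q4 revenue doubled.", fake_retrieve, fake_judge, threshold=3)
```

Injecting the two dependencies keeps the gate logic testable on its own; a real deployment would swap the stubs for the actual search and scoring calls.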

Configuration

The Guardrails block exposes these fields in the workflow editor:

| Field | Type | Options |
|---|---|---|
| Content to validate | Long text input | Free text or wired from upstream block |
| Validation type | Dropdown | Valid JSON, Regex match, Hallucination check, PII detection |
| PII types to detect | Text input | Comma-separated entity types (PERSON, EMAIL_ADDRESS, US_SSN, …) |
| PII action | Dropdown | Block request, Mask PII |
| Detection method | Dropdown | Regex only (fast), LLM only (AI-powered), Hybrid (regex + LLM) |
| Language | Dropdown | English, Spanish, Italian, Polish, Finnish |

Pairing with the Evaluator block

The Guardrails block is fail-closed: if validation fails, the workflow takes the failure branch. The Evaluator block only scores: it returns a score that the workflow can branch on however it wants. Use Guardrails for hard gates and Evaluator for quality checks that drive retries or fallback paths.

Guardrails is not LLM-specific: it can run on any text input or output. For instance, validate a user-uploaded document for PII before ingesting it into a knowledge base.
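
The difference between a hard gate and a score-driven branch can be shown in a small control-flow sketch. This is not how the workflow engine is implemented; `run_step`, the branch names, and the retry limit are all assumptions chosen for illustration.

```python
from typing import Callable


def run_step(
    generate: Callable[[int], str],
    guardrail: Callable[[str], bool],
    evaluate: Callable[[str], float],
    min_score: float,
    max_attempts: int = 3,
) -> dict:
    """Hard-gate every draft with a guardrail, then retry on low evaluator scores."""
    for attempt in range(max_attempts):
        draft = generate(attempt)
        if not guardrail(draft):
            # Fail closed: a guardrail failure goes straight to the failure branch.
            return {"branch": "failure", "output": None}
        if evaluate(draft) >= min_score:
            return {"branch": "success", "output": draft}
        # Score too low: loop and try again.
    return {"branch": "fallback", "output": None}


drafts = ["short", "a longer, better answer"]
outcome = run_step(
    generate=lambda i: drafts[min(i, 1)],
    guardrail=lambda text: "SSN" not in text,
    evaluate=lambda text: len(text) / 10,  # toy quality score
    min_score=2.0,
)
```

The guardrail result is binary and ends the step immediately, while the evaluator score only decides whether to retry or fall back.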

How we benchmark the platform DLP engine

Beyond the Guardrails block, Scrydon runs a platform-level DLP engine on every platform capability call (LLM completions, transcription, OCR, …) — scanning for PII entities and clearance-egress violations (classified content leaving toward a lower-cleared destination). Detection-quality claims are easy to make and hard to verify, so we measure this engine the same way we measure retrieval quality: against a public, third-party benchmark, with the harness in CI as a release gate — a release that regresses below the floors does not ship.

Measured on a 2,661-prompt English slice of ai4privacy/pii-masking-300k (pinned dataset revision, evaluated at build time — the dataset itself is never redistributed) plus an in-repo clearance-escalation bank:

| Check | Result | Release floor |
|---|---|---|
| PII recall (graded entity types) | 95.7% | ≥ 90% |
| PII precision | 79.8% | ≥ 75% |
| Clearance egress, confidential tier recall | 100% | ≥ 95% |
| Clearance egress, secret tier recall | 100% | ≥ 99% |

Per graded entity type: email 99.6%, IP address 99.5%, phone number 86.1% recall.
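
For readers less familiar with the metrics, the sketch below shows the standard precision and recall formulas and a floor check of the kind a CI release gate might run. The key names and the `release_failures` helper are hypothetical, and the actual harness is not shown in this page; the numbers in `FLOORS` and `RESULTS` are copied from the table above.

```python
def precision(tp: int, fp: int) -> float:
    """Share of reported detections that were correct."""
    return tp / (tp + fp)


def recall(tp: int, fn: int) -> float:
    """Share of true entities that were detected."""
    return tp / (tp + fn)


# Floors and results from the table above, in percent (key names are made up).
FLOORS = {
    "pii_recall": 90.0,
    "pii_precision": 75.0,
    "egress_confidential_recall": 95.0,
    "egress_secret_recall": 99.0,
}
RESULTS = {
    "pii_recall": 95.7,
    "pii_precision": 79.8,
    "egress_confidential_recall": 100.0,
    "egress_secret_recall": 100.0,
}


def release_failures(results: dict, floors: dict) -> list[str]:
    """Return the checks that fall below their floor; an empty list means ship."""
    # A missing result counts as 0.0, so it fails its floor rather than passing silently.
    return [name for name, floor in floors.items() if results.get(name, 0.0) < floor]


failures = release_failures(RESULTS, FLOORS)
```

With the reported numbers `failures` is empty, so the gate would let the release through; any single check dropping below its floor is enough to block it.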

What we don't claim. The report carries a mandatory out-of-scope section listing detection categories the engine does not cover, with measured evidence rather than silence: free-text person names (NER territory — the regex heuristic measured 14% recall, so we do not grade it), free-form date-of-birth formats (3% — the engine claims ISO 8601 only), and format-generic national identifiers (27% — the engine claims specific jurisdictional formats such as US_SSN or BE_NRN, not arbitrary digit runs). Each is documented in the signed report artifact so auditors see the boundary, not a marketing number.
