BYO Database
Point the Scrydon platform at a managed Postgres service (RDS, Cloud SQL, Azure Database for PostgreSQL, Supabase, Neon) instead of the bundled in-cluster Postgres.
Scope: replacing the in-cluster Postgres deployed by infra.db with a customer-managed instance. Covers AWS RDS, GCP Cloud SQL, Azure Database for PostgreSQL (Flexible Server), Supabase, Neon, CloudNativePG, the Zalando Postgres Operator, or any other Postgres ≥ 15 with pgvector.
Related: Backup & Restore (the bundled backup CronJob is scoped to the in-cluster DB only; managed-DB users should rely on the provider's native PITR).
Why
The bundled infra.db is a single-pod Postgres backed by a PVC — fine for proofs of concept and single-node clusters, but it lacks point-in-time recovery, synchronous replication, automated failover, and regional snapshot replication. Production customers and most compliance audits (SOC 2, ISO 27001 A.8.13, SecNumCloud 3.2, NIST CP-9/CP-10) expect a managed service or a dedicated DBA-operated cluster.
Prerequisites
| Requirement | Notes |
|---|---|
| Postgres 15 or newer | Drizzle migrations use generated columns and jsonb ops introduced in 15. |
| pgvector extension available | RDS ≥ 15.3 ships it; Azure Flexible Server ≥ 15 requires azure.extensions=VECTOR; Cloud SQL 15+ has it enabled; Supabase and Neon both ship it. |
| Databases pre-created | One per enabled component. With chart defaults that is five: auth, agentic, analytics, cortex, ontology. See Pre-create databases and extension. |
| A migration role | Needs CREATE TABLE, CREATE INDEX, CREATE EXTENSION on each database. On RDS this usually requires the rds_superuser-equivalent role; on Cloud SQL the default postgres user is sufficient. |
| TLS enforced | Managed providers terminate TLS at the server. Use sslmode=require minimum; verify-full if you can mount the provider's CA bundle. |
| Connection pool sizing | Each Scrydon app pod + Dapr sidecar opens several connections. For small-tier managed DBs (< 100 max_connections) put a PgBouncer or the provider's built-in pooler in front. |
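The version and extension rows of the table can be checked from any machine with psql before installing. The following is a minimal sketch, not part of the chart: it assumes ADMIN_URL holds an admin connection string for the managed instance, and pg_version_ok is a hypothetical helper name. It reads server_version_num and looks for vector in pg_available_extensions.

```shell
# Sketch only: preflight checks for the version and pgvector prerequisites.
# ADMIN_URL (assumption) is an admin DSN for the managed instance.

# Returns 0 when a server_version_num value (e.g. 150004) is Postgres 15 or newer.
pg_version_ok() {
  [ "$1" -ge 150000 ]
}

# Only query the server when ADMIN_URL is set, so the helper can be sourced safely.
if [ -n "${ADMIN_URL:-}" ]; then
  ver=$(psql "$ADMIN_URL" -Atc 'SHOW server_version_num')
  if pg_version_ok "$ver"; then
    echo "Postgres version OK ($ver)"
  else
    echo "Postgres too old ($ver), need 15 or newer" >&2
  fi

  # pg_available_extensions lists the extensions the server is able to install.
  has_vector=$(psql "$ADMIN_URL" -Atc \
    "SELECT count(*) FROM pg_available_extensions WHERE name = 'vector'")
  if [ "$has_vector" = "1" ]; then
    echo "pgvector available"
  else
    echo "pgvector not available on this instance" >&2
  fi

  # Useful input for the pool-sizing row of the table.
  echo "max_connections: $(psql "$ADMIN_URL" -Atc 'SHOW max_connections')"
fi
```

On providers that gate extensions behind an allow-list (such as azure.extensions), an extension may be listed as available and still fail at CREATE EXTENSION time, so treat a positive result as necessary rather than sufficient.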
Pre-create databases and extension
Run once, from any client that can reach your managed DB. The default chart enables every component, which maps to five databases; drop the lines for any service you've disabled in your values file.
psql "$ADMIN_URL" <<'SQL'
CREATE DATABASE auth;
CREATE DATABASE agentic;
CREATE DATABASE analytics; -- only if analytics.enabled (default true)
CREATE DATABASE cortex; -- only if cortex.enabled (default true)
CREATE DATABASE ontology; -- only if apiOntology.enabled (default true)
\c auth
CREATE EXTENSION IF NOT EXISTS vector;
\c agentic
CREATE EXTENSION IF NOT EXISTS vector;
\c cortex
CREATE EXTENSION IF NOT EXISTS vector;
SQL
The bundled chart creates these automatically via an initdb script on the in-cluster Postgres; for managed DBs you must do it yourself. analytics and ontology do not require pgvector; auth, agentic, and cortex do.
Why this must happen before helm install: the chart runs per-service Drizzle migration Jobs as post-install hooks against DATABASE_URL directly, and drizzle-kit migrate does not create the database itself. A missing database surfaces as auth-migration / agentic-migration / cortex-migration / api-ontology-migration Jobs in CrashLoopBackOff shortly after install.
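Because a missing database only shows up after install as a failing migration Job, it can help to confirm the databases exist first. This is a sketch under assumptions: ADMIN_URL is the same admin DSN used above, EXPECTED should be trimmed to your enabled components, and missing_dbs is a hypothetical helper, not something the chart provides.

```shell
# Prints each expected database name that is absent from the existing list.
# $1: space-separated expected names; $2: whitespace-separated existing names.
missing_dbs() {
  for db in $1; do
    found=no
    for have in $2; do
      [ "$db" = "$have" ] && found=yes
    done
    [ "$found" = yes ] || echo "$db"
  done
}

# Chart-default component databases; trim to the components you enabled.
EXPECTED="auth agentic analytics cortex ontology"

# Only contact the server when ADMIN_URL (assumption) is set.
if [ -n "${ADMIN_URL:-}" ]; then
  existing=$(psql "$ADMIN_URL" -Atc 'SELECT datname FROM pg_database')
  missing=$(missing_dbs "$EXPECTED" "$existing")
  if [ -n "$missing" ]; then
    echo "Missing databases:" $missing >&2
    exit 1
  fi
  echo "All expected databases exist"
fi
```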
Configuration
The chart supports two modes. Pick one per app — when both are set for an app, infra.db.external.existingSecrets.<app> wins over infra.db.external.<app>Url.
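The precedence rule can be pictured with a small function. This only models the documented behaviour; it is not the chart's template logic, and dsn_source is a hypothetical name.

```shell
# Prints which source supplies an app's DSN under the documented precedence.
# $1: value of infra.db.external.existingSecrets.<app> (may be empty)
# $2: value of infra.db.external.<app>Url (may be empty)
dsn_source() {
  if [ -n "$1" ]; then
    echo "secret:$1"   # Mode A wins whenever a Secret name is set
  elif [ -n "$2" ]; then
    echo "inline"      # Mode B applies only when no Secret is referenced
  else
    echo "unset"
  fi
}

# Both set for auth: the existing Secret is used and the inline URL is ignored.
dsn_source "scrydon-db-auth" "postgres://scrydon:REDACTED@pg.example.com:5432/auth"
```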
Mode A — external Secrets (recommended)
Keep DSNs out of values files and CI logs. Pre-create one Secret per app, each in the app's target namespace, each carrying a single DSN under the key that app expects. The chart adds each as a second envFrom on the corresponding deployments.
1. Create one Secret per enabled component (five with chart defaults):
NS=scrydon-platform # adjust if you use split namespaces
# auth → key DATABASE_URL → auth DB
kubectl create secret generic scrydon-db-auth -n "$NS" \
--from-literal=DATABASE_URL='postgres://scrydon:REDACTED@pg.example.com:5432/auth?sslmode=require'
# agentic → key DATABASE_URL → agentic DB
kubectl create secret generic scrydon-db-agentic -n "$NS" \
--from-literal=DATABASE_URL='postgres://scrydon:REDACTED@pg.example.com:5432/agentic?sslmode=require'
# analytics → key DATABASE_URL_ANALYTICS → analytics DB
kubectl create secret generic scrydon-db-analytics -n "$NS" \
--from-literal=DATABASE_URL_ANALYTICS='postgres://scrydon:REDACTED@pg.example.com:5432/analytics?sslmode=require'
# cortex → key DATABASE_URL_CORTEX → cortex DB
kubectl create secret generic scrydon-db-cortex -n "$NS" \
--from-literal=DATABASE_URL_CORTEX='postgres://scrydon:REDACTED@pg.example.com:5432/cortex?sslmode=require'
# ontology → key DATABASE_URL_ONTOLOGY → ontology DB
kubectl create secret generic scrydon-db-ontology -n "$NS" \
--from-literal=DATABASE_URL_ONTOLOGY='postgres://scrydon:REDACTED@pg.example.com:5432/ontology?sslmode=require'
Why one Secret per app: each app reads its DSN env var (DATABASE_URL, DATABASE_URL_ANALYTICS, DATABASE_URL_CORTEX, DATABASE_URL_ONTOLOGY) via envFrom, which merges every key of the referenced Secret into the container env. Sharing one Secret would force apps that read DATABASE_URL (auth, agentic) to see the same value and point at the same database, which is wrong. Per-app Secrets keep each app pinned to its own database while allowing all apps to share a namespace.
2. Install the chart:
# values.customer.yaml
infra:
  db:
    enabled: false
    tls:
      enabled: false
    backup:
      enabled: false
    external:
      existingSecrets:
        auth: scrydon-db-auth
        agentic: scrydon-db-agentic
        analytics: scrydon-db-analytics
        cortex: scrydon-db-cortex
        ontology: scrydon-db-ontology
helm upgrade --install scrydon oci://scrydonops.azurecr.io/scrydon/charts/scrydon \
--version <version> \
--namespace scrydon-platform \
-f values.customer.yaml
If you manage secrets via External Secrets Operator, Sealed Secrets, or SOPS, make sure the resulting Secret has the key its app expects (DATABASE_URL, DATABASE_URL_ANALYTICS, DATABASE_URL_CORTEX, or DATABASE_URL_ONTOLOGY) in the right namespace; the chart never inspects how the Secret got there.
You may also mix modes per app (e.g. existingSecrets.auth for auth, agenticUrl for agentic) — Mode A wins per app when both are set.
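The per-app key mapping from step 1 can be sketched as a loop, which avoids copy-paste mistakes in the key names. PG_HOST and PG_PASSWORD are hypothetical variables assumed to be exported by the caller, and dsn_key_for is an illustrative helper; the Secret names and keys are the ones used above.

```shell
# Maps a component name to the Secret key its app reads.
dsn_key_for() {
  case "$1" in
    auth|agentic) echo "DATABASE_URL" ;;
    analytics)    echo "DATABASE_URL_ANALYTICS" ;;
    cortex)       echo "DATABASE_URL_CORTEX" ;;
    ontology)     echo "DATABASE_URL_ONTOLOGY" ;;
    *)            echo "unknown component: $1" >&2; return 1 ;;
  esac
}

# Runs only when the (hypothetical) PG_HOST and PG_PASSWORD variables are set.
if [ -n "${PG_HOST:-}" ] && [ -n "${PG_PASSWORD:-}" ]; then
  NS=${NS:-scrydon-platform}
  for app in auth agentic analytics cortex ontology; do
    key=$(dsn_key_for "$app")
    # --dry-run=client -o yaml prints the manifest instead of creating it;
    # pipe the output to `kubectl apply -f -` once it looks right.
    kubectl create secret generic "scrydon-db-$app" -n "$NS" \
      --from-literal="$key=postgres://scrydon:${PG_PASSWORD}@${PG_HOST}:5432/$app?sslmode=require" \
      --dry-run=client -o yaml
  done
fi
```

The dry-run output contains the base64-encoded DSN, so keep it out of CI logs. A password containing characters such as @ or / needs to be URL-encoded before it goes into the DSN.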
Mode B — inline DSNs
Simpler — one values block, no pre-created Secrets. DSNs end up in the chart-managed secrets-<app> Secret inside the cluster. Fine for staging; avoid in production (DSNs sit in values files, CI logs, and helm get values output).
# values.customer.yaml
infra:
  db:
    enabled: false
    tls:
      enabled: false
    backup:
      enabled: false
    external:
      authUrl: "postgres://scrydon:REDACTED@pg.example.com:5432/auth"
      agenticUrl: "postgres://scrydon:REDACTED@pg.example.com:5432/agentic"
      analyticsUrl: "postgres://scrydon:REDACTED@pg.example.com:5432/analytics"
      cortexUrl: "postgres://scrydon:REDACTED@pg.example.com:5432/cortex"
      ontologyUrl: "postgres://scrydon:REDACTED@pg.example.com:5432/ontology"
      sslMode: require # appended as ?sslmode=require to each DSN
What still needs to stay in-cluster
| Component | Why |
|---|---|
| Dapr crypto master key (dapr-crypto-key Secret) | Encrypts secret values written by the Scrydon secrets layer before they reach Postgres. If lost, encrypted secrets in a restored managed DB are unrecoverable. Back this up off-cluster (see Backup & Restore). |
| StarRocks (if enabled) | Analytics OLAP engine — has its own backup tooling; see StarRocks docs. |
| Object storage / uploads PVC | pvc-uploads is independent of infra.db and is not affected by BYO Postgres. |
| Kubernetes Secrets holding app-specific values (AUTH_SECRET, API tokens, Dapr app-api tokens) | Chart-managed as today; Mode A only replaces the DATABASE_URL line. |
Migrations
The chart runs one Helm-hook Job per enabled component — auth-migration, agentic-migration, and (when enabled) analytics-migration, cortex-migration, api-ontology-migration — as post-install / pre-upgrade hooks. Each one sources its DSN from the same Secret the runtime uses, so they apply against your managed DB.
The wait-for-db init container that ping-tests the in-cluster service FQDN is skipped when infra.db.enabled=false — external DB readiness is the provider's responsibility.
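Since the chart no longer waits for the database in this mode, you may want your own readiness gate before running helm. A minimal sketch, assuming DATABASE_URL holds one of your DSNs and the Postgres client tools are installed; retry is a hypothetical helper. pg_isready only reports whether the server accepts connections and does not validate credentials.

```shell
# Retries a command until it succeeds or the attempt budget runs out.
# $1: max attempts; $2: seconds between attempts; remaining args: the command.
retry() {
  attempts=$1; delay=$2; shift 2
  n=1
  while ! "$@"; do
    if [ "$n" -ge "$attempts" ]; then
      return 1
    fi
    n=$((n + 1))
    sleep "$delay"
  done
}

# pg_isready ships with the Postgres client tools; -d accepts a connection URI.
if [ -n "${DATABASE_URL:-}" ]; then
  retry 30 2 pg_isready -d "$DATABASE_URL" || {
    echo "database not reachable after 30 attempts" >&2
    exit 1
  }
fi
```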
To re-run all migrations manually after a values change:
kubectl delete job -l 'app in (auth-migration,agentic-migration,analytics-migration,cortex-migration,api-ontology-migration)' -n scrydon-platform
helm upgrade scrydon oci://scrydonops.azurecr.io/scrydon/charts/scrydon \
--version <version> \
--namespace scrydon-platform \
-f values.customer.yaml
Verification
# Apps should be Ready with Dapr sidecars injected (2/2). The chart defaults
# every service to scrydon-platform; add your namespaces.* targets if you split.
kubectl get pods -n scrydon-platform
# No in-cluster Postgres should exist
kubectl get deploy db -n scrydon-platform
# Expected: Error from server (NotFound)
# Migration Jobs should have run successfully
kubectl get jobs -n scrydon-platform
# auth-migration-<rev>, agentic-migration-<rev>, and (when enabled)
# analytics-migration-<rev>, cortex-migration-<rev>, api-ontology-migration-<rev> — Complete
# A running app pod should have DATABASE_URL pointing at the managed host
kubectl exec deploy/api-platform -n scrydon-platform -c api-platform -- \
sh -c 'echo "$DATABASE_URL" | sed -E "s#//[^@]+@#//REDACTED@#"'
Provider notes
AWS RDS / Aurora
- Parameter group: set rds.force_ssl=1.
- Extension: pgvector available from Postgres 15.3 on RDS and Aurora. Install via CREATE EXTENSION vector; from an rds_superuser-owned role.
- Role: create a least-privilege role (CREATE ROLE scrydon LOGIN PASSWORD '…' CREATEDB;) and grant it per-DB ownership.
- IAM auth: not supported by the chart today; use password auth.
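One way to read "grant it per-DB ownership" is an ALTER DATABASE ... OWNER TO statement per database; that interpretation is an assumption, as are the ADMIN_URL variable and the ownership_sql helper below. The sketch generates the statements so they can be reviewed before being piped to psql.

```shell
# Emits SQL that hands ownership of each database to the application role.
# $1: role name; remaining args: database names (plain identifiers, unquoted).
ownership_sql() {
  role=$1; shift
  for db in "$@"; do
    printf 'ALTER DATABASE %s OWNER TO %s;\n' "$db" "$role"
  done
}

# Apply only when ADMIN_URL (assumption) is set; ON_ERROR_STOP aborts on the first failure.
if [ -n "${ADMIN_URL:-}" ]; then
  ownership_sql scrydon auth agentic analytics cortex ontology \
    | psql "$ADMIN_URL" -v ON_ERROR_STOP=1
fi
```

Changing a database's owner requires the connected user to be able to assume the new owner role, so on RDS the admin user may first need to be granted membership in scrydon.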
GCP Cloud SQL
- cloudsql.iam_authentication=on flag is optional; chart uses password auth.
- pgvector enabled by default on Postgres 15+ Enterprise Plus; on standard tiers run CREATE EXTENSION vector; once.
- Private IP: run the cluster in the same VPC as your Cloud SQL instance, or front it with a Cloud SQL Auth Proxy sidecar.
Azure Database for PostgreSQL (Flexible Server)
- Server parameter: add VECTOR to azure.extensions before the first CREATE EXTENSION.
- Networking: private DNS zone + VNet peering is recommended; alternatively open firewall rules for the AKS egress IP.
- CA bundle: Azure publishes a combined CA bundle. Mount it and use sslmode=verify-full&sslrootcert=/etc/ssl/azure-ca.pem for full verification.
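The verify-full parameters from the CA bundle item have to be joined to the DSN with ? or &, depending on whether the DSN already has a query string. A small sketch of that step; with_verify_full is a hypothetical helper, and the CA path is the example path from the list above.

```shell
# Appends libpq TLS parameters to a DSN, choosing "?" or "&" as the separator.
# $1: DSN; $2: path where the CA bundle is mounted inside the pod.
# Note: this does not replace an sslmode parameter that is already present.
with_verify_full() {
  case "$1" in
    *\?*) sep='&' ;;   # DSN already has a query string
    *)    sep='?' ;;
  esac
  printf '%s%ssslmode=verify-full&sslrootcert=%s\n' "$1" "$sep" "$2"
}

with_verify_full 'postgres://scrydon:REDACTED@pg.example.com:5432/auth' /etc/ssl/azure-ca.pem
```

Start from a DSN without sslmode when using this, so the result does not carry two conflicting sslmode values.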
Supabase / Neon
- pgvector enabled by default.
- Connection string format (pooled, transaction mode): postgres://postgres.<project>:<pass>@aws-0-<region>.pooler.supabase.com:6543/postgres. Verify your migrations run via the pooled connection; some DDL may require the direct (5432) connection.
- Neon's auto-suspend will freeze idle branches; set a keepalive or use a paid tier for platforms that idle between workflow runs.
Restore onto managed Postgres
To restore a pg_dump --format=custom archive produced by the bundled backup CronJob (see Backup & Restore):
# decompress if you used gzip
gunzip -k /backups/auth-20260416T020000Z.sql.gz
# restore into the managed DB (run from any client with network access)
pg_restore -d "$DATABASE_URL" --no-owner --no-privileges \
/backups/auth-20260416T020000Z.sql
--no-owner --no-privileges is important: role names on the managed DB usually differ from the bundled Postgres (postgres vs scrydon / rds_superuser / cloudsqlsuperuser).
After restore, roll the app pods so they pick up fresh connections:
kubectl rollout restart deployment -n scrydon-platform
Troubleshooting
CREATE EXTENSION vector fails: the managed provider does not have pgvector on your current Postgres version / tier. Upgrade the instance; no workaround.
App pods CrashLoopBackOff with ECONNREFUSED or SSL required: the DSN host/port is wrong, TLS is mandatory but sslmode is missing, or a network policy blocks egress to the DB. Check the pod logs with kubectl logs, then inspect the DSN value with the kubectl exec command under Verification.
Migrations hang: the wait-for-db init container is skipped, so the Job goes straight to migrating. If it hangs, the DSN is wrong or the role lacks CREATE privileges. Inspect kubectl logs job/auth-migration-<rev>.
Dapr workflow state table not found: the Dapr Workflows component auto-creates its schema on first start. If you are replacing the bundled DB with an existing managed DB that was used elsewhere, ensure the tablePrefix values (dapr_workflow_state, analytics_dapr_workflow_state) do not collide.
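For the ECONNREFUSED / SSL required case, the host, port, and sslmode parts of a DSN can be inspected separately. The helpers below are a hypothetical sketch: they handle simple postgres:// URIs only (no IPv6 literals, and credentials must already be URL-encoded), and the live check assumes DATABASE_URL is set and pg_isready is installed.

```shell
# Prints "host port" for a postgres:// DSN; the port defaults to 5432 when absent.
dsn_host_port() {
  hostport=$(printf '%s\n' "$1" | sed -E 's#^[a-z]+://([^@/]*@)?([^/?]+).*#\2#')
  case "$hostport" in
    *:*) echo "${hostport%%:*} ${hostport##*:}" ;;
    *)   echo "$hostport 5432" ;;
  esac
}

# Returns 0 when the DSN carries an explicit sslmode parameter.
dsn_has_sslmode() {
  case "$1" in
    *\?sslmode=*|*\&sslmode=*) return 0 ;;
    *) return 1 ;;
  esac
}

# Live check, run from a location with the same network path as the pods.
if [ -n "${DATABASE_URL:-}" ]; then
  set -- $(dsn_host_port "$DATABASE_URL")
  echo "host=$1 port=$2"
  dsn_has_sslmode "$DATABASE_URL" || echo "warning: DSN has no sslmode parameter" >&2
  pg_isready -h "$1" -p "$2" || echo "server did not report ready from this location" >&2
fi
```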
Compliance mapping
- ISO 27001:2022 A.8.13 (information backup): managed provider's native backup satisfies this; set infra.db.backup.enabled to false.
- ISO 27001:2022 A.8.24 (TLS in transit): sslmode=require minimum, verify-full preferred.
- NIST CP-9 / CP-10 (backup & restore): managed PITR coverage; document the provider's RPO/RTO in your BCP.
- SecNumCloud 3.2 (continuity): requires regional or multi-AZ replication, which the bundled in-cluster Postgres cannot deliver. BYO managed Postgres is the supported path.