Upgrade runbook

The order of operations for an in-place Scrydon upgrade.

This runbook covers an in-place upgrade of a running Scrydon cluster. For major-version upgrades that include breaking changes, follow the version's release notes in addition to this runbook.

Pre-upgrade

  1. Read the release notes. Note any breaking changes, required Helm values changes, or schema migrations that need manual steps.
  2. Verify the artefacts. Run cosign verify against the new images and chart. See Supply-chain verification.
  3. Take a PostgreSQL snapshot. If anything goes wrong, this is your rollback path. See Backup & restore.
  4. Check the license. Confirm your license has at least 90 days remaining via Settings → License in the platform UI. If not, rotate it first. See License rotation.
  5. Confirm the audit forwarder is healthy. Upgrades emit a flood of events; you want them captured.
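
The 90-day check in step 4 is simple date arithmetic. A minimal sketch, assuming you have read the expiry date from Settings → License by hand and that GNU date is available; days_until and license_ok are hypothetical helper names, not Scrydon tooling:

```shell
# days_until EXPIRY [NOW]: whole days from NOW (default: the current time) until EXPIRY.
# Dates are YYYY-MM-DD and are interpreted as UTC. Requires GNU date (-d).
days_until() {
  local expiry_s now_s
  expiry_s=$(date -u -d "$1" +%s) || return 1
  now_s=$(date -u -d "${2:-today}" +%s) || return 1
  echo $(( (expiry_s - now_s) / 86400 ))
}

# license_ok EXPIRY [NOW]: succeeds when at least 90 days remain.
license_ok() {
  local days
  days=$(days_until "$1" "${2:-}") || return 1
  [ "$days" -ge 90 ]
}

# Usage (the expiry date here is a placeholder):
#   license_ok 2026-01-31 || echo "Rotate the license before upgrading"
```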

Upgrade command

Scrydon ships as a single Helm chart (oci://scrydonops.azurecr.io/scrydon/charts/scrydon). One helm upgrade rolls every service. The chart's per-service pre-upgrade hook Jobs handle migration ordering (db-ensure-databases at weight 1, auth-migration at weight 5, every other migration Job at weight 6), so operators do not have to sequence sub-charts by hand. Helm runs hooks in ascending weight order: the databases are ensured first, then the auth migration runs, then the remaining migrations.

helm upgrade scrydon oci://scrydonops.azurecr.io/scrydon/charts/scrydon \
  --version <new-version> \
  --namespace scrydon-platform \
  --values values.customer.yaml \
  --wait --timeout 15m

# Verify rollout
kubectl rollout status deployment/api-platform -n scrydon-platform
kubectl rollout status deployment/agentic -n scrydon-platform     # (or scrydon-agentic if you split namespaces)
kubectl rollout status deployment/analytics -n scrydon-platform   # if analytics.enabled

If you've explicitly split namespaces via namespaces.* overrides, point each kubectl rollout status at the matching namespace. The default chart layout keeps everything in scrydon-platform.
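
The rollout checks above can be wrapped in a small loop so that one failed Deployment does not hide the others. This is a sketch only: check_rollouts is a hypothetical helper, and the 5-minute timeout is an assumed value rather than a chart default.

```shell
# check_rollouts NAMESPACE DEPLOYMENT...
# Runs kubectl rollout status for each Deployment and returns non-zero
# if any of them did not finish rolling out within the timeout.
check_rollouts() {
  local ns dep failed
  ns=$1
  shift
  failed=0
  for dep in "$@"; do
    if ! kubectl rollout status "deployment/$dep" -n "$ns" --timeout=5m; then
      echo "rollout not complete: $dep" >&2
      failed=1
    fi
  done
  return "$failed"
}

# Usage with the default single-namespace layout:
#   check_rollouts scrydon-platform api-platform agentic analytics
```

With split namespaces, call the helper once per namespace with the Deployments that live there.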

Watch the migration

Migrations run as Helm pre-upgrade hook Jobs (auth-migration-<rev>, agentic-migration-<rev>, analytics-migration-<rev>, cortex-migration-<rev>, api-ontology-migration-<rev>). Each Job sets backoffLimit: 3 and writes structured logs.

kubectl get jobs -n scrydon-platform | grep migration
kubectl logs job/auth-migration-<rev> -n scrydon-platform
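
For a one-line-per-Job view, the Job status counters can be summarised as in the sketch below. migration_job_summary is a hypothetical helper, and the state labels are its own invention; it assumes the hook Jobs still exist when you query them, which depends on the chart's hook deletion policy.

```shell
# migration_job_summary [NAMESPACE]
# Prints "<job>: succeeded|failing|in-progress" for every Job whose name
# contains "migration", based on .status.succeeded and .status.failed.
migration_job_summary() {
  kubectl get jobs -n "${1:-scrydon-platform}" \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.succeeded}{"\t"}{.status.failed}{"\n"}{end}' |
    awk -F'\t' '$1 ~ /migration/ {
      state = ($2 + 0 > 0) ? "succeeded" : (($3 + 0 > 0) ? "failing" : "in-progress")
      print $1 ": " state
    }'
}

# Usage:
#   migration_job_summary scrydon-platform
```

A Job reported as failing has at least one failed pod but may still succeed on a retry, up to its backoff limit.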

The migration tracking table inside each database is drizzle.__drizzle_migrations:

kubectl exec -it deploy/db -n scrydon-platform -- \
  psql -U postgres -d auth \
  -c "SELECT id, hash, created_at FROM drizzle.__drizzle_migrations ORDER BY id DESC LIMIT 5;"

If a migration fails, see Database migrations → Handling a migration failure.

Post-upgrade verification

After all rollouts complete, verify the platform is healthy:

  1. Sign in. Confirm SSO still works.
  2. Open a workflow. Confirm the editor loads and the integration list is correct.
  3. Run a known-good workflow. Confirm the run completes with the same output as before.
  4. Check the audit log. Confirm new events are landing.
  5. Check the SIEM. Confirm forwarding caught up after the rollout.
  6. Check metrics. Compare the post-upgrade dashboards to their pre-upgrade baseline; significant changes need investigation.

Rollback

Rolling back the application is safe, but schema migrations are forward-only: they are not down-migrated. If the upgrade applied a migration, rolling the application back is necessary but not sufficient, and you may need to restore the database snapshot you took pre-upgrade.

For a clean rollback:

# Roll the single Scrydon release back to the previous revision
helm rollback scrydon <previous-revision> -n scrydon-platform

# Confirm the revision moved
helm list -n scrydon-platform
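
The <previous-revision> value can be read from the release history. The sketch below picks the revision immediately before the latest one from the helm history table; previous_revision is a hypothetical helper, and it assumes the default table output with the revision number in the first column.

```shell
# previous_revision RELEASE NAMESPACE
# Prints the revision just before the most recent one, or nothing
# if the release has only one revision.
previous_revision() {
  helm history "$1" -n "$2" |
    awk 'NR > 1 { prev = cur; cur = $1 } END { if (prev != "") print prev }'
}

# Usage:
#   helm history scrydon -n scrydon-platform      # check the STATUS column first
#   helm rollback scrydon "$(previous_revision scrydon scrydon-platform)" -n scrydon-platform
```

Treat the result as a candidate only: if an earlier upgrade attempt also failed, the revision you want may be further back in the history.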

If migrations changed schema in a way the previous app version can't read, restore from the pre-upgrade PostgreSQL snapshot.

Always validate the rollback with the same post-upgrade verification steps before declaring the rollback complete.
