Upgrade runbook
The order of operations for an in-place Scrydon upgrade.
This runbook covers an in-place upgrade of a running Scrydon cluster. For major-version upgrades that include breaking changes, follow the version's release notes in addition to this runbook.
Pre-upgrade
- Read the release notes. Note any breaking changes, required Helm values changes, or schema migrations that need manual steps.
- Verify the artefacts. Run cosign verify against the new images and chart. See Supply-chain verification.
- Take a PostgreSQL snapshot. If anything goes wrong, this is your rollback path. See Backup & restore.
- Check the license. Confirm your license has at least 90 days remaining via Settings → License in the platform UI. If not, rotate it first. See License rotation.
- Confirm the audit forwarder is healthy. Upgrades emit a flood of events; you want them captured.
Upgrade command
Scrydon ships as a single Helm chart (oci://scrydonops.azurecr.io/scrydon/charts/scrydon). One helm upgrade rolls every service. The chart's per-service pre-upgrade hook Jobs handle migration ordering (db-ensure-databases at weight 1, auth-migration at weight 5, every other migration Job at weight 6) — operators do not have to sequence sub-charts by hand.
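To make the ordering concrete, here is a minimal sketch of how ascending hook weights translate into execution order. The weights are the ones listed above. The helper name hook_order and its "<weight> <job-name>" input format are illustrative assumptions, not part of the chart, and agentic-migration stands in for "every other migration Job".

```shell
#!/bin/sh
# Illustrative only: Helm runs hooks of the same type in ascending weight order.
# hook_order is a hypothetical helper; input lines are "<weight> <job-name>".
hook_order() {
  sort -n -k1,1 | awk '{ print $2 }'
}

# Weights taken from the chart description; input order is deliberately shuffled.
printf '%s\n' \
  '6 agentic-migration' \
  '1 db-ensure-databases' \
  '5 auth-migration' | hook_order
```

The sketch prints the Job names in the order the hooks would run: database creation first, then the auth migration, then the remaining migrations.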
helm upgrade scrydon oci://scrydonops.azurecr.io/scrydon/charts/scrydon \
--version <new-version> \
--namespace scrydon-platform \
--values values.customer.yaml \
--wait --timeout 15m
# Verify rollout
kubectl rollout status deployment/api-platform -n scrydon-platform
kubectl rollout status deployment/agentic -n scrydon-platform # (or scrydon-agentic if you split namespaces)
kubectl rollout status deployment/analytics -n scrydon-platform # if analytics.enabled
If you've explicitly split namespaces via namespaces.* overrides, point each kubectl rollout status at the matching namespace. The default chart layout keeps everything in scrydon-platform.
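The rollout checks can also be wrapped in a loop. The following is a sketch under stated assumptions: the NAMESPACE, DEPLOYMENTS, and DRY_RUN variables and the check_rollouts function are hypothetical names introduced here, and the 5m timeout is an arbitrary choice. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: check every deployment's rollout in one loop.
# NAMESPACE and DEPLOYMENTS are assumptions; adjust them to your layout.
# Drop "analytics" from the list if analytics.enabled is false.
NAMESPACE="${NAMESPACE:-scrydon-platform}"
DEPLOYMENTS="${DEPLOYMENTS:-api-platform agentic analytics}"

# With DRY_RUN=1 (the default here) commands are printed instead of executed.
check_rollouts() {
  for d in $DEPLOYMENTS; do
    set -- kubectl rollout status "deployment/$d" -n "$NAMESPACE" --timeout=5m
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "$*"
    else
      "$@" || return 1   # stop at the first deployment that fails to roll out
    fi
  done
}

check_rollouts
```

Set DRY_RUN=0 to execute the commands against the cluster. If you split namespaces, this single-namespace loop no longer applies as written; run it once per namespace with the matching deployment list.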
Watch the migration
Migrations run as Helm pre-upgrade hook Jobs (auth-migration-<rev>, agentic-migration-<rev>, analytics-migration-<rev>, cortex-migration-<rev>, api-ontology-migration-<rev>). Each Job is configured with backoffLimit: 3 and writes structured logs.
kubectl get jobs -n scrydon-platform | grep migration
kubectl logs job/auth-migration-<rev> -n scrydon-platform
The migration tracking table inside each database is drizzle.__drizzle_migrations:
kubectl exec -it deploy/db -n scrydon-platform -- \
psql -U postgres -d auth \
-c "SELECT id, hash, created_at FROM drizzle.__drizzle_migrations ORDER BY id DESC LIMIT 5;"
If a migration fails, see Database migrations → Handling a migration failure.
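To spot a stuck or failed migration Job without reading the full Job list, a small filter can help. This is a sketch, not a chart-provided tool: the unfinished_migrations function is a hypothetical name, the sample Job names and counts are made up, and it assumes each migration Job needs exactly one successful completion.

```shell
#!/bin/sh
# Sketch: list migration Jobs that have not (yet) succeeded.
# Expects lines of "<name> <succeeded> <failed>", as produced by e.g.:
#   kubectl get jobs -n scrydon-platform --no-headers \
#     -o custom-columns=NAME:.metadata.name,OK:.status.succeeded,FAILED:.status.failed
# kubectl prints <none> for fields that are unset.
# Assumption: one successful completion per migration Job.
unfinished_migrations() {
  awk '$1 ~ /migration/ && $2 != "1" { print $1 " succeeded=" $2 " failed=" $3 }'
}

# Made-up sample input so the filter can be tried without a cluster.
printf '%s\n' \
  'auth-migration-12     1       <none>' \
  'agentic-migration-12  <none>  2' \
  'db-ensure-databases   1       <none>' | unfinished_migrations
```

A Job that is still running also appears in the output, because its succeeded count is unset until it completes. Empty output means every migration Job in the input has succeeded.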
Post-upgrade verification
After all rollouts complete, verify the platform is healthy:
- Sign in. Confirm SSO still works.
- Open a workflow. Confirm the editor loads and the integration list is correct.
- Run a known-good workflow. Confirm the run completes with the same output as before.
- Check the audit log. Confirm new events are landing.
- Check the SIEM. Confirm forwarding caught up after the rollout.
- Check metrics. Compare the post-upgrade dashboards with the pre-upgrade baseline and investigate any significant change.
Rollback
Rolling the application back is safe, but schema migrations are forward-only: they are not down-migrated. If the upgrade applied a migration, rolling the application back is necessary but not sufficient; you may also need to restore the database snapshot you took pre-upgrade.
For a clean rollback:
# Roll the single Scrydon release back to the previous revision
helm rollback scrydon <previous-revision> -n scrydon-platform
# Confirm the revision moved
helm list -n scrydon-platform
If migrations changed schema in a way the previous app version can't read, restore from the pre-upgrade PostgreSQL snapshot.
Always validate the rollback with the same post-upgrade verification steps before declaring the rollback complete.
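If you are unsure which number to pass as <previous-revision>, one option is to derive it from the table printed by helm history. The following is a sketch under stated assumptions: previous_revision is a hypothetical helper, the sample table (dates, chart versions) is made up, and the filter treats the most recent row before the latest one with status superseded or deployed as the rollback target.

```shell
#!/bin/sh
# Sketch: pick a rollback target from "helm history" table output on stdin.
# Skips the latest revision and prints the newest earlier revision whose
# status is "superseded" or "deployed". It scans every column for the status
# word, so a DESCRIPTION containing one of those exact words would confuse it.
previous_revision() {
  awk '
    $1 ~ /^[0-9]+$/ {
      n++; rev[n] = $1; ok[n] = 0
      for (i = 2; i <= NF; i++)
        if ($i == "superseded" || $i == "deployed") ok[n] = 1
    }
    END {
      for (j = n - 1; j >= 1; j--)
        if (ok[j]) { print rev[j]; exit }
      exit 1   # no usable earlier revision found
    }'
}

# Made-up sample history so the helper can be tried without a cluster.
previous_revision <<'EOF'
REVISION  UPDATED                   STATUS      CHART          APP VERSION  DESCRIPTION
1         Mon Jan  6 10:00:00 2025  superseded  scrydon-1.4.0  1.4.0        Install complete
2         Mon Feb  3 10:00:00 2025  superseded  scrydon-1.5.0  1.5.0        Upgrade complete
3         Mon Mar  3 10:00:00 2025  deployed    scrydon-1.6.0  1.6.0        Upgrade complete
EOF
```

Against a live release the input would come from helm history scrydon -n scrydon-platform piped into the helper. Treat the result as a suggestion and confirm it against the history table before running helm rollback.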
Related
- Upgrades — the higher-level upgrades page.
- Database migrations — migration internals.
- Backup & restore — the disaster-recovery path.