Upgrades

How to upgrade and roll back the Scrydon platform

This page covers upgrading Scrydon for both connected (Helm) and air-gapped (Zarf) deployments. Upgrades are customer-driven: Scrydon does not run an in-cluster agent, so an upgrade happens only when you run helm upgrade (connected) or zarf package deploy (air-gapped).

Pre-Upgrade Checklist

Before upgrading, complete these steps:

  • Read the release notes. Each release ships with notes covering breaking changes, required configuration changes, and any manual steps. Release notes are not published on a public site — request them from your Scrydon account team for the target version before upgrading.
  • Back up your databases. Run pg_dump against every enabled Scrydon database. By default there are five: auth, agentic, analytics, cortex, ontology. Migrations are forward-only, so restoring from a backup is how you return to the pre-upgrade schema and data.
  • Verify your license is still valid. In the platform UI, open Settings → License and confirm the expiry is after the planned upgrade window. The license lives in the platform_config row, not in any init container — there is no license-check container to log against.
  • Note your current chart version. Run helm list -n scrydon-platform so you have a known-good revision to roll back to if needed.
  • Confirm registry credentials still work. helm registry login scrydonops.azurecr.io --username <acr-token-name> should succeed before you start the upgrade.
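The backup step in the checklist could be scripted roughly as follows. This is a minimal sketch, not a Scrydon-provided tool: the function name backup_scrydon_databases is hypothetical, the PostgreSQL database names are assumed to match the five names listed above, and the connection is assumed to come from the standard PGHOST / PGPORT / PGUSER / PGPASSWORD variables (or ~/.pgpass).

```shell
# Dump each default Scrydon database into one backup directory.
# Assumptions (not from the Scrydon docs): the database names match the
# checklist, and the libpq environment variables already point at your server.
backup_scrydon_databases() {
  backup_dir="${1:-scrydon-backup-$(date +%Y%m%d-%H%M%S)}"
  mkdir -p "$backup_dir" || return 1
  for db in auth agentic analytics cortex ontology; do
    # Custom format is compressed and can be restored with pg_restore.
    pg_dump --format=custom --file="$backup_dir/$db.dump" "$db" || {
      echo "pg_dump failed for database: $db" >&2
      return 1
    }
  done
  # Print the directory so callers can record where the dumps landed.
  echo "$backup_dir"
}
```

If you have disabled some services, trim the list in the loop to the databases that actually exist in your deployment.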

Database Migrations

Scrydon runs database migrations as Helm hook Jobs — one per service (auth-migration, agentic-migration, analytics-migration, cortex-migration, api-ontology-migration). They fire as post-install on first install and pre-upgrade on every upgrade, so any schema change runs before the new application pods roll out. You do not need to run migrations manually.

Important: If a migration fails, the corresponding Job is marked Failed with reason BackoffLimitExceeded once its retries are exhausted, and helm upgrade reports the release as failed. The previous application revision keeps serving traffic untouched. Check the Job logs to diagnose:

kubectl get jobs -n scrydon-platform | grep migration
kubectl logs job/<job-name> -n scrydon-platform

Some migrations also include data integrity preconditions — these fail fast with a clear message if your data does not satisfy them, and require operator action before retrying. See Recover From a Failed Pre-Upgrade Hook below.
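To narrow the Job listing down to migrations that have actually failed, something like the following may help. It is a sketch only: failed_migration_jobs is a hypothetical helper name, and it reports any migration Job with at least one failed pod, which can include a Job that is still retrying.

```shell
# Print the names of migration Jobs that have one or more failed pods.
# kubectl prints "<none>" for .status.failed when no pod has failed, so the
# awk filter only accepts numeric values greater than zero.
failed_migration_jobs() {
  ns="${1:-scrydon-platform}"
  kubectl get jobs -n "$ns" --no-headers \
    -o custom-columns=NAME:.metadata.name,FAILED:.status.failed |
    awk '$1 ~ /migration/ && $2 ~ /^[0-9]+$/ && $2 > 0 { print $1 }'
}
```

Each name it prints can be passed to kubectl logs job/<job-name> as shown above.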

Connected Upgrades (Helm)

Discover Available Versions

There is no public list of available chart versions, and your registry credential cannot list tags either: it is a pull-only ACR scoped token (content/read on a fixed set of repositories), so it can helm pull a known tag but cannot enumerate what is published. Both the chart source and the release notes live in a private Scrydon repository.

To find out what to install, ask your Scrydon account team. They will share the current "latest production" tag, any newer pre-release tags, and the matching release notes.

Production tags use the vMAJOR.MINOR.PATCH format (for example v1.3.6). Tags ending in -rc.N, -alpha.N, -beta.N, or -staging.N are pre-release and should not be deployed to production.
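A small guard can catch an accidental pre-release tag before it reaches a production upgrade command. The sketch below encodes only the format rule stated above; the function name is_production_tag is hypothetical, and a tag that passes this check is not guaranteed to exist in the registry.

```shell
# Return 0 when the tag has the production shape vMAJOR.MINOR.PATCH,
# and 1 for anything else (pre-release suffixes, missing leading v, etc.).
is_production_tag() {
  printf '%s\n' "$1" | grep -Eq '^v[0-9]+\.[0-9]+\.[0-9]+$'
}

# Example use in a deployment script:
#   is_production_tag "$TAG" || { echo "refusing non-production tag: $TAG" >&2; exit 1; }
```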

Once you have a tag from your account team, you can sanity-check that it is currently published to ACR by fetching just its Chart.yaml:

helm show chart oci://scrydonops.azurecr.io/scrydon/charts/scrydon --version v1.3.6

A 404 means the tag is not currently in the registry — older production tags age out on a retention policy. Ask your account team to have it republished if you need a tag that no longer resolves.

Upgrade

helm upgrade scrydon oci://scrydonops.azurecr.io/scrydon/charts/scrydon \
  --version <new-version> \
  --values values.customer.yaml \
  -n scrydon-platform \
  --wait --timeout 15m

Helm performs a rolling update. Old pods remain running until new pods are healthy. Use the same values.customer.yaml that was used for the previous install — overrides carry forward.

Tag format: Scrydon chart tags keep the leading v (v1.3.6), unlike most charts published to OCI registries. Pass --version v1.3.6, not 1.3.6.

Roll Back

If the upgrade introduces a problem, roll back to the previous release:

helm rollback scrydon -n scrydon-platform

To roll back to a specific revision:

# List revision history
helm history scrydon -n scrydon-platform

# Roll back to revision N
helm rollback scrydon <revision-number> -n scrydon-platform

Note: Rolling back a Helm release does not undo database migrations. If a migration added columns or tables, they remain after rollback. Scrydon migrations are designed to be backward-compatible, so the rolled-back application version will continue to function.

Recover From a Failed Pre-Upgrade Hook

If the upgrade fails during a pre-upgrade migration (status failed in helm history), the workloads are untouched and the previous revision is still active. To retry:

# 1. Inspect the failed job
kubectl get jobs -n scrydon-platform | grep migration
kubectl logs job/<job-name> -n scrydon-platform

# 2. Resolve the error (typically a data precondition — see the release notes
#    for any data fixes required by this version).

# 3. Delete the failed job so the next upgrade can recreate it.
kubectl delete job <job-name> -n scrydon-platform

# 4. Retry the upgrade.
helm upgrade scrydon oci://scrydonops.azurecr.io/scrydon/charts/scrydon ...
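Steps 3 and 4 can be wrapped into one helper once the underlying error is resolved. This is a sketch under assumptions: the function name retry_upgrade_after_failed_hook is hypothetical, and the retry is assumed to use the same flags and values.customer.yaml as the command in the Upgrade section.

```shell
# Delete a failed migration Job, then rerun the upgrade.
# Usage: retry_upgrade_after_failed_hook <job-name> <version>
retry_upgrade_after_failed_hook() {
  job="$1"
  version="$2"
  if [ -z "$job" ] || [ -z "$version" ]; then
    echo "usage: retry_upgrade_after_failed_hook <job-name> <version>" >&2
    return 2
  fi
  # Remove the failed Job so the next upgrade can recreate it.
  kubectl delete job "$job" -n scrydon-platform --ignore-not-found || return 1
  # Assumed to mirror the flags from the Upgrade section of this page.
  helm upgrade scrydon oci://scrydonops.azurecr.io/scrydon/charts/scrydon \
    --version "$version" \
    --values values.customer.yaml \
    -n scrydon-platform \
    --wait --timeout 15m
}
```

Run it only after step 2 is done; retrying without fixing the data precondition will most likely fail the same way.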

Air-Gapped Upgrades (Zarf)

Receive the New Package

Scrydon ships a new Zarf package for each release:

zarf-package-scrydon-amd64-<new-version>.tar.zst

Transfer the new package to your air-gapped environment via your approved transfer method.

Verify the New Package

zarf package inspect zarf-package-scrydon-amd64-<new-version>.tar.zst \
  --key scrydon-cosign-public-key.pem

sha256sum -c SHA256SUMS
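If SHA256SUMS lists several files and you only transferred one of them, checking just that entry avoids spurious "No such file" failures. The helper below is a sketch: verify_package_checksum is a hypothetical name, and it assumes SHA256SUMS uses the standard sha256sum output format and sits in the same directory as the package.

```shell
# Verify a single package against its entry in a SHA256SUMS file.
# Usage: verify_package_checksum <package-file> [sums-file]
verify_package_checksum() {
  pkg="$1"
  sums="${2:-SHA256SUMS}"
  # Fail early if the sums file has no entry for this package.
  if ! grep -qF -- "$pkg" "$sums"; then
    echo "no checksum entry for $pkg in $sums" >&2
    return 1
  fi
  # Feed only the matching line(s) to sha256sum; "-" reads them from stdin.
  grep -F -- "$pkg" "$sums" | sha256sum -c -
}
```

A non-zero exit status means the package is missing from the list or its contents do not match, and it should not be deployed.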

Deploy Over the Existing Installation

Zarf handles rolling upgrades automatically. Deploy the new package over the existing installation:

zarf package deploy zarf-package-scrydon-amd64-<new-version>.tar.zst --confirm

Zarf will load the new images into the in-cluster registry and update the Helm release. Existing pods are replaced using a rolling update strategy.

Roll Back (Air-Gapped)

To roll back an air-gapped upgrade, redeploy the previous Zarf package:

zarf package deploy zarf-package-scrydon-amd64-<previous-version>.tar.zst --confirm

Keep the previous package archive until the new version is stable.

Post-Upgrade Verification

After any upgrade, verify the deployment is healthy. The default chart layout puts every service in scrydon-platform; change the namespace in the commands below only if you've overridden namespaces.*:

# Check all pods are running (single-namespace default)
kubectl get pods -n scrydon-platform
# If you split namespaces via `namespaces.agentic` / `namespaces.analytics`,
# also check those namespaces.

# Check rollout status of key deployments
kubectl rollout status deployment/api-platform -n scrydon-platform
kubectl rollout status deployment/agentic -n scrydon-platform   # (or scrydon-agentic if split)

# Verify license is still valid — load Settings → License in the platform UI
# (the license lives in the platform_config DB row, not in any init container)

# Confirm the running chart version matches what you intended
helm list -n scrydon-platform

Open https://app.yourdomain.com and verify you can log in and access workflows.
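For a scripted health check, the pod listing can be filtered down to pods that need attention. This is a sketch only: unhealthy_pods is a hypothetical helper, and it treats Completed pods (such as finished migration Jobs) as healthy, which is an assumption about what you want to ignore.

```shell
# Print pods that are not Running, or are Running but not fully ready.
# Output columns: NAME READY STATUS. Completed pods are skipped.
unhealthy_pods() {
  ns="${1:-scrydon-platform}"
  kubectl get pods -n "$ns" --no-headers |
    awk '
      $3 == "Completed" { next }
      { split($2, ready, "/") }
      $3 != "Running" || ready[1] != ready[2] { print $1, $2, $3 }
    '
}
```

Empty output suggests every pod in the namespace is running and ready; if you split namespaces, pass each one as the first argument.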

Breaking Changes

Removed: license.product Helm Value

The license.product Helm value was removed. If your values.customer.yaml contains a license.product key, remove it to avoid confusion — the value has no effect and is silently ignored by Helm.

# Remove the product key from your values file if present:
# license:
#   product: "scrydon-agentic"   # ← remove this

The license now uses a resource-based model (CPU/RAM/VRAM) instead of a product-based model.
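Because Helm ignores the stale key silently, a quick check of the values file can confirm whether it is still there. The sketch below is a line-based heuristic, not a YAML parser: has_license_product is a hypothetical name, and it only recognizes block-style YAML with license: at the top level.

```shell
# Return 0 if the values file has a product key nested under a top-level
# license block, 1 otherwise. Commented-out lines are not counted.
has_license_product() {
  awk '
    /^license:/ { in_license = 1; next }
    /^[^[:space:]#]/ { in_license = 0 }
    in_license && /^[[:space:]]+product:/ { found = 1 }
    END { exit(found ? 0 : 1) }
  ' "$1"
}

# Example:
#   has_license_product values.customer.yaml && echo "remove license.product"
```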
