TLS Offloading

Configure Traefik behind a TLS-terminating load balancer (App Gateway / ALB / GCP LB), keep the ingress private, and troubleshoot 502 Bad Gateway.

These settings live on your Traefik release, not in the Scrydon chart. The Scrydon chart never provisions an ingress controller; it only emits the Ingress objects your controller routes.

Behind a TLS-terminating load balancer

If an upstream load balancer terminates TLS and forwards plain HTTP to Traefik — Azure Application Gateway, AWS ALB, GCP HTTP(S) LB, F5 — Traefik must be told to trust that load balancer's source IP range. Until you do, three symptoms appear:

HTTP → HTTPS redirect loops. Traefik sees plain HTTP and redirects to HTTPS; the LB re-terminates and forwards HTTP again. The browser spins.
Wrong client IP. Per-IP rate limits and audit logs record the LB's IP instead of the real client.
Login appears to work, then the session cookie vanishes on the next click. Better-Auth reads x-forwarded-proto to compute its public scheme. With Traefik rewriting it to http, the secure cookie is dropped or the cross-origin POST is rejected — no redirect loop, no 5xx, just a silent "I'm logged out".

Two things must be true:

Traefik must see the LB's real source IP at the TCP layer. With the default externalTrafficPolicy: Cluster, kube-proxy can SNAT the connection to a node IP before Traefik receives it. Set externalTrafficPolicy: Local to preserve the upstream source IP.
The entrypoints must list the LB's source CIDR as trusted, so Traefik honours the X-Forwarded-* headers it sends.

# traefik-values.yaml — your Traefik release, not the Scrydon chart.
service:
  spec:
    externalTrafficPolicy: Local

ports:
  web:
    forwardedHeaders:
      trustedIPs:
        - 10.0.1.0/24   # the subnet the LB forwards FROM (App Gateway, firewall, …)
  websecure:
    forwardedHeaders:
      trustedIPs:
        - 10.0.1.0/24

helm upgrade --install traefik traefik/traefik \
  --namespace traefik --create-namespace \
  -f traefik-values.yaml

trustedIPs is a list — use a values file, not flag-style --set. Inline --set works for a single value but is brittle when chained with --reuse-values on follow-up upgrades.

This is complementary to the Scrydon chart's force-https middleware (ingress.middleware.forceHttps, enabled by default). The middleware runs after Traefik's entrypoint logic — it does not on its own stop the redirect loop or restore the client IP.
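
To confirm that both settings actually reached the cluster after the upgrade, a check along these lines may help. It is a sketch, not part of either chart: it assumes the release created a Service and a Deployment both named traefik in the traefik namespace (a DaemonSet install would need a different kind), and that the chart renders trustedIPs as forwardedHeaders.trustedIPs container arguments. The function names are hypothetical.

```shell
# Print the Service's externalTrafficPolicy; "Local" is the value this page expects.
traefik_traffic_policy() {
  kubectl -n traefik get svc traefik \
    -o jsonpath='{.spec.externalTrafficPolicy}'
}

# Print the container arguments that carry the trusted CIDRs, one per line.
# Assumption: a Deployment named "traefik"; adjust the kind/name to your release.
traefik_trusted_ips_args() {
  kubectl -n traefik get deploy traefik \
    -o jsonpath='{range .spec.template.spec.containers[0].args[*]}{@}{"\n"}{end}' \
    | grep -i 'forwardedheaders.trustedips'
}
```

Running traefik_traffic_policy should print Local, and traefik_trusted_ips_args should print one line per entrypoint that lists the LB subnet. No output from the second function suggests the values file was not applied.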

Diagnostic shortcut

If you suspect this is the problem but don't know the right CIDR, temporarily trust everything:

helm upgrade traefik traefik/traefik -n traefik --reuse-values \
  --set "ports.web.forwardedHeaders.insecure=true" \
  --set "ports.websecure.forwardedHeaders.insecure=true"

If the symptom disappears, read the LB's source IP from Traefik's access logs (kubectl logs -n traefik -l app.kubernetes.io/name=traefik | rg ClientAddr; access logging has to be enabled for this, and Traefik leaves it off by default), translate that IP to its subnet CIDR, set the CIDR as trustedIPs, and turn insecure back off. Leaving insecure: true in production lets any workload spoof X-Forwarded-* headers.
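
When the logs are busy, it can be easier to summarise the client addresses than to read them line by line. The sketch below assumes Traefik access logs are enabled in JSON format, where each line carries a "ClientAddr":"ip:port" field; top_client_hosts is a hypothetical helper name.

```shell
# Hypothetical helper: read Traefik JSON access-log lines on stdin and print
# each distinct client host with its request count, most frequent first.
top_client_hosts() {
  grep -o '"ClientAddr":"[^"]*"' \
    | sed -E -e 's/^"ClientAddr":"//' -e 's/"$//' -e 's/:[0-9]+$//' \
    | sort | uniq -c | sort -rn
}

# Usage (needs a cluster):
#   kubectl logs -n traefik -l app.kubernetes.io/name=traefik | top_client_hosts
```

Behind a load balancer, the most frequent hosts are normally the LB's own instances, and their shared subnet is the CIDR to put in trustedIPs.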

`ingress.tls.enabled` — public scheme, not backend protocol

This trips everyone up:

ingress.tls.enabled is the "is Scrydon reached over HTTPS by the browser?" switch — it is not a "does Traefik terminate TLS" toggle.

The chart uses it to derive the scheme of every URL the apps advertise (APP_URL, Better-Auth callback URL, CORS allow-list) and to drive secure-cookie behaviour. false → apps emit http:// URLs; true → https://.

Behind an App Gateway serving the site over HTTPS, set ingress.tls.enabled: true even when the LB forwards plain HTTP to the cluster.

ingress:
  enabled: true
  tls:
    enabled: true                       # browser uses https → set true regardless of backend hop
    clusterIssuer: letsencrypt-prod
routing:
  mode: subpath
  host: app.example.com                 # must equal the App Gateway listener hostname

If the browser uses https:// but tls.enabled: false, the chart advertises http://. Browsers reject the secure cookies and Better-Auth bounces sign-in callbacks as origin mismatches. The UI loads but you can never complete login.

What the backend protocol does decide is whether Traefik itself needs a real certificate:

| App Gateway → Traefik | Does Traefik need a cert? |
| --- | --- |
| HTTP / port 80 (TLS offload, most common) | No. App Gateway holds the only certificate. |
| HTTPS / port 443 (re-encrypt / end-to-end) | Yes. Traefik must terminate backend TLS, so it needs a usable cert in `tls-frontdoor` (subpath mode) or per-app secrets (subdomain mode). |

For Let's Encrypt, internal ACME, and static corporate certificate options for the Traefik-facing certificate, see TLS Certificates.

cert-manager with an HTTP-01 challenge behind an App Gateway: the /.well-known/acme-challenge/… path usually isn't publicly reachable, so issuance fails. For the HTTP/80 offload case this is harmless (Scrydon only needs the scheme, not a real cert). For HTTPS/443 re-encrypt, pick one of three options: switch to DNS-01, pre-create the TLS secret with your own cert, or rely on App Gateway not pinning a backend root.

Keep the ingress private

The Scrydon chart never provisions a public IP. All of its Services are ClusterIP. The external IP comes from the Traefik controller's Service (type: LoadBalancer by default).

To give Traefik a private VNet/VPC address, annotate its Service:

# traefik-values.yaml
service:
  annotations:
    # AKS
    service.beta.kubernetes.io/azure-load-balancer-internal: "true"
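    # AWS EKS (legacy annotation; if the AWS Load Balancer Controller manages
    # this Service, its docs prefer service.beta.kubernetes.io/aws-load-balancer-scheme: "internal")
    # service.beta.kubernetes.io/aws-load-balancer-internal: "true"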
    # GKE
    # networking.gke.io/load-balancer-type: "Internal"
  spec:
    externalTrafficPolicy: Local

ports:
  web:
    forwardedHeaders:
      trustedIPs: [10.0.1.0/24]
  websecure:
    forwardedHeaders:
      trustedIPs: [10.0.1.0/24]

helm upgrade --install traefik traefik/traefik \
  --namespace traefik --create-namespace \
  -f traefik-values.yaml

kubectl -n traefik get svc traefik
# EXTERNAL-IP should be a VNet/VPC private address (e.g. 10.0.x.x).

Then re-point the upstream LB's backend pool at that private Traefik IP.
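
Before re-pointing the backend pool, it may be worth confirming that the assigned address really is private. The following is a minimal sketch with hypothetical helper names. It only recognises RFC 1918 IPv4 ranges, and it assumes the provider publishes an IP on the Service (some, AWS for example, publish a hostname instead).

```shell
# Return success when the argument is an RFC 1918 private IPv4 address.
is_private_ipv4() {
  case "$1" in
    10.*|192.168.*) return 0 ;;
    172.1[6-9].*|172.2[0-9].*|172.3[01].*) return 0 ;;
    *) return 1 ;;
  esac
}

# Look up the address the cloud assigned to the Traefik Service.
# Prints nothing while the address is pending or when only a hostname is set.
traefik_lb_ip() {
  kubectl -n traefik get svc traefik \
    -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
}

# Usage (needs a cluster):
#   is_private_ipv4 "$(traefik_lb_ip)" && echo "private" || echo "not private, or no IP yet"
```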

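externalTrafficPolicy: Local makes the cloud LB health-probe each node and forward only to nodes that run a Traefik pod. Run at least 2 Traefik replicas spread across nodes; otherwise a single node drain can black-hole ingress. Verify this on your release rather than assuming it: the upstream Traefik chart has defaulted to a single replica (deployment.replicas: 1), so the replica count and any spread constraint usually have to be set explicitly.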
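
One way to see how many nodes can currently receive traffic is to count the distinct nodes that host a Traefik pod. This sketch reuses the label selector from the log command earlier on this page; traefik_node_count is a hypothetical name, and the selector is an assumption about how your release labels its pods.

```shell
# Count the distinct nodes that currently run a Traefik pod.
# Pods that are not yet scheduled have an empty nodeName and are ignored.
traefik_node_count() {
  kubectl -n traefik get pods -l app.kubernetes.io/name=traefik \
    -o jsonpath='{range .items[*]}{.spec.nodeName}{"\n"}{end}' \
    | sort -u | grep -c .
}
```

A result below 2 means that draining a single node can leave the LB with no healthy target.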

Troubleshooting 502 Bad Gateway

A 502 from the upstream LB means its backend pool is unhealthy — the request never got a good response from Traefik. trustedIPs changes how Traefik interprets requests; it doesn't make the backend reachable.

Isolate the failing hop from the pod outward:

# 1. Is the Scrydon backend healthy?
kubectl -n scrydon-platform get pods          # all Running, 2/2 with Dapr sidecar?

# 2. Does Traefik route the request? Send the real Host header to Traefik directly.
curl -i -H "Host: app.example.com" http://<traefik-ip>/platform
#    200/3xx → Traefik fine; look upstream at the LB.
#    404     → Host header doesn't match `routing.host`.
#    502/503 → Traefik has no healthy backend → back to step 1.
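
The status-code reading in step 2 can be wrapped in a small helper so the same probe is easy to repeat from different hosts. This is a sketch with hypothetical function names; the /platform default path is taken from the curl example above, and the handling of 000 (curl's code when no HTTP response arrives) is an addition for illustration.

```shell
# Map the HTTP status returned by Traefik to the next step to take.
explain_traefik_status() {
  case "$1" in
    2??|3??)  echo "Traefik routes the request; look upstream at the LB" ;;
    404)      echo "Host header does not match routing.host" ;;
    502|503)  echo "Traefik has no healthy backend; check the Scrydon pods" ;;
    000)      echo "no HTTP response; Traefik is unreachable from here" ;;
    *)        echo "unexpected status $1; inspect the Traefik logs" ;;
  esac
}

# Probe Traefik directly with the real Host header and explain the result.
# Arguments: <traefik-ip> <host> [path]; the path defaults to /platform.
probe_traefik() {
  code=$(curl -s -o /dev/null -m 10 -w '%{http_code}' \
    -H "Host: $2" "http://$1${3:-/platform}")
  echo "HTTP $code: $(explain_traefik_status "$code")"
}

# Usage: probe_traefik <traefik-ip> app.example.com
```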

Usual culprits:

Backend pool points at a stale Traefik IP — e.g. you flipped Traefik public → internal and the LB still targets the old public IP.
Backend HTTP setting uses HTTPS or the wrong port. Traefik receives HTTP on :80 (web); LB backend setting must be HTTP / port 80.
Health probe sees a redirect or Host mismatch. Point at a path Scrydon serves, send routing.host as Host, accept 200–399. If the probe still gets 301 → https, Traefik is rewriting X-Forwarded-Proto — set externalTrafficPolicy: Local and verify the trustedIPs CIDR is the LB's subnet (not the client's).
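
To check the last point, you can test whether the source address Traefik logs for the probe actually falls inside the CIDR you configured. The helpers below are a minimal IPv4-only sketch with hypothetical names; they do no input validation and assume octets without leading zeros.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
# The function body runs in a subshell so the IFS change stays local.
ip_to_int() (
  IFS=.
  set -- $1
  echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
)

# Succeed when IPv4 address $1 falls inside CIDR $2 (for example 10.0.1.0/24).
ip_in_cidr() (
  net=${2%/*}
  bits=${2#*/}
  # Netmask with the top $bits bits set, truncated to 32 bits.
  mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
  [ $(( $(ip_to_int "$1") & mask )) -eq $(( $(ip_to_int "$net") & mask )) ]
)

# Usage:
#   ip_in_cidr 10.0.1.37 10.0.1.0/24 && echo "covered by trustedIPs"
```

If the address from the access log is not covered, Traefik treats the LB as untrusted and overwrites the X-Forwarded-* headers it sends.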