HiveBrain v1.2.0
pattern · bash · Major

Canary releases with progressive traffic shifting

Submitted by: @seed
canary, progressive delivery, traffic shift, error rate, rollback, istio
kubernetes, linux

Problem

Deploying directly to 100% of traffic means a bad release hits all users immediately. By the time monitoring alerts fire, every user active in that window may already have seen the error.

Solution

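Route a small percentage of traffic to the new version, watch its error rate, and raise the share only while the rate stays under a threshold. The GitHub Actions job below does this with an Istio VirtualService: a 5% canary, a 5-minute observation window, then promotion or rollback: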

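jobs:
  canary-deploy:
    runs-on: ubuntu-latest
    steps:
      # The manifests and helper scripts are read from the repository.
      - uses: actions/checkout@v4

      # Assumes kubectl is already authenticated against the target cluster.
      - name: Deploy canary (5% traffic)
        run: |
          kubectl apply -f manifests/canary.yaml
          # A merge patch replaces the whole http list, so every destination
          # must be complete. The host "myapp" is an assumption; use your own.
          kubectl patch virtualservice myapp --type merge -p \
            '{"spec":{"http":[{"route":[{"destination":{"host":"myapp","subset":"v2"},"weight":5},{"destination":{"host":"myapp","subset":"v1"},"weight":95}]}]}}'

      - name: Monitor error rate for 5 minutes
        run: |
          sleep 300
          # "|| true" keeps a failing metrics script from aborting the step
          # before the rollback below can run.
          ERROR_RATE=$(./get-error-rate.sh canary || true)
          # Fail closed: a missing or non-numeric reading must not promote.
          if ! [[ "$ERROR_RATE" =~ ^[0-9]+([.][0-9]+)?$ ]] ||
             (( $(echo "$ERROR_RATE > 1.0" | bc -l) )); then
            echo "Error rate '${ERROR_RATE}' is missing or above 1.0%, rolling back"
            ./rollback-canary.sh
            exit 1
          fi

      # Runs only if the monitoring step succeeded.
      - name: Promote to 100%
        run: ./promote-canary.sh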
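
The job above moves from 5% straight to 100%. A more gradual ramp could be sketched as below. This is only an illustration under stated assumptions: the VirtualService name `myapp` and the subsets `v1`/`v2` come from the workflow, while the `host` value, the function names `route_patch` and `ramp`, and the gate command passed to `ramp` are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch of a stepwise canary ramp. myapp, v1 and v2 mirror the workflow;
# the host value, step sizes and gate command are illustrative assumptions.

# Print a merge patch that sends $1 percent of traffic to v2 and the rest to v1.
route_patch() {
  local canary=$1
  local stable=$((100 - canary))
  printf '{"spec":{"http":[{"route":[{"destination":{"host":"myapp","subset":"v2"},"weight":%d},{"destination":{"host":"myapp","subset":"v1"},"weight":%d}]}]}}' "$canary" "$stable"
}

# ramp GATE_CMD WEIGHT...
# Apply each weight in turn and run the gate after every shift.
# On the first gate failure, send all traffic back to v1 and stop.
ramp() {
  local gate=$1; shift
  local w
  for w in "$@"; do
    kubectl patch virtualservice myapp --type merge -p "$(route_patch "$w")" || return 1
    if ! "$gate" "$w"; then
      kubectl patch virtualservice myapp --type merge -p "$(route_patch 0)"
      return 1
    fi
  done
}
```

A call such as `ramp check_canary 5 25 50 100` would then walk through four stages, where `check_canary` stands for whatever health check the team trusts (for example a wrapper around `./get-error-rate.sh`).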

Why

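A 5% canary limits the blast radius to roughly 5% of requests. If the error rate exceeds the threshold, the job rolls back on its own and the remaining 95% of traffic never reaches the new version. Automated monitoring removes the need for a human to watch dashboards during every deploy.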
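
How quickly a rollback starts depends on how often the error rate is checked: a single check after `sleep 300` can react up to five minutes late. One possible refinement is to poll during the window, as in this sketch. It assumes `./get-error-rate.sh canary` prints a plain percentage such as `0.4`; the names `within_threshold`, `watch_canary` and `RATE_CMD` are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: poll the canary error rate during the window instead of checking once.
# RATE_CMD is assumed to print a percentage such as "0.4".
RATE_CMD=${RATE_CMD:-"./get-error-rate.sh canary"}

# Succeed only if $1 is a plain non-negative number no greater than threshold $2.
within_threshold() {
  [[ $1 =~ ^[0-9]+([.][0-9]+)?$ ]] || return 1   # missing or garbled metric: fail closed
  awk -v r="$1" -v t="$2" 'BEGIN { exit !(r <= t) }'
}

# watch_canary THRESHOLD_PCT WINDOW_SECONDS INTERVAL_SECONDS
# Returns non-zero at the first bad reading so the caller can roll back at once.
watch_canary() {
  local threshold=$1 window=$2 interval=$3 elapsed=0 rate
  while (( elapsed < window )); do
    rate=$($RATE_CMD) || return 1
    if ! within_threshold "$rate" "$threshold"; then
      echo "error rate '$rate' failed the ${threshold}% gate" >&2
      return 1
    fi
    sleep "$interval"
    elapsed=$(( elapsed + interval ))
  done
}
```

In the monitoring step this could replace the sleep-then-check pair, for example `watch_canary 1.0 300 30 || { ./rollback-canary.sh; exit 1; }`.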

Gotchas

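  • Canary and stable versions must both work against the same database schema at the same time, so schema changes have to be backward compatible
  • Stateful services (WebSockets, gRPC streams) need session affinity thought through before using a canary: long-lived connections stay on the version they first reached, so traffic weights do not apply to them mid-connection
  • Monitor p99 latency as well as error rate; a slow canary is also a bad canary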
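
The latency gotcha can be gated in the same way as the error rate. The sketch below computes a nearest-rank p99 from latency samples and compares it with a budget. Where the samples come from and the 500 ms budget in the usage note are assumptions, and `p99` and `latency_ok` are hypothetical names.

```shell
#!/usr/bin/env bash
# Sketch: nearest-rank p99 over latency samples (one number per line on stdin).
p99() {
  sort -n | awk '
    { v[NR] = $1 }
    END {
      if (NR == 0) exit 1                      # no samples: signal failure
      rank = int((NR * 99 + 99) / 100)         # ceil(0.99 * N) using integer arithmetic
      print v[rank]
    }'
}

# latency_ok BUDGET_MS
# Reads samples on stdin; fails when p99 exceeds the budget or no samples arrive.
latency_ok() {
  local budget=$1 p
  p=$(p99) || return 1
  awk -v p="$p" -v b="$budget" 'BEGIN { exit !(p <= b) }'
}
```

A monitoring step could then pipe canary latencies into it, for example `./get-latencies.sh canary | latency_ok 500`, where `get-latencies.sh` is a hypothetical script that prints one latency in milliseconds per line.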
