
The Story of Shipping Code — A CI/CD Tale


Chapter 1: The Dark Ages (Before CI/CD)

Meet Arjun. He’s a backend engineer at a startup. It’s 2015.

Every Friday, the team manually deploys. Arjun has been working on a feature for 3 weeks. So has Priya. So has Rahul. None of them have merged their code into main yet.

Friday arrives. They all merge at once.

Chaos.

Arjun’s code conflicts with Priya’s. Rahul’s service expects an API that no longer exists. The deployment breaks production. They spend the entire weekend debugging. The users are angry. The CEO is pinging on Slack.

This was called Integration Hell — and it was normal.

The lesson born from pain: Integrate often, not all at once. Test automatically, not manually.

This pain is exactly why CI/CD was invented.


Chapter 2: The First Fix — Continuous Integration (CI)

The team makes a rule: every time you push code, it must automatically build and test.

No more “works on my machine.” No more Friday merges.

Here’s what happens now when Arjun pushes a commit:

Arjun pushes code
      ↓
Pipeline triggers automatically
      ↓
Step 1: LINT   — "Does the code even look right?"
Step 2: BUILD  — "Does it compile/bundle without errors?"
Step 3: TEST   — "Do all the unit tests pass?"
      ↓
✅ Green? Arjun's branch is safe to merge.
❌ Red?  Arjun gets notified immediately — while the context is still fresh in his head.

The key insight: the earlier you catch a bug, the cheaper it is to fix.

A bug caught at commit time = 5 minutes to fix. A bug caught in production = 5 hours, an incident, and a postmortem.

This is Continuous Integration (CI) — the practice of merging and testing code frequently and automatically.


Chapter 3: The Pipeline Gets a Body — The Stages

Now that the team trusts CI, they flesh out the pipeline. Each stage is a gate — if it fails, nothing beyond it runs.

Think of it like airport security checkpoints. You don’t reach the gate unless you pass each one.

COMMIT → [Lint] → [Build] → [Unit Tests] → [Integration Tests] → [Security Scan] → [Package] → ARTIFACT

Lint — The spelling checker. Catches syntax errors, code style violations. Fast. Takes 10 seconds. Saves everyone from reviewing ugly code.

Build — Prove it compiles. For a Node app, bundle it. For a Java app, compile it. For a Docker app, build the image. If this fails, nothing else matters.

Unit Tests — Test in isolation. Does calculateTax(100) return 18? These tests are fast because they don’t talk to databases or external services. They test one function at a time.

Integration Tests — Test together. Does the /checkout endpoint correctly talk to the payments service AND update the database? These are slower but catch the bugs that unit tests miss.

Security Scan — Don’t ship known vulnerabilities. Tools like Trivy or Snyk scan your dependencies. If you’re using a library with a known CVE (Common Vulnerabilities and Exposures entry), the pipeline fails here. Shipping a known vulnerability is embarrassing — and avoidable.

Artifact — The packaged output. The result of all this work: a Docker image, a .jar file, a .zip. This is the exact thing that will be deployed. It gets tagged and stored in a registry.

Key idea: The artifact is built once and deployed to every environment. You don’t rebuild for staging vs. production. Same artifact, different configs.
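That stage-as-a-gate behavior can be sketched in a few lines of Python (purely an illustration, not a real CI engine): stages run in order, and the first failure stops everything downstream.

```python
# Illustrative sketch of pipeline gates: each stage is a function that
# returns True on success; the first failure closes the gate.
def run_pipeline(stages):
    """stages: list of (name, fn) pairs, run in order."""
    for name, fn in stages:
        if not fn():
            return f"FAILED at {name}"   # gate closed: later stages never run
    return "ARTIFACT ready"

result = run_pipeline([
    ("lint", lambda: True),
    ("build", lambda: True),
    ("unit-tests", lambda: False),       # a failing test here...
    ("security-scan", lambda: True),     # ...means this never executes
])
print(result)  # FAILED at unit-tests
```

The ordering matters: cheap, fast checks (lint) run first so that expensive ones (integration tests, scans) only run on code that already passed the basics.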


Chapter 4: Tagging — Knowing What You Shipped

Arjun’s team makes another smart decision: tag every artifact with the git commit SHA.

my-app:a3f9c12   ← this image came from commit a3f9c12
my-app:latest    ← dangerous! which commit is this, exactly?

Why does this matter? Because when something breaks in production at 2 AM, Arjun needs to know exactly what changed. With SHA tagging, he can look up the commit, see the diff, and understand what went wrong in minutes — not hours.

latest is a lie. It changes every deployment. SHA tags are the truth.
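A tiny, hypothetical helper shows the idea (the function name and short-SHA length are illustrative; real pipelines usually take the SHA from git rev-parse or a built-in CI variable):

```python
# Hypothetical helper: derive an immutable image tag from a commit SHA.
def image_tag(repo: str, commit_sha: str) -> str:
    short = commit_sha[:7]        # conventional 7-character short SHA
    return f"{repo}:{short}"

print(image_tag("my-app", "a3f9c12b8e4d"))  # my-app:a3f9c12
```

Because the tag is derived from the commit, the same commit always produces the same tag, and no tag is ever silently overwritten the way latest is.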


Chapter 5: Secrets — The Thing People Get Wrong

The pipeline needs to deploy. Deploying needs credentials — API keys, database passwords, cloud tokens.

A junior engineer on the team (not Arjun) does this:

# 🚨 NEVER DO THIS
- run: ./deploy.sh
  env:
    DB_PASSWORD: "supersecret123"   # committed to Git. Forever. In history.

Once a secret is committed to Git, it’s compromised — even if you delete it later. Git history is forever.

The right way: Store secrets in the CI platform’s secret vault. Reference them by name.

# ✅ Correct
- run: ./deploy.sh
  env:
    DB_PASSWORD: ${{ secrets.DB_PASSWORD }}  # Stored in GitHub Settings → Secrets

The pipeline injects the value at runtime. The secret never touches the codebase.

For production apps, the team graduates to dedicated tools:

  • AWS Secrets Manager — if they’re on AWS
  • HashiCorp Vault — for complex, multi-environment setups
  • Kubernetes Secrets + External Secrets Operator — for K8s workloads

Rule: Secrets are injected at runtime. They are never stored in code, never logged, never printed.
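A minimal Python sketch of that rule: the application reads its secrets from the environment at startup and refuses to run without them. The variable names here are made up; the injection step is whatever your CI platform or secret manager actually does.

```python
import os

# Sketch of the runtime-injection rule: the app reads secrets from its
# environment and refuses to start without them. Nothing is hardcoded,
# and the secret value itself is never printed or logged.
def get_secret(name: str) -> str:
    value = os.environ.get(name)
    if value is None:
        raise RuntimeError(f"missing required secret: {name}")
    return value

os.environ["DB_PASSWORD"] = "injected-by-ci"   # in real life, the CI platform does this
password = get_secret("DB_PASSWORD")
print("got DB_PASSWORD:", "yes" if password else "no")  # report presence, never the value
```

Failing fast on a missing secret is deliberate: a loud crash at startup beats a half-working app that quietly fails on its first database call.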


Chapter 6: Continuous Delivery (CD) — Automating the Deploy

The CI pipeline is humming. Every commit is tested. Every artifact is tagged.

Now Arjun’s team adds the next step: automatically deploy to staging after every green build on main.

Code merged to main
      ↓
CI pipeline passes ✅
      ↓
Artifact built and pushed to registry
      ↓
Auto-deploy to STAGING environment
      ↓
QA team tests it
      ↓
Someone clicks "Deploy to Production" ← human approval here

This is Continuous Delivery — the pipeline delivers the code to a point where it’s ready to go to production. A human still decides when.

One step further: Continuous Deployment removes even the human approval. Every green build goes straight to production. This requires extremely high test coverage and confidence. Companies like GitHub and Netflix do this.


Chapter 7: Environments — The Road to Production

Code travels through environments like levels in a game.

[DEV] → [STAGING] → [PRODUCTION]

DEV — Every commit auto-deploys here. It might be broken. That’s okay. It’s where things are tried.

STAGING — Mirrors production as closely as possible. Same infra, same configs, real data (anonymized). This is where QA lives. If it works here, it’ll work in production.

PRODUCTION — Real users. Real money. Requires approval. Treat it with respect.

The same Docker image travels through all three environments. The code doesn’t change — only the environment variables do (different DB URLs, different API keys, etc.).
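As a sketch (all URLs and values here are invented), per-environment config can be as simple as a lookup keyed by environment name, while the code that uses it never changes:

```python
# Illustrative config-per-environment lookup: the artifact is identical
# in every environment; only these values differ.
CONFIGS = {
    "dev":        {"db_url": "postgres://dev-db/app",  "log_level": "debug"},
    "staging":    {"db_url": "postgres://stg-db/app",  "log_level": "info"},
    "production": {"db_url": "postgres://prod-db/app", "log_level": "warn"},
}

def config_for(env: str) -> dict:
    return CONFIGS[env]   # unknown environment raises KeyError, loudly

print(config_for("staging")["db_url"])  # postgres://stg-db/app
```

In practice the values come from environment variables or a config service rather than a dict in the code, but the principle is the same: promote the artifact, swap the config.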


Chapter 8: Deployment Strategies — How You Replace the Old With the New

Arjun’s team is now deploying to Kubernetes. They have users. They can’t just shut down the old version and start the new one — that’s downtime.

They need a strategy.

The Rolling Update (The Default)

Kubernetes replaces old pods one by one with new ones.

Before:  [v1] [v1] [v1] [v1]
During:  [v2] [v1] [v1] [v1]  → [v2] [v2] [v1] [v1]  → ...
After:   [v2] [v2] [v2] [v2]

Zero downtime. But briefly, v1 and v2 are running simultaneously. Your app must handle this gracefully (backward-compatible API changes, no breaking DB migrations mid-deploy).
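A toy simulation makes the "both versions briefly coexist" point concrete: pods are replaced one at a time, so capacity is always serving, and every intermediate step mixes v1 and v2.

```python
# Toy simulation of a rolling update: replace one pod per step, so some
# capacity is always serving and both versions briefly run side by side.
def rolling_update(pods, new_version):
    states = []
    for i in range(len(pods)):
        pods = pods[:i] + [new_version] + pods[i + 1:]
        states.append(list(pods))
    return states

for step in rolling_update(["v1", "v1", "v1", "v1"], "v2"):
    print(step)
# ['v2', 'v1', 'v1', 'v1'] ... ['v2', 'v2', 'v2', 'v2']
```

Real Kubernetes rolling updates are more sophisticated (surge capacity, readiness probes before old pods are killed), but the mixed-version window is exactly why backward compatibility matters.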

Blue/Green (The Safety Net)

Spin up an entirely new environment (green) with the new version. Test it. Then switch the load balancer to point all traffic to green.

              ┌──────────┐
Users ──────▶ │ LOAD     │──▶ BLUE (v1) ← currently live
              │ BALANCER │
              └──────────┘
                          ──▶ GREEN (v2) ← ready, being tested

// After verification:
Users ──────▶ Load Balancer ──▶ GREEN (v2) ← now live
                                BLUE (v1) ← kept for instant rollback

Rollback is instant — just flip the load balancer back. The cost: two full environments running at the same time.

Canary (The Cautious One)

Send 5% of real traffic to the new version. Watch the metrics. If error rates stay low, gradually increase to 10%, 25%, 50%, 100%.

              ┌──────────┐
Users ──────▶ │ LOAD     │──▶ v1 (95% of traffic)
              │ BALANCER │──▶ v2 (5% of traffic)  ← canary
              └──────────┘

If something breaks, only 5% of users were affected. Roll back the canary. Investigate.

Best for: risky changes, high-traffic services where even 1% error spike is significant.

Tools: Argo Rollouts, Flagger, Istio
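One common way to implement the split (a sketch, not how any particular tool does it) is deterministic bucketing: hash the user ID into a bucket from 0 to 99, so the same user always lands on the same side while the percentage grows.

```python
import hashlib

# Sketch of deterministic canary routing: the same user always hashes to
# the same bucket, so their experience stays stable as the rollout grows.
def route(user_id: str, canary_percent: int) -> str:
    digest = hashlib.sha256(user_id.encode()).digest()
    bucket = digest[0] * 100 // 256          # stable bucket in 0..99
    return "v2-canary" if bucket < canary_percent else "v1-stable"

v2 = sum(route(f"user-{i}", 5) == "v2-canary" for i in range(10_000))
print(f"{v2 / 100:.1f}% of users hit the canary")   # roughly 5%
```

Raising the canary percentage only moves the threshold, so users already on v2 stay on v2; nobody flip-flops between versions mid-session.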

Feature Flags (The Invisible Door)

Deploy the code to everyone. But hide the feature behind a flag.

if (featureFlags.isEnabled('new-checkout-flow', user)) {
  return newCheckout();
} else {
  return oldCheckout();
}

The code is deployed. The feature is off. You turn it on for 1% of users, then 10%, then everyone. No redeployment needed.

This decouples deployment from release — one of the most powerful ideas in modern software delivery.

Tools: LaunchDarkly, Unleash, Flagsmith
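Under the hood, a flag service along these lines might combine the on/off check with a percentage rollout. This Python sketch is hypothetical (real tools add targeting rules, audit logs, and more), but it shows why turning the dial needs no redeploy:

```python
import hashlib

# Hypothetical flag store: feature name → percent of users enabled.
FLAGS = {"new-checkout-flow": 10}

def is_enabled(flag: str, user_id: str) -> bool:
    percent = FLAGS.get(flag, 0)                       # unknown flag → off
    key = f"{flag}:{user_id}".encode()                 # per-flag bucketing
    bucket = hashlib.sha256(key).digest()[0] * 100 // 256
    return bucket < percent

# Turning the dial is a data change, not a deploy:
FLAGS["new-checkout-flow"] = 100
print(is_enabled("new-checkout-flow", "arjun"))   # True — everyone is in
```

Hashing on flag name plus user ID means the 10% of users who see one experiment aren’t automatically the same 10% who see the next one.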


Chapter 9: When Things Go Wrong — Rollback

It’s 11 PM. Arjun deployed 20 minutes ago. Error rates spiked. Users are complaining.

He doesn’t panic. He knows exactly what to do.

# See the deployment history
kubectl rollout history deployment/my-app

# Roll back to the previous version — one command
kubectl rollout undo deployment/my-app

# Watch it happen
kubectl rollout status deployment/my-app --watch

# Verify which image is now running
kubectl get deployment my-app -o jsonpath='{.spec.template.spec.containers[0].image}'

Within 2 minutes, the old version is live. Error rates drop. Users stop complaining.

Arjun investigates the next morning with a clear head.

The mindset shift: The goal is not to never fail. It’s to recover fast. MTTR (Mean Time to Recovery) is more important than a perfect deploy record.


Chapter 10: GitOps — Git as the Source of Truth

Arjun’s team grows. Now there are multiple engineers deploying to Kubernetes. Sometimes someone runs kubectl apply directly on the cluster. The cluster drifts from what’s in Git. Nobody knows what’s actually running.

They adopt GitOps.

The rule: nothing is applied to the cluster unless it goes through Git first.

Engineer opens PR to change K8s manifest
      ↓
PR is reviewed, merged to main
      ↓
ArgoCD (watching the repo) detects the change
      ↓
ArgoCD applies it to the cluster automatically
      ↓
If someone manually changes the cluster, ArgoCD reverts it

Every deployment is now a Git commit. Full audit trail. Easy rollback (just revert the commit). No one needs direct cluster access for deployments.

Tools: ArgoCD, Flux
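The heart of a GitOps controller is a reconcile loop: compare the desired state (what’s in Git) with the actual state (what’s on the cluster) and converge the cluster toward Git. A toy version in Python:

```python
# Toy reconcile loop in the spirit of a GitOps controller: desired state
# comes from Git; any manual drift on the cluster is reverted to match it.
def reconcile(desired: dict, cluster: dict) -> list:
    actions = []
    for name, manifest in desired.items():
        if cluster.get(name) != manifest:
            cluster[name] = manifest          # apply new config / revert drift
            actions.append(f"synced {name}")
    for name in list(cluster):
        if name not in desired:
            del cluster[name]                 # not in Git → not in the cluster
            actions.append(f"pruned {name}")
    return actions

cluster = {"my-app": {"image": "my-app:a3f9c12", "replicas": 5}}  # someone scaled by hand
desired = {"my-app": {"image": "my-app:a3f9c12", "replicas": 3}}  # what Git says
print(reconcile(desired, cluster))    # ['synced my-app']
print(cluster["my-app"]["replicas"])  # 3 — drift reverted
```

Real controllers like ArgoCD do this continuously against live Kubernetes objects, but the shape is the same: Git is the input, the cluster is the output, and drift never survives.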


Chapter 11: Measuring Success — DORA Metrics

Arjun’s company is growing. The CTO asks: “How healthy is our engineering delivery?”

They use the DORA metrics — four numbers that define elite software teams:

Metric                 | What it asks                                 | Elite benchmark
Deployment Frequency   | How often do you deploy to prod?             | Multiple times per day
Lead Time for Changes  | Commit to production: how long?              | Less than 1 hour
Change Failure Rate    | What % of deploys cause an incident?         | Less than 15%
MTTR                   | When something breaks, how fast is recovery? | Less than 1 hour

These aren’t vanity metrics. They correlate directly with business outcomes — faster delivery, happier users, lower operational cost.

If your pipeline is slow, lead time suffers. If your tests are weak, change failure rate climbs. If you have no rollback plan, MTTR is terrible.

CI/CD, done well, moves all four of these in the right direction.
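Given a log of deployments, the four numbers fall out of simple arithmetic. The data below is invented for illustration:

```python
from datetime import timedelta

# Back-of-the-envelope DORA calculation over one (made-up) week of deploys.
deploys = [
    {"lead_time": timedelta(minutes=45), "caused_incident": False},
    {"lead_time": timedelta(minutes=30), "caused_incident": False},
    {"lead_time": timedelta(hours=2),    "caused_incident": True},
    {"lead_time": timedelta(minutes=50), "caused_incident": False},
]
incident_recoveries = [timedelta(minutes=12)]   # one incident, one recovery

frequency = len(deploys) / 7                                      # deploys per day
avg_lead = sum((d["lead_time"] for d in deploys), timedelta()) / len(deploys)
failure_rate = sum(d["caused_incident"] for d in deploys) / len(deploys)
mttr = sum(incident_recoveries, timedelta()) / len(incident_recoveries)

print(f"frequency: {frequency:.2f}/day, failure rate: {failure_rate:.0%}, MTTR: {mttr}")
```

The value isn’t in the arithmetic; it’s in collecting the data automatically (from the pipeline and incident tracker) so the numbers are honest rather than estimated.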


Epilogue: The Full Picture

Here’s what Arjun’s pipeline looks like today:

[1] Developer pushes code to feature branch
        ↓
[2] PR opened → CI pipeline runs: Lint + Build + Unit Tests
        ↓
[3] PR merged to main
        ↓
[4] Full pipeline runs: + Integration Tests + Security Scan
        ↓
[5] Docker image built, tagged with git SHA, pushed to registry
        ↓
[6] Auto-deploy to DEV (GitOps, ArgoCD applies manifest)
        ↓
[7] Auto-deploy to STAGING
        ↓
[8] QA signs off
        ↓
[9] PR merged to release branch → triggers production deploy
        ↓
[10] Canary: 5% traffic to new version, monitor for 10 min
        ↓
[11] Full rollout → 100% traffic to new version
        ↓
[12] Old version kept warm for 30 min in case of rollback

From a single git push to production — automated, safe, auditable, reversible.

This is what CI/CD looks like when it’s done right.


“The goal is not to ship faster. The goal is to ship safely — and recover quickly when something breaks.”


Interview Room: What They’ll Actually Ask Arjun

Arjun walks into the interview. He’s lived this story. Here’s how he answers.


Fundamentals

Q: What’s the difference between CI, Continuous Delivery, and Continuous Deployment?

CI = automatically build and test every commit. Continuous Delivery = every passing build is ready to deploy, but a human approves the final push. Continuous Deployment = every passing build goes to production automatically, no human needed. Most companies do Continuous Delivery — full Continuous Deployment requires very high test confidence.


Q: Why should you integrate code frequently instead of once a week?

The longer branches live, the more they diverge. Merging weeks of work at once creates conflicts that are hard to debug because too many things changed at once. Frequent integration means smaller diffs, easier reviews, and bugs caught while the context is still fresh. This is how you avoid Integration Hell.


Q: What happens in each stage of a CI pipeline?

Lint catches syntax/style issues cheaply. Build verifies it compiles. Unit tests check individual functions in isolation — fast, no external dependencies. Integration tests check that components work together. Security scan checks for known CVEs in dependencies. Finally, the artifact is packaged and tagged. Each stage is a gate — failure stops everything downstream.


Deployments & Strategies

Q: Walk me through the deployment strategies you know.

Rolling update gradually replaces old pods with new ones — zero downtime, but two versions run briefly in parallel so your app must be backward compatible. Blue/green spins up a full new environment and switches the load balancer — instant rollback but double the cost. Canary sends a small % of real traffic to the new version, monitors error rates, then gradually increases — best for risky changes. Feature flags deploy code to everyone but hide the feature behind a toggle, completely decoupling deployment from release.


Q: When would you choose canary over blue/green?

Blue/green is great when you want an instant, clean rollback and can afford two environments. Canary is better when you want to validate the change with real production traffic before fully committing — especially useful for performance-sensitive changes or when you have high enough traffic that even a 1% error spike is meaningful. Blue/green is an all-or-nothing switch; canary is a gradual dial.


Q: What is a feature flag and why is it powerful?

A feature flag is a conditional in the code that enables or disables a feature at runtime without redeployment. It decouples deployment (when code goes to production) from release (when users see it). You can deploy on Tuesday, test internally, then release to 1% of users on Thursday, and roll out to everyone by Friday — all without touching the pipeline. It also enables instant kill switches if something goes wrong.


Failures & Rollbacks

Q: A deployment just broke production. What do you do?

First, don’t debug — roll back immediately. User impact stops the moment you roll back. In Kubernetes: kubectl rollout undo deployment/my-app. In blue/green: flip the load balancer back. Once production is stable and users are happy, then investigate the root cause with a clear head. The sequence is: detect → rollback → stabilize → investigate → fix forward.


Q: What is MTTR and why does it matter more than “never failing”?

MTTR is Mean Time to Recovery — how fast you go from “something is broken” to “it’s fixed.” Trying to never fail leads to slow, infrequent, heavily gated releases, which ironically increases the risk of each individual deploy. Instead, design for fast recovery: good observability, one-command rollbacks, feature flags as kill switches. A team that deploys 10 times a day with a 5-minute MTTR is more reliable than one that deploys monthly.


Secrets & Security

Q: How do you handle secrets in a CI/CD pipeline?

Secrets are stored in the CI platform’s encrypted vault — GitHub Secrets, GitLab CI Variables, or dedicated tools like AWS Secrets Manager or HashiCorp Vault. They’re injected at runtime as environment variables. They are never committed to the repository, never printed in logs, and never hardcoded anywhere. Once a secret touches Git history, it’s compromised — even if you delete it, the history remains.


Q: Why is tagging Docker images with latest a bad practice?

latest is mutable — it points to a different image every time you build. If production breaks, you can’t tell what latest was an hour ago. By tagging with the git commit SHA, every image is traceable to the exact code that produced it. You can roll back to any previous version with precision, and you have a complete audit trail of what ran in production and when.


Architecture & Tooling

Q: What is GitOps and how is it different from traditional CD?

In traditional CD, the pipeline runs kubectl apply directly — it pushes changes to the cluster. In GitOps, a controller like ArgoCD watches the Git repo and pulls changes when it detects a diff. Git is the single source of truth. Every cluster change is a Git commit, giving you full audit trail and easy rollback via git revert. If someone manually changes a resource in the cluster, ArgoCD detects drift and reverts it automatically.


Q: How do you speed up a slow CI pipeline?

Parallelize independent jobs — lint, unit tests, and security scans don’t need to run sequentially. Cache dependencies aggressively (npm cache, pip cache, Docker layer cache). Run the full test suite only on merges to main; run only affected tests on PRs. Use smaller base Docker images to speed up builds and pulls. Split a monolithic pipeline into targeted ones — not every service needs to rebuild when an unrelated file changes.


Q: What are DORA metrics? Name all four.

DORA metrics measure software delivery health. Deployment Frequency — how often you deploy to production. Lead Time for Changes — time from commit to production. Change Failure Rate — percentage of deployments that cause an incident. Mean Time to Recovery (MTTR) — how fast you recover when something breaks. Elite teams deploy multiple times a day, with lead times under an hour, failure rates under 15%, and MTTR under an hour. CI/CD directly improves all four.


Q: What’s the difference between unit tests and integration tests? Where do each belong in the pipeline?

Unit tests test a single function or module in complete isolation — no database, no network, no external services. They’re fast (milliseconds each) and run early in the pipeline, on every commit. Integration tests verify that multiple components work correctly together — the API endpoint actually talks to the database and returns the right response. They’re slower, sometimes need a running environment, and run later in the pipeline after the build is verified. Unit tests catch logic bugs; integration tests catch wiring bugs.


The One They Always End With

Q: Describe your ideal CI/CD pipeline from scratch.

On every PR: lint, build, unit tests — fast feedback in under 5 minutes. On merge to main: full suite including integration tests and security scan, Docker image built and tagged with the git SHA, pushed to the registry, auto-deployed to dev via ArgoCD (GitOps). Staging deploy is automatic after dev is healthy. Production uses a canary — 5% traffic, monitored for 10 minutes against error rate and latency SLOs, then full rollout. Secrets managed via AWS Secrets Manager, injected at runtime. Rollback is one command and takes under 2 minutes. The whole thing is measured with DORA metrics reviewed monthly.

This post is licensed under CC BY 4.0 by the author.