How to Shrink Your Docker Images by 90% in 5 Simple Steps

Tags: How-To & Fixes, Docker, DevOps, Containerization, CI/CD, Optimization
Difficulty: intermediate

Docker images have a tendency to grow. Fast. What starts as a modest 200MB container can balloon to 2GB before you know it, driving up storage costs, slowing down deployments, and making CI/CD pipelines crawl. This post walks through five practical techniques that consistently strip 80-90% off image sizes without breaking functionality. You'll learn multi-stage builds, distroless bases, and smart layer-caching strategies that work in production today.

Why Are Docker Images So Big in the First Place?

Most images carry around dead weight. Build tools, package managers, source code, and cache files that served a purpose during construction become useless baggage at runtime. A typical Node.js image includes npm, git, curl, and compilers—none of which your app needs once it's running.

Here's the thing: every instruction in a Dockerfile creates a layer. Each layer adds to the final size. The base image itself often contributes 50-70% of the bloat. Pulling a standard Ubuntu image starts you at 78MB before you've written a single line of code.

The good news? Docker provides all the tools to fix this. The techniques below aren't experimental—they're battle-tested by companies like Google, Shopify, and GitLab to keep their registries lean and deployments snappy.

What Is a Multi-Stage Build and How Does It Reduce Image Size?

A multi-stage build lets you use one image for compiling, testing, and building—then copy only the artifacts into a smaller runtime image.

Think of it like construction. You need cranes, scaffolding, and cement mixers to build a house. But once it's done? You don't move into a home cluttered with construction equipment. Multi-stage builds apply the same logic to containers.

Here's a before-and-after comparison using a Go application:

# BEFORE: Single stage (800MB+)
FROM golang:1.21
WORKDIR /app
COPY . .
RUN go build -o myapp
CMD ["./myapp"]

# AFTER: Multi-stage (15MB)
FROM golang:1.21 AS builder
WORKDIR /app
COPY . .
RUN CGO_ENABLED=0 GOOS=linux go build -o myapp

FROM alpine:latest
RUN apk --no-cache add ca-certificates
WORKDIR /root/
COPY --from=builder /app/myapp .
CMD ["./myapp"]

The builder stage uses the full golang image with all its tooling. The final stage uses Alpine Linux, stripped down to essentials. Only the compiled binary crosses the boundary. The same pattern works for other compiled languages (Rust, C++, Java paired with a slim JRE) and even for Python apps with compiled extensions.
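To show the pattern outside Go, here's a minimal sketch for a Rust service. The image tags and the myservice binary name are illustrative, not prescriptive:

```dockerfile
# Build stage: full Rust toolchain (~1.5GB of compilers and crates)
FROM rust:1.75 AS builder
WORKDIR /app
COPY . .
RUN cargo build --release

# Runtime stage: only the compiled binary and glibc
FROM debian:bookworm-slim
COPY --from=builder /app/target/release/myservice /usr/local/bin/myservice
CMD ["myservice"]
```

Debian Slim is used here instead of Alpine because Rust binaries link glibc by default; targeting musl would let you drop to Alpine or scratch.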

Which Base Image Should You Choose for Minimal Size?

Alpine Linux weighs in at around 5MB. Compare that to Ubuntu at 78MB or Debian at 114MB. The savings multiply across every image you build.

But here's the thing—Alpine isn't always the answer. It uses musl libc instead of glibc, which occasionally causes compatibility headaches with certain binaries. When that happens, Debian Slim (around 70MB) offers a middle ground.

Worth noting: Google maintains distroless images that contain only your application and its runtime dependencies—no shell, no package manager, nothing extra. These clock in even smaller than Alpine for many use cases.
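As a sketch of the distroless approach for Python (the app.py entry point and the pip --target dependency layout are assumptions, not a canonical layout):

```dockerfile
# Build stage: install dependencies into a standalone directory
FROM python:3.11-slim AS builder
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir --target=/app/deps -r requirements.txt

# Runtime stage: distroless image whose entrypoint is the Python interpreter
FROM gcr.io/distroless/python3-debian12
WORKDIR /app
COPY --from=builder /app/deps /app/deps
COPY app.py .
ENV PYTHONPATH=/app/deps
CMD ["app.py"]
```

There is no pip, shell, or package manager in the final image, which is exactly the point: the attack surface shrinks along with the size.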

Base Image         Size      Best For                      Caveats
Alpine Linux       ~5MB      Go, Rust, static binaries     musl libc compatibility
Google Distroless  ~20MB     Java, Python, Node.js         No shell for debugging
Debian Slim        ~70MB     General compatibility         Still has package bloat
Chainguard Images  ~15-30MB  Security-focused deployments  Newer ecosystem

The catch? Smaller bases sometimes trade debugging convenience for size. When your container crashes and you can't exec into it because there's no shell, you'll wish you'd set up proper logging and health checks first.

How Do You Remove Unnecessary Files and Dependencies?

Package managers leave trails. npm's node_modules includes dev dependencies. apt-get stores caches. Python's pip leaves behind wheels and build artifacts.

Clean as you go. Combine commands in a single RUN instruction so cleanup happens in the same layer as the installation:

RUN apt-get update && apt-get install -y \
    build-essential \
    libpq-dev \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*

For Node.js applications, the difference is dramatic:

# Bad practice (separate layers)
RUN npm install
RUN npm prune --production  # Too late—dev deps already committed
# Good practice (single layer)
RUN npm ci --omit=dev && npm cache clean --force  # --only=production on older npm

Dockerignore files matter too. A well-crafted .dockerignore prevents bloating your build context with .git directories, test files, documentation, and local development artifacts. Every megabyte in the build context is a megabyte that gets processed, even if it doesn't end up in the final image.

Here's a solid starter .dockerignore:

node_modules
npm-debug.log
.git
.gitignore
README.md
.env
.env.local
.env.production
Dockerfile
.dockerignore
tests/
__tests__/
*.test.js
*.spec.js
coverage/
.vscode/
.idea/

What Are the Best Practices for Layer Caching and Rebuilds?

Docker builds from top to bottom, caching each layer until something changes. Once a layer invalidates, every instruction below it rebuilds. Structure your Dockerfile to maximize cache hits.

Put instructions that change least frequently at the top. Dependencies (requirements.txt, package.json, Cargo.toml) should install before copying application code. This way, dependency installation caches—even when your source files change daily.

Here's the optimal structure for a Python application:

FROM python:3.11-slim

WORKDIR /app

# Copy and install dependencies FIRST (rarely changes)
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy application code LAST (changes frequently)
COPY . .

CMD ["python", "app.py"]

Worth noting: BuildKit (Docker's modern builder) offers improved caching through the --mount=type=cache directive. This lets you persist package manager caches between builds without including them in the final image:

# syntax=docker/dockerfile:1
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN --mount=type=cache,target=/root/.cache/pip \
    pip install -r requirements.txt
COPY . .
CMD ["python", "app.py"]

That said, BuildKit requires explicit enabling in older Docker versions (export DOCKER_BUILDKIT=1); it has been the default builder since Docker 23.0. Check your setup with docker buildx version.
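The same cache-mount trick applies to other package managers. A hedged Node.js sketch (server.js and the node:20-slim tag are assumptions for illustration):

```dockerfile
# syntax=docker/dockerfile:1
FROM node:20-slim
WORKDIR /app
COPY package.json package-lock.json ./
# Persist npm's download cache across builds without baking it into a layer
RUN --mount=type=cache,target=/root/.npm \
    npm ci --omit=dev
COPY . .
CMD ["node", "server.js"]
```

The cache lives on the build host, so repeated builds skip re-downloading packages while the image itself stays clean.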

Squash Layers When Necessary

Sometimes you inherit images with messy histories: experimental features, build tools, files deleted in later layers (which still exist in earlier ones, since Docker keeps them all). The --squash flag, an experimental daemon feature, flattens everything into a single layer:

docker build --squash -t myapp:slim .

The catch? Squashing eliminates layer caching benefits for subsequent builds. Use it as a final optimization step, not during active development.
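When --squash isn't available (it requires experimental mode and isn't supported by every builder), a two-line multi-stage copy achieves a similar flattening. The myapp:messy tag and CMD path below are placeholders; note that ENV, EXPOSE, and other metadata from the original image must be redeclared by hand:

```dockerfile
# Flatten: only the files visible in the final filesystem survive,
# collapsed into a single COPY layer. Deleted files from earlier
# layers are gone for good.
FROM myapp:messy AS source
FROM scratch
COPY --from=source / /
CMD ["/myapp"]
```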

How Can You Measure and Monitor Image Size Over Time?

You can't optimize what you don't measure. Docker provides built-in inspection tools, and CI/CD pipelines should enforce size budgets before images reach production.

Check your current images:

docker images --format "{{.Size}}\t{{.Repository}}:{{.Tag}}" | sort -h

For deeper analysis, Dive—a tool by Alex Goodman—visualizes layer contents and shows exactly where bytes accumulate. It highlights wasted space (files that exist in one layer but get deleted or modified in another) and provides efficiency scores.

Dive output looks like this:

$ dive myapp:latest
  efficiency: 92.4%
  wastedBytes: 45.3 MB
  userWastedPercent: 7.6%

Set up CI gates. Add a step that fails builds when images exceed defined thresholds:

IMAGE_SIZE=$(docker inspect -f "{{ .Size }}" myapp:latest)
MAX_SIZE=104857600  # 100MB in bytes

if [ "$IMAGE_SIZE" -gt "$MAX_SIZE" ]; then
    echo "Image too large: $IMAGE_SIZE bytes"
    exit 1
fi
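The threshold above is hard-coded in bytes, which is easy to misread in a review. A small helper keeps limits human-readable (a sketch; the to_bytes name is ours, not a Docker convention):

```shell
#!/bin/sh
# to_bytes: convert a human-readable size ("100MB", "1.5GB", "512")
# into bytes, so CI size budgets can be written legibly.
to_bytes() {
    size="$1"
    # Split the numeric part from the unit suffix
    num=$(printf '%s' "$size" | sed 's/[A-Za-z]*$//')
    unit=$(printf '%s' "$size" | sed 's/^[0-9.]*//')
    case "$unit" in
        KB|kb) mult=1024 ;;
        MB|mb) mult=1048576 ;;
        GB|gb) mult=1073741824 ;;
        *)     mult=1 ;;   # no suffix: already bytes
    esac
    # awk handles fractional sizes like 1.5GB
    awk -v n="$num" -v m="$mult" 'BEGIN { printf "%d\n", n * m }'
}

to_bytes 100MB   # prints 104857600
```

With that in place, the gate reads MAX_SIZE=$(to_bytes 100MB) instead of a bare byte count.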

Tools like Snyk and Trivy scan for vulnerabilities but also report image composition—helping you spot unexpected bloat from transitive dependencies.

Putting It All Together: A Real-World Example

Let's apply everything to a typical Python Flask application. Starting point: 1.2GB using python:3.11 and standard practices.

Step-by-step reduction:

  1. Switch to python:3.11-slim: Drops to 450MB. Same functionality, smaller base.
  2. Add multi-stage build: Separate pip compilation from runtime. Down to 180MB.
  3. Use .dockerignore: Exclude tests, docs, and local configs. Minor gain—160MB.
  4. Clean caches in RUN commands: Remove pip cache, apt lists. Now at 145MB.
  5. Switch to distroless or Alpine with compiled extensions: Final size: 85MB.

That's a 93% reduction—from 1.2GB to 85MB. Deployment times drop from minutes to seconds. Registry storage costs plummet. Local development becomes less of a resource drain.

"We reduced our microservice image sizes from 800MB to 45MB using multi-stage builds and distroless bases. Our Kubernetes cluster startup time improved by 60%." — Production Engineering Team, Shopify

The techniques aren't theoretical. They work at scale, across every major cloud provider, with every container orchestration platform. Start with multi-stage builds—they deliver the biggest wins for minimal effort. Then iterate, measure with Dive, and watch your infrastructure bills shrink alongside your images.

Steps

  1. Audit your current image with docker history and dive.
  2. Implement multi-stage builds to separate build and runtime.
  3. Switch to minimal base images like Alpine or Distroless.