

Sarthak Varshney is a Docker Captain, 5x C# Corner MVP, and 2x Alibaba Cloud MVP, with over six years of hands-on experience in the IT industry, specializing in cloud computing, DevOps, and modern application infrastructure. He is an Author and Associate Consultant, known for working extensively with cloud platforms and container-based technologies in real-world environments.
So you've written your first Dockerfile. Maybe it builds. Maybe it even runs. You're feeling pretty good. But here's the thing — there's a big difference between a Dockerfile that works and one that's actually good. One builds in 3 minutes every single time and ships tiny, secure images. The other rebuilds from scratch whenever you change a comma in your README, bloats your image to 2GB, and runs your app as root like it owns the place.
This article is about crossing that gap. Let's talk about the four things that'll make your Dockerfiles genuinely good: layer caching, .dockerignore, picking the right base image, and not running your app as root. By the end, you'll look at a bad Dockerfile and immediately know what's wrong with it. That's the goal.
Every line in a Dockerfile creates a layer. Think of layers like pancakes stacked on top of each other — each one sits on the previous. Docker is clever: if a layer hasn't changed since the last build, it just reuses the cached version instead of redoing the work. This is called layer caching, and it's what makes rebuilds fast.
The catch? If one layer changes, every layer after it gets rebuilt from scratch. Docker sees a change and thinks, "Something's different, I can't trust anything below this point." So the order of your instructions matters enormously.
Here's a classic rookie mistake:
# ❌ Bad order — you'll regret this
FROM node:20
WORKDIR /app
# Copies EVERYTHING, including your source code
COPY . .
# Installs dependencies AFTER copying the source
RUN npm install
What's wrong with this? Every time you change a single line of your app code — even fixing a typo — Docker invalidates the COPY . . layer, which means it also re-runs npm install. That's downloading hundreds of packages again, just because you fixed a comment. Could be 2 minutes of wasted time per build.
Here's the fix:
# ✅ Correct order — fast rebuilds
FROM node:20
WORKDIR /app
# Copy ONLY the dependency manifests first
COPY package.json package-lock.json ./
# Cached unless package.json / package-lock.json changes
RUN npm install
# Now copy the rest of your source code
COPY . .
CMD ["node", "server.js"]
Now, npm install only reruns when package.json or package-lock.json actually changes. Editing your server.js? Docker skips straight to the last COPY step, because everything before it is still cached. Your builds go from 2 minutes to maybe 5 seconds.
The golden rule of layer caching: put things that change least at the top, things that change most at the bottom.
Think of it like packing a bag for a trip. You put heavy stuff — shoes, toiletries — at the bottom. The things you grab last-minute go on top. Same logic here.
Group related RUN commands to minimize layers:
# ❌ Creates 3 layers unnecessarily
RUN apt-get update
RUN apt-get install -y curl
RUN apt-get install -y git
# ✅ One layer, same result
RUN apt-get update && apt-get install -y \
    curl \
    git \
    && rm -rf /var/lib/apt/lists/*
Notice the rm -rf /var/lib/apt/lists/* at the end — that cleans up the apt package cache inside the same layer so it doesn't bloat your image. If you do it in a separate RUN, the cache is already baked into the previous layer and the cleanup does nothing useful.
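The same idea applies to the other package managers mentioned later in this article. A sketch of the Alpine and pip equivalents:

```dockerfile
# Alpine: --no-cache fetches the package index on the fly
# instead of storing it in the layer
RUN apk add --no-cache curl git

# pip: --no-cache-dir keeps pip's download cache out of the image
RUN pip install --no-cache-dir -r requirements.txt
```

In each case the point is the same: the cache never gets written into a layer in the first place, so there's nothing to clean up afterwards.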
When you run docker build, Docker takes everything in your current directory and sends it to the Docker daemon. This is called the build context. If you've got a node_modules folder with 500MB of dependencies sitting there, all of it gets sent over — even if you never use it in the Dockerfile. Same goes for your .git folder, log files, local config files, test datasets, whatever.
.dockerignore is how you stop that. It works exactly like .gitignore — you list patterns for files and folders that should be excluded from the build context.
Create a file called .dockerignore at the root of your project:
# .dockerignore
node_modules
.git
.gitignore
*.log
.env
.DS_Store
dist
coverage
*.md
__pycache__
*.pyc
Why does this matter so much?
Speed — A smaller build context means faster docker build startup. Docker doesn't have to scan and ship gigabytes of files it doesn't need.
Security — Imagine accidentally COPY . .-ing your .env file into an image and then pushing it to Docker Hub. Your database password, API keys, all of it — baked into a public image. .dockerignore prevents that kind of career-limiting mistake.
Cache integrity — If node_modules gets included and changes constantly (which it does), it busts your layer cache unnecessarily.
You can verify what's getting sent to the Docker daemon by checking the output at the start of a build. With the classic builder, the first line shows "Sending build context to Docker daemon — X.XX MB"; with BuildKit (the default in recent Docker versions), look for the "transferring context" figure in the [internal] load build context step. If that number is shockingly large, your .dockerignore needs work.
Here's a quick check:
# Build and watch the first line for context size
docker build -t myapp .
# Sending build context to Docker daemon 2.048kB ← Good
# Sending build context to Docker daemon 523.2MB ← You need a .dockerignore
The FROM line is the foundation of your entire image. Get it wrong, and you're either dragging in a bloated OS full of tools you don't need, or using an image so stripped down that your app breaks in mysterious ways at 2am.
Let's look at what your options actually mean.
Full OS images (ubuntu, debian)
FROM ubuntu:22.04
These are the big ones. You get a full Linux environment — package managers, shell tools, the works. Great for building things, terrible for shipping. They're often 100–200MB+ just for the base, and they come with lots of surface area for vulnerabilities.
Use them if: you need to install a lot of system dependencies during the build phase.
Slim images
FROM python:3.12-slim
FROM node:20-slim
These are official images with most of the unnecessary packages removed. Still Debian-based, still have apt-get, but much leaner. A good middle ground for most apps.
Alpine images
FROM node:20-alpine
FROM python:3.12-alpine
Alpine Linux is tiny — the base is about 5MB. It uses musl libc instead of glibc, and apk instead of apt. Images based on Alpine can be 3–10x smaller than their full counterparts.
But Alpine isn't magic. Some Python packages don't play well with musl libc and need compilation flags. Some binaries won't run. If you're seeing weird errors with Alpine that don't happen on slim, that's probably why.
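When a Python package does need to compile native code on Alpine, one common workaround is to install the build toolchain first. A sketch, assuming your requirements.txt includes a package that builds from source:

```dockerfile
FROM python:3.12-alpine
WORKDIR /app
# Toolchain that many source builds expect on musl-based Alpine
RUN apk add --no-cache gcc musl-dev libffi-dev
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
```

The trade-off: those build tools stay in the image unless you remove them in the same layer or use a multi-stage build, which can eat into Alpine's size advantage.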
Distroless images
FROM gcr.io/distroless/python3
FROM gcr.io/distroless/nodejs20-debian12
Distroless images from Google have nothing in them except your runtime. No shell, no package manager, no debugging tools. Tiny and very secure. The downside? When something goes wrong, you can't shell into the container to investigate. Not beginner-friendly, but worth knowing about.
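Because distroless images have no package manager, you typically pair them with a multi-stage build: install everything in a full-featured image, then copy only the results across. A sketch, assuming an entry point of server.js:

```dockerfile
# Stage 1: install dependencies where npm and a shell exist
FROM node:20-slim AS build
WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev
COPY . .

# Stage 2: ship only the app and the runtime, nothing else
FROM gcr.io/distroless/nodejs20-debian12
WORKDIR /app
COPY --from=build /app /app
# The distroless Node.js image already runs node as its entrypoint,
# so CMD is just the script to execute
CMD ["server.js"]
```

Only the final stage ends up in the shipped image; the build stage, with all its tooling, is thrown away.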
For most Node.js or Python beginners, start here:
# Node.js
FROM node:20-slim
# Python
FROM python:3.12-slim
Slim variants give you the right balance of size, compatibility, and ease of use. Once you're comfortable, explore Alpine. For production at scale, look into distroless or scratch-based images.
One more thing: always pin your base image tag. Don't use FROM node:latest. The latest tag changes whenever the maintainers push a new version. Your build might work perfectly today and break tomorrow because latest jumped from Node 20 to Node 22.
# ❌ Vague and risky
FROM node:latest
# ✅ Pinned — you know exactly what you're getting
FROM node:20.14.0-slim
Even better, pin to the image digest:
# Get the digest
docker pull node:20-slim
docker inspect node:20-slim --format='{{index .RepoDigests 0}}'
# sha256:abc123...
# Use it in your Dockerfile
FROM node:20-slim@sha256:abc123...
That's fully reproducible. No surprises.
By default, Docker containers run as root. That means if someone exploits a vulnerability in your app, they're already root inside the container. And depending on your setup, that can translate into problems on the host system too.
This is such a common problem that docker scout (Docker's vulnerability scanning tool) literally flags it as an issue. Running as root is the containerized equivalent of leaving your front door open because "hey, I live in a safe neighborhood."
The fix is simple: create a non-root user in your Dockerfile and switch to it before running your app.
FROM node:20-slim
WORKDIR /app
COPY package*.json ./
RUN npm install
COPY . .
# Create a non-root user and group
RUN groupadd --gid 1001 appgroup && \
    useradd --uid 1001 --gid appgroup --shell /bin/bash --create-home appuser
# Give ownership of the app directory to the new user
RUN chown -R appuser:appgroup /app
# Switch to the non-root user
USER appuser
CMD ["node", "server.js"]
Now your app runs as appuser with UID 1001. If something goes wrong inside the container, the blast radius is contained.
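As a shortcut, the official Node.js images already ship with an unprivileged node user, so you can often skip creating one. A sketch of that variant:

```dockerfile
FROM node:20-slim
WORKDIR /app
COPY package*.json ./
RUN npm install
# --chown sets ownership at copy time, so no separate chown layer is needed
COPY --chown=node:node . .
# 'node' is the built-in unprivileged user in the official Node images
USER node
CMD ["node", "server.js"]
```

Creating your own user, as above, is still useful when you want a specific UID or when your base image doesn't ship one.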
For Python apps it's the same pattern:
FROM python:3.12-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
RUN useradd --uid 1001 --no-create-home --shell /bin/bash appuser && \
    chown -R appuser /app
USER appuser
CMD ["python", "app.py"]
A few things to watch out for:
File permissions — If your non-root user doesn't have permission to write to a directory your app uses (a log folder, an upload directory), your app will crash. Make sure you chown anything your app needs to write to.
Port binding — Non-root users can't bind to ports below 1024. So if your app tries to listen on port 80 directly, it'll fail. Either run on a port like 8080 and map it with Docker's -p 80:8080, or handle it with capabilities (advanced topic).
Verify who you're running as — You can always check:
docker run --rm myapp whoami
# Should print: appuser
# Not: root
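The port-binding caveat above looks like this in practice. A sketch, assuming the app listens on 8080 inside the container:

```dockerfile
# Listen on an unprivileged port (above 1024) inside the container
EXPOSE 8080
```

At run time, map a privileged host port onto it with docker run -p 80:8080 myapp. Traffic arrives on port 80 on the host, but the process inside the container never needs root to bind its port.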
Here's what a well-written Node.js Dockerfile looks like when you apply all four practices:
# Pin the base image — no surprises
FROM node:20.14.0-slim
# Set working directory
WORKDIR /app
# Copy dependency files FIRST (cache optimization)
COPY package.json package-lock.json ./
# Install production dependencies — cached unless the lockfile changes
RUN npm ci --omit=dev
# Copy application source
COPY . .
# Create non-root user and fix permissions
RUN groupadd --gid 1001 appgroup && \
    useradd --uid 1001 --gid appgroup --shell /bin/bash --create-home appuser && \
    chown -R appuser:appgroup /app
# Switch to non-root user
USER appuser
# Expose port (documentation — doesn't publish the port)
EXPOSE 3000
# Run the app
CMD ["node", "server.js"]
And the .dockerignore to go with it:
node_modules
.git
.gitignore
.env
*.log
.DS_Store
coverage
dist
test
*.md
Every decision in this Dockerfile has a reason. The base image is pinned. Dependencies are copied before source code for caching. The app runs as a non-root user. And .dockerignore makes sure the build context stays clean and secure.
Compare that to where most people start:
# ❌ The "it works on my machine" Dockerfile
FROM node:latest
COPY . .
RUN npm install
CMD node server.js
Both will run your app. Only one of them will serve you well at 3am when something breaks, when a CVE scanner flags your image, or when your CI pipeline is taking 8 minutes per build.
Mistake 1: Putting COPY . . before installing dependencies
Already covered this — forces reinstall every time source changes.
Mistake 2: Not cleaning up package manager caches
apt-get, apk, pip — they all cache stuff. Clean it in the same RUN command.
RUN apt-get update && apt-get install -y curl \
    && rm -rf /var/lib/apt/lists/*
Mistake 3: Using .env files in production images
Use environment variables at runtime (docker run -e KEY=value) or Docker secrets. Don't bake them into the image.
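With Docker Compose, for example, runtime configuration might look like this (the file names and variable names are illustrative):

```yaml
# docker-compose.yml: values arrive at run time, never baked into the image
services:
  app:
    image: myapp
    environment:
      NODE_ENV: production
    env_file:
      - .env.production   # stays on the host; keep it out of git and the image
```

The image stays generic; the secrets live only where the container actually runs.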
Mistake 4: Using :latest tags
Covered this. Just don't.
Mistake 5: Forgetting to verify the running user
Add docker run --rm myimage whoami to your build process sanity checks.
Here's a broken Dockerfile. It has at least four issues based on what we just covered. Can you spot them all and fix them?
FROM python:latest
COPY . /app
WORKDIR /app
RUN pip install -r requirements.txt
CMD ["python", "app.py"]
Hints:
What's risky about FROM python:latest?
Where should COPY and RUN pip install sit relative to each other?
Who does the container run as?
Is there a .dockerignore?
Fix the Dockerfile, build it (docker build -t my-fixed-app .), run it (docker run my-fixed-app), and verify the running user with docker run --rm my-fixed-app whoami. If it prints something other than root, you've nailed it.
Post your fixed Dockerfile in the comments or share it on your platform of choice. Seeing what you built is one of the best parts of this stuff.
Writing a Dockerfile that just works is easy. Writing one that's fast, lean, and secure takes a bit more thought — but not much more. The four practices here — respecting layer caching order, using .dockerignore, choosing the right base image, and not running as root — will take you from "it builds" to "it's actually good."
These aren't advanced topics reserved for senior DevOps engineers. They're fundamentals, and the earlier you build them into your habits, the easier everything else becomes. Every Dockerfile you write from here on, run through this mental checklist:
Is my instruction order cache-friendly?
Do I have a .dockerignore?
Is my base image pinned and right-sized?
Does my app run as a non-root user?
Four questions. Better images. Every time.