Here's Why Docker Images Start With "FROM ubuntu"
In Docker, we run processes, such as Rails or Node.js applications, in isolated environments. Why, then, do Docker "image" files start with lines like FROM ubuntu? Are containers running full Ubuntu operating systems? If not, why do we specify an entire operating system? For that matter, what does it even mean to "run an operating system"?
These instructions are tailored for macOS users, but work for Linux users as well. Just replace Homebrew with your package manager.
Let's install the process viewing application htop to get an understanding of how containers work. On your Mac, open a terminal and run brew install htop. Once the install completes, run it by typing htop and hitting enter.
In the htop process viewer interface, press t to switch to tree view. Look at the top of the process tree, and you'll see that every process on your Mac is a child of two parent processes:
All operating systems have a "kernel", the core computer program of the operating system that controls everything and facilitates access to hardware.
When your computer boots, it loads the "kernel" program into memory and hands over control to it. The kernel is the core glue between the hardware and the operating system. Then the kernel runs an "initialization" process, which is launchd on your Mac. A process name ending in d usually signifies a "daemon," which is a process that runs in the background and doesn't accept user input.
I think of launchd as the program that boots the operating system. It's responsible for setting up networking and for scheduling the jobs and services the operating system needs.
Key concept: All operating systems have some variation of a "kernel" and an "initialization program". On your Mac, the initialization program is launchd. On the server hosting this website, I see /sbin/init as the root process. It differs per operating system, and can also be changed by savvy users.
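If you'd rather check the root process without htop, here is a minimal sketch. It assumes a Linux machine exposes /proc/1/comm and falls back to ps elsewhere (for example on a Mac); the function name pid1_name is made up for illustration.

```shell
# Print the name of process 1, the "initialization program".
# Assumption: Linux exposes it at /proc/1/comm; other systems (such as macOS) have ps.
pid1_name() {
  if [ -r /proc/1/comm ]; then
    cat /proc/1/comm
  else
    ps -p 1 -o comm=
  fi
}

pid1_name
```

On a Mac this should name launchd. On a Linux server it is typically something like init or systemd, depending on the distribution.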
Ok, so what's going on in a Docker container? To find out, let's run the same htop tool in an Ubuntu Docker container. Make sure you have Docker for Mac installed first.
If you're still running htop in your Mac's terminal, press ctrl-c to exit back to the shell. Then, in the terminal, run:
docker run --rm -it ubuntu bash
A brief explanation of this command:
- We're running bash in a container
- Using ubuntu as the base "image"
- Using the flags -it so our keystrokes get into the container
- And --rm so the container will be removed automatically after we stop it
This command will download the ubuntu image if you've never run it before. Subsequent runs use your local image cache.
Now we're on a Bash shell inside an Ubuntu Docker container. Is Ubuntu running in the container? Meaning, is there an initializing program running inside the container? Let's check! In the container shell, install htop:
apt-get update && apt-get install htop
Then run it:
htop
Is there something that looks like an initializing process here? Nope!
So no, we aren't running a full Ubuntu operating system in the container.
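To double check what htop shows, here is a rough sketch that walks from the current shell up to the top of the process tree. It assumes a Linux /proc filesystem, which the Ubuntu container has, and the function name ancestry is made up for illustration.

```shell
# Print "PID NAME" for the current shell and each of its visible ancestors.
# Assumption: a Linux /proc filesystem is mounted (true inside the container).
ancestry() {
  pid=$$
  while [ -n "$pid" ] && [ "$pid" -gt 0 ] && [ -r "/proc/$pid/status" ]; do
    name=$(cat "/proc/$pid/comm")
    echo "$pid $name"
    # The PPid field holds the parent's PID; 0 means no parent is visible from here.
    pid=$(sed -n 's/^PPid:[[:space:]]*//p' "/proc/$pid/status")
  done
}

ancestry
```

The last line printed is the top of the tree. Pasted into the container shell we started above, it should be the bash we launched, running as PID 1, rather than an init program.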
While running Bash in a container, the "container" is actually an isolated environment in a hidden parent Linux virtual machine. The Linux parent environment does run an init process, otherwise it wouldn't be running. We can't access the parent environment, because our container is running in "isolation," which is the point of Docker. Docker for Mac runs a Linux virtual machine automatically, and containers are run in isolation inside of that VM.
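One way to catch a glimpse of that hidden VM is to ask which kernel a shell is talking to. This is only a sketch, and the function name kernel_info is made up for illustration.

```shell
# Print the kernel name and release as seen by this shell.
kernel_info() {
  uname -sr
}

kernel_info
```

Run in your Mac's terminal, this reports Darwin. Run in the container shell, it reports Linux, because the container uses the kernel of the Linux VM rather than bringing its own.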
My mental image is that the whale icon on my Mac's status bar holds a little Linux computer, and containers are ephemeral isolated environments created inside that computer.
So why is it FROM ubuntu?
What does it mean to be FROM ubuntu, and why even specify an operating system?
Recall that earlier we made a distinction between the kernel and the operating system. The operating system (in this case, Ubuntu) brings along lots of software and libraries that we want to be repeatable between builds and environments. We've already seen that Ubuntu has the shell bash in it (vs Alpine Linux, FROM alpine, which does not include bash by default).
We've also seen that operating systems have their own package managers (apt-get in our case), which we want to specify per-container to ensure we can install dependencies for the right operating system. Operating systems also have system libraries that our programs might depend on to run.
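To see which package manager a given environment brings along, a small sketch like this can help. The list of tools checked is just a sample, and the function name list_package_managers is made up for illustration.

```shell
# Report which common package managers exist in this environment.
# The tool list is a sample, not an exhaustive one.
list_package_managers() {
  for tool in apt-get apk dnf brew; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "found package manager: $tool"
    fi
  done
  return 0
}

list_package_managers
```

In the Ubuntu container this should find apt-get, while an Alpine container would report apk instead. That difference is part of what the FROM line pins down.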
Key concept: When we define Dockerfiles, we don't just specify how to run our process. We also specify how to build it, and we often need package manager tools to install the necessary dependencies for Rails/Node/etc. Applications also usually depend on system libraries, like networking. Operating systems happen to have all of this out of the box!
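To make that concrete, here is a hedged sketch that writes out a small Dockerfile. The application file app.rb and the choice of the ruby package are assumptions for illustration only, not a recommended setup for a real Rails app.

```shell
# Write a sketch Dockerfile into a scratch directory.
workdir=$(mktemp -d)
cat > "$workdir/Dockerfile" <<'EOF'
# Base image: supplies bash, apt-get, and Ubuntu's system libraries.
FROM ubuntu
# Build step: use the base image's package manager to install dependencies.
RUN apt-get update && apt-get install -y ruby
# Hypothetical application file, copied in from the build context.
COPY app.rb /app/app.rb
# Run step: the single process this container will run.
CMD ["ruby", "/app/app.rb"]
EOF

cat "$workdir/Dockerfile"
```

The FROM line is what makes the apt-get step possible: swap the base image and the package manager and available libraries change with it. Building this with docker build would also need an app.rb next to the Dockerfile.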