Creating a Container

When you start working with containers, you will soon notice that existing images do not always satisfy your needs. In these situations, you can create your own custom image.

Images are defined by a text file called Dockerfile. A Dockerfile contains the instructions that tell Docker or Podman how to create a custom image, which then serves as the basis for containers.

Let's build and run our first image

We start by creating a text file called Dockerfile in the folder ~/using-containers-in-science/.

cd ~
mkdir using-containers-in-science
cd using-containers-in-science
nano Dockerfile

Now, we add the content below into the Dockerfile:

FROM python:3.12
LABEL maintainer="support@hifis.net"

RUN pip install --upgrade pip
RUN pip install ipython numpy

ENTRYPOINT ["ipython"]

After that, we can save the file and leave the editor (in nano: Ctrl+O, Enter to confirm the file name, then Ctrl+X). Congratulations, it is that simple. The image can now be built using the podman build command as shown below.

Note that the command has to be run in the folder containing the Dockerfile. The trailing . sets the current folder as the build context, and Podman implicitly uses the Dockerfile found there as the input for the build. The -t option specifies the name of the image to be built.

podman build -t my-ipython-image .
Output
STEP 1/5: FROM python:3.12
Resolved "python" as an alias (/home/christianhueser/.cache/containers/short-name-aliases.conf)
Trying to pull docker.io/library/python:3.12...
Getting image source signatures
Copying blob 6582c62583ef done  
Copying blob c6cf28de8a06 done  
Copying blob d46a03def8d9 done  
Copying blob a99509a32390 done  
Copying blob 891494355808 done  
Copying blob bf2c3e352f3d done  
Copying blob 2a4ca5af09fa done  
Copying blob 4429b810e09e done  
Copying config 12e5ab9d51 done  
Writing manifest to image destination
Storing signatures
STEP 2/5: LABEL maintainer="support@hifis.net"
--> 8253e57ebbf
STEP 3/5: RUN pip install --upgrade pip
Requirement already satisfied: pip in /usr/local/lib/python3.12/site-packages (24.0)
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
--> af32ccad579
STEP 4/5: RUN pip install ipython numpy
Collecting ipython
  Downloading ipython-8.25.0-py3-none-any.whl.metadata (4.9 kB)
Collecting numpy
  Downloading numpy-1.26.4-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (61 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 61.0/61.0 kB 3.0 MB/s eta 0:00:00
Collecting decorator (from ipython)
  Downloading decorator-5.1.1-py3-none-any.whl.metadata (4.0 kB)
Collecting jedi>=0.16 (from ipython)
  Downloading jedi-0.19.1-py2.py3-none-any.whl.metadata (22 kB)
Collecting matplotlib-inline (from ipython)
  Downloading matplotlib_inline-0.1.7-py3-none-any.whl.metadata (3.9 kB)
Collecting prompt-toolkit<3.1.0,>=3.0.41 (from ipython)
  Downloading prompt_toolkit-3.0.46-py3-none-any.whl.metadata (6.4 kB)
Collecting pygments>=2.4.0 (from ipython)
  Downloading pygments-2.18.0-py3-none-any.whl.metadata (2.5 kB)
Collecting stack-data (from ipython)
  Downloading stack_data-0.6.3-py3-none-any.whl.metadata (18 kB)
Collecting traitlets>=5.13.0 (from ipython)
  Downloading traitlets-5.14.3-py3-none-any.whl.metadata (10 kB)
Collecting pexpect>4.3 (from ipython)
  Downloading pexpect-4.9.0-py2.py3-none-any.whl.metadata (2.5 kB)
Collecting parso<0.9.0,>=0.8.3 (from jedi>=0.16->ipython)
  Downloading parso-0.8.4-py2.py3-none-any.whl.metadata (7.7 kB)
Collecting ptyprocess>=0.5 (from pexpect>4.3->ipython)
  Downloading ptyprocess-0.7.0-py2.py3-none-any.whl.metadata (1.3 kB)
Collecting wcwidth (from prompt-toolkit<3.1.0,>=3.0.41->ipython)
  Downloading wcwidth-0.2.13-py2.py3-none-any.whl.metadata (14 kB)
Collecting executing>=1.2.0 (from stack-data->ipython)
  Downloading executing-2.0.1-py2.py3-none-any.whl.metadata (9.0 kB)
Collecting asttokens>=2.1.0 (from stack-data->ipython)
  Downloading asttokens-2.4.1-py2.py3-none-any.whl.metadata (5.2 kB)
Collecting pure-eval (from stack-data->ipython)
  Downloading pure_eval-0.2.2-py3-none-any.whl.metadata (6.2 kB)
Collecting six>=1.12.0 (from asttokens>=2.1.0->stack-data->ipython)
  Downloading six-1.16.0-py2.py3-none-any.whl.metadata (1.8 kB)
Downloading ipython-8.25.0-py3-none-any.whl (817 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 817.3/817.3 kB 9.6 MB/s eta 0:00:00
Downloading numpy-1.26.4-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (18.0 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 18.0/18.0 MB 35.6 MB/s eta 0:00:00
Downloading jedi-0.19.1-py2.py3-none-any.whl (1.6 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.6/1.6 MB 56.9 MB/s eta 0:00:00
Downloading pexpect-4.9.0-py2.py3-none-any.whl (63 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 63.8/63.8 kB 4.9 MB/s eta 0:00:00
Downloading prompt_toolkit-3.0.46-py3-none-any.whl (386 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 386.3/386.3 kB 37.1 MB/s eta 0:00:00
Downloading pygments-2.18.0-py3-none-any.whl (1.2 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.2/1.2 MB 89.0 MB/s eta 0:00:00
Downloading traitlets-5.14.3-py3-none-any.whl (85 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 85.4/85.4 kB 42.4 MB/s eta 0:00:00
Downloading decorator-5.1.1-py3-none-any.whl (9.1 kB)
Downloading matplotlib_inline-0.1.7-py3-none-any.whl (9.9 kB)
Downloading stack_data-0.6.3-py3-none-any.whl (24 kB)
Downloading asttokens-2.4.1-py2.py3-none-any.whl (27 kB)
Downloading executing-2.0.1-py2.py3-none-any.whl (24 kB)
Downloading parso-0.8.4-py2.py3-none-any.whl (103 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 103.7/103.7 kB 41.5 MB/s eta 0:00:00
Downloading ptyprocess-0.7.0-py2.py3-none-any.whl (13 kB)
Downloading pure_eval-0.2.2-py3-none-any.whl (11 kB)
Downloading wcwidth-0.2.13-py2.py3-none-any.whl (34 kB)
Downloading six-1.16.0-py2.py3-none-any.whl (11 kB)
Installing collected packages: wcwidth, pure-eval, ptyprocess, traitlets, six, pygments, prompt-toolkit, pexpect, parso, numpy, executing, decorator, matplotlib-inline, jedi, asttokens, stack-data, ipython
Successfully installed asttokens-2.4.1 decorator-5.1.1 executing-2.0.1 ipython-8.25.0 jedi-0.19.1 matplotlib-inline-0.1.7 numpy-1.26.4 parso-0.8.4 pexpect-4.9.0 prompt-toolkit-3.0.46 ptyprocess-0.7.0 pure-eval-0.2.2 pygments-2.18.0 six-1.16.0 stack-data-0.6.3 traitlets-5.14.3 wcwidth-0.2.13
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
--> 331e059f4ba
STEP 5/5: ENTRYPOINT ["ipython"]
COMMIT my-ipython-image
--> ad5a0455c49
Successfully tagged localhost/my-ipython-image:latest
ad5a0455c4973b617dd474ee44550aea3a06f82a5f24ac690f4eef6b05ee08c4

Let's try out the newly created image by running it.

podman run --rm -it my-ipython-image

Output

Python 3.12.3 (main, May 14 2024, 07:23:41) [GCC 12.2.0]
Type 'copyright', 'credits' or 'license' for more information
IPython 8.25.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]:

We end up in an IPython shell that behaves just like one installed locally in the usual manner. Once we exit the shell, the container stops running as well. Let's see how this works by disassembling the Dockerfile.

Disassembling the Dockerfile

The Dockerfile used above contains four different types of instructions:

  • FROM <image>
      • Sets the base image for the instructions below.
      • Each valid Dockerfile must start with a FROM instruction (apart from comments and ARG instructions, which may precede it).
      • The image can be any valid image, e.g. from public registries.
        Please note: Choose a trusted base image for your images. We'll cover that topic in more detail in lesson 6 of this course.
  • LABEL <key>=<value> <key>=<value> <key>=<value> ...
      • The LABEL instruction adds metadata to the image.
      • A LABEL is a key-value pair.
      • This is typically used to provide information about e.g. the maintainer of an image.
  • RUN <command>
      • The RUN instruction executes any command on top of the current image. (We will cover this in a minute.)
      • The resulting image is used as the base for the next step in the Dockerfile.
  • ENTRYPOINT ["executable", "param1", "param2"]
      • An ENTRYPOINT allows you to configure a container that runs as an executable.
      • Command-line arguments passed to podman run <image> are appended after all elements of the exec-form ENTRYPOINT.

Example

podman run --rm -it my-ipython-image --version

This gives us the version number of IPython. It is equivalent to executing ipython --version locally.

8.25.0
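The appending rule for an exec-form ENTRYPOINT can be modelled in a few lines of Python. This is only a sketch of the behaviour described above, not Podman's actual implementation, and the function name final_command is made up for this illustration.

```python
def final_command(entrypoint: list[str], run_args: list[str]) -> list[str]:
    """Model how an exec-form ENTRYPOINT and `podman run` arguments combine.

    Hypothetical helper for illustration only: the arguments given after
    the image name are appended to the ENTRYPOINT elements.
    """
    return [*entrypoint, *run_args]


# Corresponds to: podman run --rm -it my-ipython-image
print(final_command(["ipython"], []))

# Corresponds to: podman run --rm -it my-ipython-image --version
print(final_command(["ipython"], ["--version"]))
```

The second call yields the list ['ipython', '--version'], which mirrors why the container prints the IPython version instead of opening a shell.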

Let's build the image again and see what happens.

podman build -t my-ipython-image .

Output

STEP 1/5: FROM python:3.12
STEP 2/5: LABEL maintainer="support@hifis.net"
--> Using cache 8253e57ebbf57b190872932653f214e6dbb6838b71db239e3f85e8be6949bfb1
--> 8253e57ebbf
STEP 3/5: RUN pip install --upgrade pip
--> Using cache af32ccad5795506466538058a03a5397b424a71071a6e3fc1dcab1366658d465
--> af32ccad579
STEP 4/5: RUN pip install ipython numpy
--> Using cache 331e059f4baccf63ab37aeff54b21c6e8be541d518ae8a021a30a4ce94480e43
--> 331e059f4ba
STEP 5/5: ENTRYPOINT ["ipython"]
--> Using cache ad5a0455c4973b617dd474ee44550aea3a06f82a5f24ac690f4eef6b05ee08c4
COMMIT my-ipython-image
--> ad5a0455c49
Successfully tagged localhost/my-ipython-image:latest
ad5a0455c4973b617dd474ee44550aea3a06f82a5f24ac690f4eef6b05ee08c4

This time, the output is much shorter than in our initial run of the podman build command. Every step after FROM reports "Using cache". As each instruction is executed, Podman looks in its cache for an existing image that has already been created in the same manner. If there is such an image, Podman re-uses it instead of creating a duplicate. If you do not want Podman to use its cache, pass the --no-cache option to the podman build command.
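To get a feeling for why a cached step can be re-used, the lookup can be imitated with a toy model in Python. This is a deliberately simplified sketch and not Podman's real algorithm: it assumes that a step is identified only by its parent step and its instruction text, and the names layer_id and build are invented for this example.

```python
import hashlib


def layer_id(parent_id: str, instruction: str) -> str:
    """Toy identifier: depends on the parent step and the instruction text."""
    digest = hashlib.sha256(f"{parent_id}\n{instruction}".encode())
    return digest.hexdigest()[:12]


def build(instructions: list[str], cache: set[str]) -> list[bool]:
    """Simulate a build and return, per step, whether the cache was hit."""
    hits = []
    parent = ""
    for instruction in instructions:
        current = layer_id(parent, instruction)
        hits.append(current in cache)
        cache.add(current)  # remember the result for later builds
        parent = current    # the next step builds on top of this one
    return hits


dockerfile = [
    "FROM python:3.12",
    'LABEL maintainer="support@hifis.net"',
    "RUN pip install --upgrade pip",
    "RUN pip install ipython numpy",
    'ENTRYPOINT ["ipython"]',
]

cache: set[str] = set()
print(build(dockerfile, cache))  # first build: [False, False, False, False, False]
print(build(dockerfile, cache))  # second build: [True, True, True, True, True]

# Changing step 4 also invalidates step 5, because its parent is now different.
changed = dockerfile[:3] + ["RUN pip install ipython numpy scipy"] + dockerfile[4:]
print(build(changed, cache))     # [True, True, True, False, False]
```

The last line illustrates a practical consequence of this model: once one instruction changes, all instructions after it are executed again.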

Task: Create and Run a Data Science Image

Task Description

Your goal in this exercise is to create your own custom data science image as follows:

  1. Build your image on top of the latest Python image of release series 3.12.
  2. Mark yourself as the maintainer of the image.
  3. Install numpy, scipy, pandas, scikit-learn and jupyterlab using pip install.
  4. Create a custom user using the command useradd -ms /bin/bash jupyter.
  5. Tell the image to automatically start as the jupyter user and to use the working directory /home/jupyter.
  6. Make sure the image starts with the command jupyter lab --ip=0.0.0.0 by default.

Hint: Use the instructions USER and WORKDIR for step 5.

Once you have built the image, test it by running it and opening JupyterLab in your browser. You should now be able to execute any Python code in a notebook, e.g.

import numpy as np
np.__config__.show()
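Beyond the NumPy check, a short snippet run in a notebook cell can confirm that all packages from step 3 are importable. This is a minimal sketch: the helper name missing_modules is hypothetical, and note that scikit-learn is imported under the module name sklearn.

```python
from importlib.util import find_spec

# Import names of the packages requested in step 3 of the task.
REQUIRED_MODULES = ["numpy", "scipy", "pandas", "sklearn", "jupyterlab"]


def missing_modules(names: list[str]) -> list[str]:
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


# An empty list means that every required module is available.
print(missing_modules(REQUIRED_MODULES))
```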
Solution
  • Create a Dockerfile with the content below.
FROM python:3.12
# Replace the placeholder address with your own contact details
LABEL maintainer="your.name@example.org"

RUN pip install ipython jupyterlab numpy scipy pandas scikit-learn

# Create a custom user under which the application runs
RUN useradd -ms /bin/bash jupyter

# Use this user by default for all subsequent operations
USER jupyter
# Default to start the container in the home directory of the jupyter user
WORKDIR /home/jupyter

# Document that the application listens on port 8888.
# EXPOSE does not publish the port; that is done with -p when running the image.
EXPOSE 8888

ENTRYPOINT ["jupyter", "lab", "--ip=0.0.0.0"]
  • Build the image.
podman build -t my-datascience-image .
  • Run the image and publish port 8888 to the host.
podman run -p 8888:8888 -it --rm my-datascience-image

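This yields output similar to the following (details may vary). To open JupyterLab, copy the URL starting with http://127.0.0.1:8888 from the output into your browser.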

Output
[I 2024-06-06 09:25:31.939 ServerApp] jupyter_lsp | extension was successfully linked.
[I 2024-06-06 09:25:31.943 ServerApp] jupyter_server_terminals | extension was successfully linked.
[I 2024-06-06 09:25:31.947 ServerApp] jupyterlab | extension was successfully linked.
[I 2024-06-06 09:25:31.948 ServerApp] Writing Jupyter server cookie secret to /home/jupyter/.local/share/jupyter/runtime/jupyter_cookie_secret
[I 2024-06-06 09:25:32.219 ServerApp] notebook_shim | extension was successfully linked.
[I 2024-06-06 09:25:32.233 ServerApp] notebook_shim | extension was successfully loaded.
[I 2024-06-06 09:25:32.235 ServerApp] jupyter_lsp | extension was successfully loaded.
[I 2024-06-06 09:25:32.237 ServerApp] jupyter_server_terminals | extension was successfully loaded.
[I 2024-06-06 09:25:32.238 LabApp] JupyterLab extension loaded from /usr/local/lib/python3.12/site-packages/jupyterlab
[I 2024-06-06 09:25:32.238 LabApp] JupyterLab application directory is /usr/local/share/jupyter/lab
[I 2024-06-06 09:25:32.238 LabApp] Extension Manager is 'pypi'.
[I 2024-06-06 09:25:32.266 ServerApp] jupyterlab | extension was successfully loaded.
[I 2024-06-06 09:25:32.267 ServerApp] Serving notebooks from local directory: /home/jupyter
[I 2024-06-06 09:25:32.267 ServerApp] Jupyter Server 2.14.1 is running at:
[I 2024-06-06 09:25:32.267 ServerApp] http://7d573b7ed567:8888/lab?token=cce8ffcca09d774263fe1979b1f9c38527d5fd94a33b230c
[I 2024-06-06 09:25:32.267 ServerApp]     http://127.0.0.1:8888/lab?token=cce8ffcca09d774263fe1979b1f9c38527d5fd94a33b230c
[I 2024-06-06 09:25:32.267 ServerApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[W 2024-06-06 09:25:32.275 ServerApp] No web browser found: Error('could not locate runnable browser').
[C 2024-06-06 09:25:32.275 ServerApp] 

    To access the server, open this file in a browser:
        file:///home/jupyter/.local/share/jupyter/runtime/jpserver-1-open.html
    Or copy and paste one of these URLs:
        http://7d573b7ed567:8888/lab?token=cce8ffcca09d774263fe1979b1f9c38527d5fd94a33b230c
        http://127.0.0.1:8888/lab?token=cce8ffcca09d774263fe1979b1f9c38527d5fd94a33b230c
[I 2024-06-06 09:25:32.291 ServerApp] Skipped non-installed server(s): bash-language-server, dockerfile-language-server-nodejs, javascript-typescript-langserver, jedi-language-server, julia-language-server, pyright, python-language-server, python-lsp-server, r-languageserver, sql-language-server, texlab, typescript-language-server, unified-language-server, vscode-css-languageserver-bin, vscode-html-languageserver-bin, vscode-json-languageserver-bin, yaml-language-server
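
The log above prints two URLs. The first one uses what appears to be the container's own hostname and is usually not reachable from the host, while the second one works because host port 8888 was published with -p 8888:8888. If a different host port is chosen, e.g. -p 9999:8888, the port in the URL has to be adapted accordingly. The following Python sketch shows that rewrite; the function name host_url is an assumption made for this example.

```python
from urllib.parse import urlsplit, urlunsplit


def host_url(logged_url: str, host_port: int, host: str = "127.0.0.1") -> str:
    """Replace host and port of a logged Jupyter URL, keeping path and token."""
    parts = urlsplit(logged_url)
    return urlunsplit(parts._replace(netloc=f"{host}:{host_port}"))


# URL as printed by the container in the log above
logged = "http://7d573b7ed567:8888/lab?token=cce8ffcca09d774263fe1979b1f9c38527d5fd94a33b230c"

# Container started with: podman run -p 9999:8888 -it --rm my-datascience-image
print(host_url(logged, 9999))
```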