
Creating a Container

When you start working with containers, you will soon notice that existing images do not always satisfy your needs. In these situations, you can create your own custom image.

Images are defined by a text file called Dockerfile. A Dockerfile contains the instructions that tell Docker or Podman how to build a custom image, which then serves as the basis for containers.

Let's build and run our first image

We start by creating a text file called Dockerfile in the folder ~/using-containers-in-science/.

$ cd ~
$ mkdir using-containers-in-science
$ cd using-containers-in-science
$ nano Dockerfile

Now, we add the content below into the Dockerfile:

FROM python:3.11
LABEL maintainer="support@hifis.net"

RUN pip install --upgrade pip
RUN pip install ipython numpy

ENTRYPOINT ["ipython"]

After that, we save the file and leave the editor (in nano: Ctrl+O, then Ctrl+X). Congratulations, it is that simple. The image can be built using the podman build command as shown below.

Note that to build a custom image with this command, you have to be in the folder containing the Dockerfile. The trailing . sets the current folder as the build context, and the Dockerfile found there is implicitly used as the input for the build. The -t option specifies the name of the image to be built.

$ podman build -t my-ipython-image .

This should yield output along the lines of the following. (Details may vary.)

STEP 1/5: FROM python:3.11
Resolved "python" as an alias (/etc/containers/registries.conf.d/000-shortnames.conf)
Trying to pull docker.io/library/python:3.11...
Getting image source signatures
Copying blob 0f546edb7ae0 done
Copying blob 9f13f5a53d11 done
Copying blob ad4c837a72f8 done
Copying blob 012c0b3e998c done
Copying blob e13e76ad6279 done
Copying blob 00046d1e755e done
Copying blob e2f116097408 done
Copying blob a0d3c67a6b6b done
Copying config 22c957c35e done
Writing manifest to image destination
Storing signatures
STEP 2/5: LABEL maintainer="support@hifis.net"
--> ef2f8a305e5
STEP 3/5: RUN pip install --upgrade pip
Requirement already satisfied: pip in /usr/local/lib/python3.11/site-packages (23.2.1)
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
--> bb4d6685387
STEP 4/5: RUN pip install ipython numpy
Collecting ipython
  Obtaining dependency information for ipython from https://files.pythonhosted.org/packages/7f/d0/c3eb7b17b013da59925aed7b2e7c55f8f1c9209249316812fe8cb758b337/ipython-8.15.0-py3-none-any.whl.metadata
  Downloading ipython-8.15.0-py3-none-any.whl.metadata (5.9 kB)
Collecting numpy
  Obtaining dependency information for numpy from https://files.pythonhosted.org/packages/32/6a/65dbc57a89078af9ff8bfcd4c0761a50172d90192eaeb1b6f56e5fbf1c3d/numpy-1.25.2-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata
  Downloading numpy-1.25.2-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (5.6 kB)
Collecting backcall (from ipython)
  Downloading backcall-0.2.0-py2.py3-none-any.whl (11 kB)
Collecting decorator (from ipython)
  Downloading decorator-5.1.1-py3-none-any.whl (9.1 kB)
Collecting jedi>=0.16 (from ipython)
  Obtaining dependency information for jedi>=0.16 from https://files.pythonhosted.org/packages/8e/46/7e3ae3aa2dcfcffc5138c6cef5448523218658411c84a2000bf75c8d3ec1/jedi-0.19.0-py2.py3-none-any.whl.metadata
  Downloading jedi-0.19.0-py2.py3-none-any.whl.metadata (22 kB)
Collecting matplotlib-inline (from ipython)
  Downloading matplotlib_inline-0.1.6-py3-none-any.whl (9.4 kB)
Collecting pickleshare (from ipython)
  Downloading pickleshare-0.7.5-py2.py3-none-any.whl (6.9 kB)
Collecting prompt-toolkit!=3.0.37,<3.1.0,>=3.0.30 (from ipython)
  Obtaining dependency information for prompt-toolkit!=3.0.37,<3.1.0,>=3.0.30 from https://files.pythonhosted.org/packages/a9/b4/ba77c84edf499877317225d7b7bc047a81f7c2eed9628eeb6bab0ac2e6c9/prompt_toolkit-3.0.39-py3-none-any.whl.metadata
  Downloading prompt_toolkit-3.0.39-py3-none-any.whl.metadata (6.4 kB)
Collecting pygments>=2.4.0 (from ipython)
  Obtaining dependency information for pygments>=2.4.0 from https://files.pythonhosted.org/packages/43/88/29adf0b44ba6ac85045e63734ae0997d3c58d8b1a91c914d240828d0d73d/Pygments-2.16.1-py3-none-any.whl.metadata
  Downloading Pygments-2.16.1-py3-none-any.whl.metadata (2.5 kB)
Collecting stack-data (from ipython)
  Downloading stack_data-0.6.2-py3-none-any.whl (24 kB)
Collecting traitlets>=5 (from ipython)
  Downloading traitlets-5.9.0-py3-none-any.whl (117 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 117.4/117.4 kB 4.7 MB/s eta 0:00:00
Collecting pexpect>4.3 (from ipython)
  Downloading pexpect-4.8.0-py2.py3-none-any.whl (59 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 59.0/59.0 kB 19.6 MB/s eta 0:00:00
Collecting parso<0.9.0,>=0.8.3 (from jedi>=0.16->ipython)
  Downloading parso-0.8.3-py2.py3-none-any.whl (100 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100.8/100.8 kB 29.8 MB/s eta 0:00:00
Collecting ptyprocess>=0.5 (from pexpect>4.3->ipython)
  Downloading ptyprocess-0.7.0-py2.py3-none-any.whl (13 kB)
Collecting wcwidth (from prompt-toolkit!=3.0.37,<3.1.0,>=3.0.30->ipython)
  Downloading wcwidth-0.2.6-py2.py3-none-any.whl (29 kB)
Collecting executing>=1.2.0 (from stack-data->ipython)
  Downloading executing-1.2.0-py2.py3-none-any.whl (24 kB)
Collecting asttokens>=2.1.0 (from stack-data->ipython)
  Obtaining dependency information for asttokens>=2.1.0 from https://files.pythonhosted.org/packages/4f/25/adda9979586d9606300415c89ad0e4c5b53d72b92d2747a3c634701a6a02/asttokens-2.4.0-py2.py3-none-any.whl.metadata
  Downloading asttokens-2.4.0-py2.py3-none-any.whl.metadata (4.9 kB)
Collecting pure-eval (from stack-data->ipython)
  Downloading pure_eval-0.2.2-py3-none-any.whl (11 kB)
Collecting six>=1.12.0 (from asttokens>=2.1.0->stack-data->ipython)
  Downloading six-1.16.0-py2.py3-none-any.whl (11 kB)
Downloading ipython-8.15.0-py3-none-any.whl (806 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 806.6/806.6 kB 15.9 MB/s eta 0:00:00
Downloading numpy-1.25.2-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (18.2 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 18.2/18.2 MB 63.8 MB/s eta 0:00:00
Downloading jedi-0.19.0-py2.py3-none-any.whl (1.6 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.6/1.6 MB 94.7 MB/s eta 0:00:00
Downloading prompt_toolkit-3.0.39-py3-none-any.whl (385 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 385.2/385.2 kB 73.3 MB/s eta 0:00:00
Downloading Pygments-2.16.1-py3-none-any.whl (1.2 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.2/1.2 MB 50.4 MB/s eta 0:00:00
Downloading asttokens-2.4.0-py2.py3-none-any.whl (27 kB)
Installing collected packages: wcwidth, pure-eval, ptyprocess, pickleshare, executing, backcall, traitlets, six, pygments, prompt-toolkit, pexpect, parso, numpy, decorator, matplotlib-inline, jedi, asttokens, stack-data, ipython
Successfully installed asttokens-2.4.0 backcall-0.2.0 decorator-5.1.1 executing-1.2.0 ipython-8.15.0 jedi-0.19.0 matplotlib-inline-0.1.6 numpy-1.25.2 parso-0.8.3 pexpect-4.8.0 pickleshare-0.7.5 prompt-toolkit-3.0.39 ptyprocess-0.7.0 pure-eval-0.2.2 pygments-2.16.1 six-1.16.0 stack-data-0.6.2 traitlets-5.9.0 wcwidth-0.2.6
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
--> 452dad7a319
STEP 5/5: ENTRYPOINT ["ipython"]
COMMIT my-ipython-image
--> 87b766243c5
Successfully tagged localhost/my-ipython-image:latest
87b766243c522002808224482e88c56ea641010a45e453c9352833fd716295f2
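
To confirm that the build produced what the Dockerfile describes, the image metadata can be queried. The following is a minimal sketch, assuming the image was tagged my-ipython-image as above and that podman is available on the PATH; the variable name image is only illustrative.

```shell
#!/bin/sh
# Image name as used in the build command above.
image="my-ipython-image"

if command -v podman >/dev/null 2>&1 && podman image exists "$image"; then
    # List the image together with its tag and size.
    podman images "$image"
    # Print the maintainer label set by the LABEL instruction.
    podman image inspect --format '{{ index .Config.Labels "maintainer" }}' "$image"
    # Print the entrypoint set by the ENTRYPOINT instruction.
    podman image inspect --format '{{ .Config.Entrypoint }}' "$image"
else
    echo "podman or the image $image is not available; build the image first."
fi
```

If the label or the entrypoint does not match the Dockerfile, the image was most likely built from a different file or folder.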

Let's try out the newly created image by running it.

$ podman run --rm -it my-ipython-image
Python 3.11.5 (main, Sep  7 2023, 12:36:05) [GCC 12.2.0]
Type 'copyright', 'credits' or 'license' for more information
IPython 8.15.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]:

We end up in an IPython shell that we can use just like an IPython shell installed in the usual manner. Once we exit the shell, the container stops running as well. Let's see how this works by disassembling the Dockerfile.

Disassembling the Dockerfile

The Dockerfile used above contains four different types of instructions:

  • FROM <image>
      • Sets the base image for the instructions below.
      • Each valid Dockerfile must start with a FROM instruction (only comments, parser directives and ARG instructions may precede it).
      • The image can be any valid image, e.g. from public registries. Please note: Choose a trusted base image for your images. We'll cover that topic in more detail in lesson 6 of this course.
  • LABEL <key>=<value> <key>=<value> <key>=<value> ...
      • The LABEL instruction adds metadata to the image.
      • A LABEL is a key-value pair.
      • This is typically used to provide information about e.g. the maintainer of an image.
  • RUN <command>
      • The RUN instruction executes any command on top of the current image. (We will cover this in a minute.)
      • The resulting image will be used as the base for the next step in the Dockerfile.
  • ENTRYPOINT ["executable", "param1", "param2"]
      • An ENTRYPOINT allows you to configure a container that runs as an executable.
      • Command line arguments to podman run <image> will be appended after all elements in the exec form ENTRYPOINT.

Example

$ podman run --rm -it my-ipython-image --version

This gives us the version number of IPython and is equivalent to executing ipython --version locally.

8.15.0
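
How the arguments are appended can be sketched in plain shell. This is only an illustration of the resulting command line, not of how Podman is implemented; the variable names entrypoint and container_command are made up for this example. The last part assumes that podman and the image my-ipython-image are available and uses the --entrypoint option of podman run to start a different executable.

```shell
#!/bin/sh
# Exec form ENTRYPOINT from the Dockerfile above: ["ipython"]
entrypoint="ipython"

# Arguments given after the image name on the podman run command line.
set -- --version

# The process started in the container is the entrypoint followed by the arguments.
container_command="$entrypoint $*"
echo "$container_command"   # prints: ipython --version

# To start a different executable instead, --entrypoint replaces the ENTRYPOINT.
if command -v podman >/dev/null 2>&1 && podman image exists my-ipython-image; then
    # Runs "python --version" inside the container instead of "ipython --version".
    podman run --rm --entrypoint python my-ipython-image --version
fi
```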

Let's build the image again and see what happens.

$ podman build -t my-ipython-image .
STEP 1/5: FROM python:3.11
STEP 2/5: LABEL maintainer="support@hifis.net"
--> Using cache ef2f8a305e52bb4699945fda5343cfb9ddefac6d8c5449d491402cc3d2d68039
--> ef2f8a305e5
STEP 3/5: RUN pip install --upgrade pip
--> Using cache bb4d66853874d4a7bca86be9e78eedb8d49b6838205c81010d87775ef4624193
--> bb4d6685387
STEP 4/5: RUN pip install ipython numpy
--> Using cache 452dad7a3196f8de151536e367f0b3a5f2cd42e46543349d5e829849069c2437
--> 452dad7a319
STEP 5/5: ENTRYPOINT ["ipython"]
--> Using cache 87b766243c522002808224482e88c56ea641010a45e453c9352833fd716295f2
COMMIT my-ipython-image
--> 87b766243c5
Successfully tagged localhost/my-ipython-image:latest
87b766243c522002808224482e88c56ea641010a45e453c9352833fd716295f2

This time, the output is much shorter than in our initial run of the podman build command. Each step reports that it used the cache. As each instruction is processed, Podman looks in its cache for an existing image that has already been created in the same manner. If there is such an image, Podman re-uses it instead of creating a duplicate. If you do not want Podman to use its cache, provide the --no-cache=true option to the podman build command.
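
A forced rebuild can be sketched as follows. The sketch assumes that it is run in the folder containing the Dockerfile and that podman is installed; otherwise it only prints the command. The variable build_opts is an illustrative name, and --no-cache is the short form of --no-cache=true.

```shell
#!/bin/sh
# Set build_opts to an empty string to allow cache re-use,
# or to --no-cache to execute every step again.
build_opts="--no-cache"

if command -v podman >/dev/null 2>&1 && [ -f Dockerfile ]; then
    # $build_opts is intentionally unquoted so that an empty value adds no argument.
    podman build $build_opts -t my-ipython-image .
else
    echo "Would run: podman build $build_opts -t my-ipython-image ."
fi
```

With --no-cache, the output should again contain the full pip installation log instead of the "Using cache" lines.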

Task: Create and Run a data science image

Task Description

Your goal in this exercise is to create your own custom data science image as follows:

  1. Build your image on top of the latest Python image of release series 3.11.
  2. Mark yourself as the maintainer of the image.
  3. Install numpy, scipy, pandas, scikit-learn and jupyterlab using pip install.
  4. Create a custom user using the command useradd -ms /bin/bash jupyter.
  5. Tell the image to automatically start as the jupyter user and to use the working directory /home/jupyter.
  6. Make sure the image starts with the command jupyter lab --ip=0.0.0.0 by default.

Hint: Use the instructions USER and WORKDIR for step 5.

Once you have built the image, make sure to test it by running it and opening JupyterLab in your browser. You should now be able to execute any command, e.g.

import numpy as np
np.__config__.show()

Solution

  • Create a Dockerfile with the content below.

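FROM python:3.11
# Mark yourself as the maintainer (replace the placeholder address with your own)
LABEL maintainer="your.name@example.org"

# Install the data science packages requested in the task (ipython is optional)
RUN pip install ipython jupyterlab numpy scipy pandas scikit-learn

# Create a custom user under which the application runs
RUN useradd -ms /bin/bash jupyter

# Use this user by default for all subsequent operations
USER jupyter
# Default to start the container in the home directory of the jupyter user
WORKDIR /home/jupyter

# Document that the application listens on port 8888.
# EXPOSE alone does not publish the port; that is done with -p at run time.
EXPOSE 8888

# Start JupyterLab by default, listening on all interfaces of the container
ENTRYPOINT ["jupyter", "lab", "--ip=0.0.0.0"]
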
  • Build the image.

    $ podman build -t my-datascience-image .
    
  • Run the image and bind port 8888.

    $ podman run -p 8888:8888 -it --rm my-datascience-image
    

This yields output similar to the following. (Details may vary.)

Console Output
[I 2023-09-08 07:02:40.713 ServerApp] Package jupyterlab took 0.0000s to import
[I 2023-09-08 07:02:40.726 ServerApp] Package jupyter_lsp took 0.0132s to import
[W 2023-09-08 07:02:40.726 ServerApp] A `_jupyter_server_extension_points` function was not found in jupyter_lsp. Instead, a `_jupyter_server_extension_paths` function was found and will be used for now. This function name will be deprecated in future releases of Jupyter Server.
[I 2023-09-08 07:02:40.734 ServerApp] Package jupyter_server_terminals took 0.0068s to import
[I 2023-09-08 07:02:40.734 ServerApp] Package notebook_shim took 0.0000s to import
[W 2023-09-08 07:02:40.734 ServerApp] A `_jupyter_server_extension_points` function was not found in notebook_shim. Instead, a `_jupyter_server_extension_paths` function was found and will be used for now. This function name will be deprecated in future releases of Jupyter Server.
[I 2023-09-08 07:02:40.735 ServerApp] jupyter_lsp | extension was successfully linked.
[I 2023-09-08 07:02:40.738 ServerApp] jupyter_server_terminals | extension was successfully linked.
[I 2023-09-08 07:02:40.741 ServerApp] jupyterlab | extension was successfully linked.
[I 2023-09-08 07:02:40.742 ServerApp] Writing Jupyter server cookie secret to /home/jupyter/.local/share/jupyter/runtime/jupyter_cookie_secret
[I 2023-09-08 07:02:41.023 ServerApp] notebook_shim | extension was successfully linked.
[I 2023-09-08 07:02:41.040 ServerApp] notebook_shim | extension was successfully loaded.
[I 2023-09-08 07:02:41.042 ServerApp] jupyter_lsp | extension was successfully loaded.
[I 2023-09-08 07:02:41.043 ServerApp] jupyter_server_terminals | extension was successfully loaded.
[I 2023-09-08 07:02:41.044 LabApp] JupyterLab extension loaded from /usr/local/lib/python3.11/site-packages/jupyterlab
[I 2023-09-08 07:02:41.044 LabApp] JupyterLab application directory is /usr/local/share/jupyter/lab
[I 2023-09-08 07:02:41.044 LabApp] Extension Manager is 'pypi'.
[I 2023-09-08 07:02:41.046 ServerApp] jupyterlab | extension was successfully loaded.
[I 2023-09-08 07:02:41.046 ServerApp] Serving notebooks from local directory: /home/jupyter
[I 2023-09-08 07:02:41.046 ServerApp] Jupyter Server 2.7.3 is running at:
[I 2023-09-08 07:02:41.046 ServerApp] http://baaf2ac171c5:8888/lab?token=...
[I 2023-09-08 07:02:41.046 ServerApp]     http://127.0.0.1:8888/lab?token=...
[I 2023-09-08 07:02:41.047 ServerApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[W 2023-09-08 07:02:41.054 ServerApp] No web browser found: Error('could not locate runnable browser').
[C 2023-09-08 07:02:41.054 ServerApp]

To access the server, open this file in a browser:
    file:///home/jupyter/.local/share/jupyter/runtime/jpserver-1-open.html
Or copy and paste one of these URLs:
    http://baaf2ac171c5:8888/lab?token=...
    http://127.0.0.1:8888/lab?token=...

[I 2023-09-08 07:02:41.067 ServerApp] Skipped non-installed server(s): bash-language-server, Dockerfile-language-server-nodejs, javascript-typescript-langserver, jedi-language-server, julia-language-server, pyright, python-language-server, python-lsp-server, r-languageserver, sql-language-server, texlab, typescript-language-server, unified-language-server, vscode-css-languageserver-bin, vscode-html-languageserver-bin, vscode-json-languageserver-bin, yaml-language-server
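
Apart from opening JupyterLab in the browser, the effect of the USER and WORKDIR instructions can be checked directly by replacing the entrypoint for a single run. This is a minimal sketch, assuming the image was built as my-datascience-image and that podman is installed; whoami and pwd are expected to be present because the base image python:3.11 is Debian-based.

```shell
#!/bin/sh
# Image name as used in the build step of the solution.
image="my-datascience-image"

if command -v podman >/dev/null 2>&1 && podman image exists "$image"; then
    # Should print the user configured with USER (jupyter).
    podman run --rm --entrypoint whoami "$image"
    # Should print the directory configured with WORKDIR (/home/jupyter).
    podman run --rm --entrypoint pwd "$image"
else
    echo "podman or the image $image is not available; build the image first."
fi
```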