Extending the Pipeline¶
Gems and Jewels to Collect¶
At the end of this episode you will have a CI pipeline that covers a few common CI use cases which you can also apply to the CI pipelines of your own projects. Additional GitLab CI keywords will be explained, such as:
- conditional execution of CI jobs with rules,
- creating, storing and accessing artifacts with artifacts,
- reusing artifacts created in previous CI jobs with dependencies.
In this episode you will extend the CI pipeline we developed in the last episode by the following CI use cases, which we introduced previously:
- Checking the license compliance,
- checking the code style of the project,
- testing against multiple Python versions.
We also dive deeper into the keyword stages, introduce the new keywords
rules, artifacts and dependencies, and list a few selected
predefined GitLab CI variables.
Additional CI Use Cases to Extend the CI Pipeline¶
Before we turn to optimizing the CI pipeline, we add a few very common CI use cases that are still missing from it.
Checking the License Compliance¶
We will develop a CI job that checks that all files contain license and
copyright information and that all license texts of the licenses used are
contained in the project.
First, we need to tell GitLab CI in which stage to run the CI job.
We use a stage called lint, which you need to declare at the beginning of your YAML file:
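A minimal sketch of such a declaration, assuming that lint is the only stage needed at this point (any stage already declared in the previous episode would stay in the list as well):

```yaml
# Stages are declared once, at the top of .gitlab-ci.yml.
# They are executed in the order in which they are listed here.
stages:
  - lint
```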
In the context of checking the license compliance, the command to run is that of the reuse tool.
Since we are working with Python's virtual environments, we need to prefix
the command with
poetry run so that reuse is executed in that virtual environment.
Now we are ready to write down the corresponding CI job.
In our final
.gitlab-ci.yml file the complete job may look like this:
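The following is a sketch of such a job, not necessarily the exact job of the example project. The job name license-compliance is hypothetical, the image and the before_script are assumed to mirror the other jobs of this episode, and reuse lint is assumed to be the check command, with the reuse tool installed as a project dependency by poetry install:

```yaml
# Hypothetical job name, chosen for illustration.
license-compliance:
  image: python:3.9        # assumption: same base image as the other jobs
  stage: lint
  before_script:
    - pip install --upgrade pip
    - pip install poetry
    - poetry install       # assumed to install reuse into the virtual environment
  script:
    # "reuse lint" checks license and copyright information of all files
    - poetry run reuse lint
```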
Checking the Code Style of the Project¶
Code style checking (or linting) should also always be part of your coding
projects and can be done automatically in CI pipelines.
Black and isort are recommended tools to do that in the Python universe.
The respective commands are then
black --check --diff . and
isort --check --diff . (the trailing dot denotes the current directory).
The first approach would be to copy and paste the previous lint job and
exchange the tasks in the script section.
Our second lint job can then be added to the CI pipeline:
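A sketch of this second lint job could look as follows. The job name codestyle is hypothetical, and Black and isort are assumed to be installed as project dependencies by poetry install:

```yaml
# Hypothetical job name, chosen for illustration.
codestyle:
  image: python:3.9        # assumption: same base image as the other jobs
  stage: lint
  before_script:
    - pip install --upgrade pip
    - pip install poetry
    - poetry install
  script:
    # Both tools only report deviations; they do not modify any files.
    - poetry run black --check --diff .
    - poetry run isort --check --diff .
```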
As you can see, our copy-and-paste approach introduces quite a bit of duplication. We will adapt the CI pipeline and reduce this duplication in later episodes.
Testing Against Multiple Python Versions¶
Testing is the most important task that needs to be automated in CI pipelines.
Your test suite ensures that you do not break anything if you push your
changes to the repository.
This safety net is essential for coding projects to reduce the risk of having
defects in your code.
pytest is a unit-test framework for Python projects.
You may execute your test suite with the command
pytest tests/, again prefixed with poetry run.
On top of that, you can create several CI jobs, each testing your application
with a different version of the Python interpreter.
But first, we need an additional stage called
test to run the test suite:
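Assuming the lint stage from above is the only stage declared so far, the extended declaration could be sketched like this:

```yaml
# The new test stage is listed after lint,
# so its jobs start only after all lint jobs have succeeded.
stages:
  - lint
  - test
```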
Now you can duplicate a previous job, assign the copies to stage
test and adapt the image keyword accordingly.
The full jobs look like this in our example:
    test:python:3.8:
      image: python:3.8
      stage: test
      before_script:
        - pip install --upgrade pip
        - pip install poetry
        - poetry install
      script:
        - poetry run pytest tests/

    test:python:3.9:
      image: python:3.9
      stage: test
      before_script:
        - pip install --upgrade pip
        - pip install poetry
        - poetry install
      script:
        - poetry run pytest tests/

    test:python:3.10:
      image: python:3.10
      stage: test
      before_script:
        - pip install --upgrade pip
        - pip install poetry
        - poetry install
      script:
        - poetry run pytest tests/
Again, this introduces quite a bit of repetition, which we will tackle in follow-up episodes.
Additional Concepts and GitLab CI Keywords¶
In this section we would like to discuss more concepts and keywords that you may want to use in your projects.
More About Stages and Jobs¶
Now that we have created our first complete CI pipeline covering all of our CI
use cases, let us inspect our CI pipeline and the three stages and six CI jobs it contains.
We observed that the stages are executed in sequence, i.e. jobs of a later
stage run only if the previous stage completed successfully.
Those testing jobs in the
test stage run in parallel, though.
This is possible because all jobs in stage
test are independent of each other.
We recommend running jobs in parallel in a stage if the independence criterion
holds true, because parallelization speeds up the pipeline significantly.
In later episodes we will learn how to change this default behaviour with the
needs keyword and change the running order of CI jobs.
Also, we will further speed up the CI pipeline with some additional concepts.
Predefined Variables in GitLab CI¶
Predefined variables are variables that GitLab CI sets automatically for each pipeline and job, for example the branch name or the commit revision. You can use them in your CI pipelines without declaring them yourself.
Predefined Variables Reference¶
This is a compilation of a few selected CI variables:
| Variable | Description |
|---|---|
| CI_COMMIT_BRANCH | The commit branch name. Available in branch pipelines, including pipelines for the default branch. Not available in merge request pipelines or tag pipelines. |
| CI_COMMIT_REF_NAME | The branch or tag name for which the project is built. |
| CI_COMMIT_SHA | The commit revision the project is built for. |
| CI_COMMIT_TAG | The commit tag name. Available only in pipelines for tags. |
| CI_DEFAULT_BRANCH | The name of the project's default branch. |
| CI_DEPLOY_PASSWORD | The authentication password of the GitLab Deploy Token, if the project has one. |
| CI_DEPLOY_USER | The authentication username of the GitLab Deploy Token, if the project has one. |
| CI_JOB_TOKEN | A token to authenticate with certain API endpoints. The token is valid as long as the job is running. |
| CI_PROJECT_DIR | The full path the repository is cloned to, and where the job runs from. |
| CI_REGISTRY_IMAGE | The address of the project's Container Registry. Only available if the Container Registry is enabled for the project. |
| CI_REGISTRY_PASSWORD | The password to push containers to the project's GitLab Container Registry. Only available if the Container Registry is enabled for the project. This password value is the same as the CI_JOB_TOKEN and is valid only as long as the job is running. Use the CI_DEPLOY_PASSWORD for long-lived access to the registry. |
| CI_REGISTRY_USER | The username to push containers to the project's GitLab Container Registry. Only available if the Container Registry is enabled for the project. |
| CI_REGISTRY | The address of the GitLab Container Registry. Only available if the Container Registry is enabled for the project. This variable includes a :port value if one is specified in the registry configuration. |
| CI_REPOSITORY_URL | The URL to clone the Git repository. |
Predefined Variables for Merge Request Pipelines¶
In addition, this is a compilation of a few selected CI variables that are present in merge request pipelines only:
| Variable | Description |
|---|---|
| CI_MERGE_REQUEST_SOURCE_BRANCH_NAME | The source branch name of the merge request. |
| CI_MERGE_REQUEST_SOURCE_BRANCH_SHA | The HEAD SHA of the source branch of the merge request. The variable is empty in merge request pipelines. The SHA is present only in merged results pipelines. |
| CI_MERGE_REQUEST_TARGET_BRANCH_NAME | The target branch name of the merge request. |
| CI_MERGE_REQUEST_TARGET_BRANCH_SHA | The HEAD SHA of the target branch of the merge request. The variable is empty in merge request pipelines. The SHA is present only in merged results pipelines. |
In order to show how these predefined variables can be used inside your CI pipeline, we give this example that just outputs the values of two predefined CI variables that we need in the next section of this episode:
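A minimal sketch of such a job, assuming the two variables in question are CI_COMMIT_BRANCH and CI_DEFAULT_BRANCH. The job name print-variables is hypothetical, and the job is assumed to run in the test stage declared earlier:

```yaml
# Hypothetical job name, for illustration only.
print-variables:
  stage: test   # assumption: the test stage exists in the stages list
  script:
    # Predefined variables are available as environment variables in the job.
    - echo "CI_COMMIT_BRANCH is ${CI_COMMIT_BRANCH}"
    - echo "CI_DEFAULT_BRANCH is ${CI_DEFAULT_BRANCH}"
```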
The values then appear in the CI job log of this job.
Conditional Execution of CI Jobs With rules
It might be the case that you do not need to execute a CI job in all pipeline
runs, but only in pipelines that fulfil certain conditions.
The rules keyword is useful
when it comes to executing CI jobs conditionally.
The keyword is quite powerful but in our opinion also a bit harder to
understand.
Here we introduce the most common rule, i.e. execute a job only if the pipeline
has been triggered by a merge into branch main.
Taking the run job of our pipeline as an example, this looks as follows:
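A sketch of the run job with such a rule, assuming the same image and before_script as in the other jobs of this episode:

```yaml
run:
  image: python:3.9
  stage: run
  before_script:
    - pip install --upgrade pip
    - pip install poetry
    - poetry install
  script:
    - poetry run python -m astronaut_analysis
  rules:
    # The job is added to the pipeline only if the pipeline runs
    # for the default branch of the project.
    - if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH
```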
As a consequence, the
run job, which creates a new set of plots, is only
executed if the branch we commit into during a merge
is the default branch, i.e. branch
main in our case.
$CI_COMMIT_BRANCH holds the name of the branch we commit into during a merge.
$CI_DEFAULT_BRANCH holds the default branch name, i.e.
main in this project.
Running this job only conditionally is reasonable because we only want
to generate plots originating from the default branch main.
Create, Store and Access Artifacts With artifacts
You might have asked yourself whether we could access artifacts generated
during a CI job.
Fortunately, this is possible with the artifacts keyword.
We need to specify the artifacts retained from a CI job as a list of files
and directories like this:
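A sketch for the run job, assuming that the plots are written to a directory called results/. Only the keys relevant for artifacts are shown; image, before_script and rules are assumed to stay as before:

```yaml
run:
  stage: run
  script:
    - poetry run python -m astronaut_analysis
  artifacts:
    # Files and directories listed here are kept after the job has finished.
    paths:
      - results/
```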
After the job has completed, the plots are stored as job artifacts for a period of 30 days. The so-called latest artifacts are not deleted until newer artifacts arrive. You can access the artifacts and, for example, download them by navigating to the CI job log of your CI job and clicking download in the job artifacts section of the right sidebar.
Reuse Artifacts Created in Previous CI Jobs With dependencies
What if we have generated some artifacts in a previous CI job? Do we need to
re-generate them in a later CI job that needs them?
No, it is possible to pass artifacts from one job on to a later job.
The respective keyword is the dependencies keyword.
You can tell the CI pipeline to fetch the job artifacts of a previous CI job:
    stages:
      - run
      - deploy

    run:
      image: python:3.9
      stage: run
      before_script:
        - pip install --upgrade pip
        - pip install poetry
        - poetry install
      script:
        - poetry run python -m astronaut_analysis
      rules:
        - if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH
      artifacts:
        paths:
          - results/

    pages:
      stage: deploy
      script:
        - mkdir public/
        - cp results/age_histogram.png public/age_histogram.png
      rules:
        - if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH
      artifacts:
        paths:
          - public/
      dependencies:
        - run
The pages CI job
running on changes on branch
main needs some explanation.
In GitLab you can host internal static web pages and the files they contain.
There is a special CI job called
pages that deploys your static web page to GitLab Pages.
During a pipeline run you need to copy your generated page into the public/
folder and name it in the artifacts section of the CI job pages.
The pages job will then take all contained files and host them as a static
web page, if this feature is activated in the settings of your GitLab project.
To activate GitLab Pages you can navigate to
Settings > General > Visibility, Project Features, Permissions
and enable the Pages feature.
After the first pipeline run you can find the URL of your static web page
in the settings of the project:
Settings > Pages.
All logged-in GitLab users can then access these Pages.
It is also possible to make these Pages private and accessible by project
members only.
Exercise 1: Create a Complete CI Pipeline for the Exercise Project¶
By now we have introduced some keywords and concepts that are useful in covering all CI use cases discussed so far. In the following exercise you should try to develop a CI pipeline for the exercise project which includes all CI use cases from the previous exercise. These were:
- Checking the license compliance.
- Linting the source code.
- Building the executable.
- Running the existing test cases.
- Running the executable.
The pipeline might contain one job for each of these use cases.
To get you started, these are the relevant commands for the script sections
of the CI jobs:
- License compliance can be checked with the aforementioned reuse tool.
- Linting can be done with a tool called cpplint:
  cpplint --recursive src/ tests/
- The build of the application is done with
  cmake -S . -B build and
  cmake --build build
- The test suite can be run with
  cd build && ctest
- Finally, we want to run the application on the command line without any arguments.
Take Home Messages
In this episode we explored some additional common CI use cases like linting
and testing, introduced new GitLab CI keywords like
rules, artifacts and dependencies, and listed a few predefined GitLab CI variables.
Next, we will take the CI pipeline we wrote so far and optimize and polish it a bit so that it is easier to read, much easier to maintain and runs more efficiently and faster.