Effortless Python CI: Jumpstart Your Project With This Template

Improve the quality of your code with this simple yet effective CI template!
Tags: python, continuous integration, github template
Author: Axel Mendoza
Published: Feb 21, 2023

Starting a new Python project? 👀
Stop right here!
Try this Continuous Integration (CI) template to catapult your code quality to a stellar level 🚀

Overview

Want to make sure your tests are passing?
Want to minimize the hassle of working as a team on a large-scale project?
This article is for you!

It will give you the basic knowledge you need to get started with CI in Python and to understand the concepts behind it.
Have something cooking on the stove and no time to read?
Or are you just tired of re-creating your CI each time you start a project?

Good news! I’ve built a GitHub template that will make your life so much easier.
Follow the instructions and you’re good to go in a few minutes.

By the way, if you’re interested in streamlining your machine learning workflow, feel free to explore my MLflow GitHub Template.

Continuous Integration (CI)

Working in a team can be cumbersome sometimes.
It can also be very difficult to make sure all members consistently apply the many best practices that exist in Python.
Luckily, a Continuous Integration (CI) pipeline can enforce code standards.

As GitLab explains it very well:
“Continuous integration is the practice of integrating all your code changes into the main branch of a shared source code repository early and often, automatically testing each change when you commit or merge them, and automatically kicking off a build. With continuous integration, errors and security issues can be identified and fixed more easily, and much earlier in the development process.”

In other words, a Continuous Integration (CI) pipeline is triggered by an event (such as a push or a merge) and runs a series of checks to ensure that the code meets certain quality criteria. If the checks succeed, the contributor is allowed to push/merge. Otherwise, their contribution is rejected until the CI passes.

Git Branching Strategy

For this article, we will assume that our team is using the Feature Branching strategy:

  • Create a remote branch off the main branch to implement a new feature.
  • Push contributions to the remote branch.
  • Once the feature is finished, create a Pull Request (PR) to merge into main.
  • Merge main into your branch, fix the conflicts and push to the remote branch.
  • The CI will trigger and the PR can be merged only if the tests pass.

The repository template is designed for this workflow. Feel free to modify it as you see fit.

For more information on Git Branching Strategies read this article.

Follow along

If you already know all the CI tools that we are going to use, you can just create a new repository from the template and follow the instructions on the GitHub page.

Otherwise, follow along to understand all the steps that we will include in our CI. Click on the green button Use this template > Create a new repository and follow the instructions to create a repository.

Clone the project, cd to the root of the repository and follow the installation section on GitHub.

Python Code Quality Tools

In this section, we will review the pillars of Continuous Integration that will ease the merging process and enhance the quality of our code.

Testing

It is terrible to realize that after adding a shiny feature to your product, you broke something that was working fine in the previous version. 😭
That is why automated testing is a mandatory practice to avoid regressions during development.

PyTest is a simple yet effective framework for building a test suite for your code.
Let’s get started by activating your Python environment and installing PyTest:

workon myenv
pip install pytest

At the root of the repository, the mymodule directory contains the example module. When using this template, you will eventually replace this module with your own code.
The mymodule/helloword.py file contains a hello world function and the mymodule/tests/test_helloworld.py file implements a simple test for this function.

When you run pytest, it recursively searches for test files (named test_*.py or *_test.py by default), collects the functions whose names start with test_, executes each of them and prints a report of the tests that passed and failed. Writing your own test is simple: write a new function whose name starts with test_ and use the assert statement to check your code.
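
To make this concrete, here is a minimal sketch of what a function and its test could look like. The names hello_world and test_hello_world are illustrative assumptions and are not the actual contents of the template's files.

```python
# Illustrative example: in a real project the function would live in the
# package and the test in a separate test_*.py file.


def hello_world(name: str = "world") -> str:
    """Return a greeting for the given name."""
    return f"Hello, {name}!"


def test_hello_world() -> None:
    """pytest collects this function because its name starts with test_."""
    assert hello_world() == "Hello, world!"
    assert hello_world("CI") == "Hello, CI!"
```

Saved in a file such as test_example.py, this test is collected and executed when you run pytest from the same directory. A failing assert is reported as a failed test along with the values that were compared.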

Linting

Linting is checking your source code for programmatic or stylistic errors.
It is done using a lint tool, also known as a linter.
I’ve decided to use PyLint as it enforces more restrictions than Flake8.

To run PyLint, execute:

pylint mymodule

As you can see, there are no PyLint errors in the repository template.
While programming new features for your project, you can set up your IDE to spot linting errors on the fly so that you can fix them before running the CI.
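
As an illustration, the hypothetical snippet below is valid Python that a linter would still complain about. The comments name the messages PyLint typically reports with its default configuration. The exact output depends on your PyLint version and settings, so treat this as a sketch rather than a reference.

```python
"""Toy module illustrating two common PyLint messages."""

import os  # typically reported as unused-import (W0611): `os` is never used


def total_price(unit_price, quantity):  # typically missing-function-docstring (C0116)
    return unit_price * quantity


def total_price_clean(unit_price: float, quantity: int) -> float:
    """Return the total price for `quantity` items."""
    return unit_price * quantity
```

Both functions behave identically at runtime. The linter only cares about how the code is written, which is exactly why it is useful to run it automatically.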

Formatting

Everyone has experienced the pain of not having coding style rules in a collaborative project.
It gets ugly 😣

To make sure that every contributor is following the same coding style, it is good practice to use a code formatter. A code formatter will format the code based on a set of rules defined by the coding style it follows.

We will be using Black over other formatters. I found it pretty good at enforcing a consistent coding style while still leaving you enough freedom.

To format your code, run:

black mymodule

In the CI, we will use the --check --verbose options to find out whether Black would make any changes to our code. If it would, it means that the contributor didn’t run Black before pushing their changes.
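
To give an idea of what the formatter changes, here is a small before/after sketch. The greet function is a made-up example, and the "after" version is what Black's default style is expected to produce for it.

```python
# Before Black (valid Python, inconsistent style):
#
#   def greet( name,greeting = 'Hello' ):
#       return greeting+', '+name
#
# After running `black` with its default settings:
def greet(name, greeting="Hello"):
    return greeting + ", " + name
```

The behavior of the function is unchanged: only spacing and quote style differ. With --check, Black leaves the files untouched and exits with a non-zero status when a file would be reformatted, which is what lets a CI job fail on unformatted code.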

Security and Vulnerability

When developing a product, it is important to minimize security issues, as they make your application vulnerable to attackers.

Bandit is a Python tool allowing you to scan your code for any potential security vulnerability. To test it, run:

bandit -r .

Bandit will display a security report to guide you in fixing the potential issues.
Same as before, if Bandit spots security issues in the contributor’s code, the PR will be blocked until Bandit reports no issues.
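
For a feel of the kind of issue Bandit looks for, consider the sketch below. The two function names are hypothetical. Bandit typically flags the call to eval (test ID B307) because it executes arbitrary code, while the ast.literal_eval variant only accepts Python literals.

```python
import ast
from typing import Any


def parse_value_unsafe(text: str) -> Any:
    """Parse a value with eval(): Bandit typically reports this call (B307)."""
    return eval(text)  # executes any expression, including malicious ones


def parse_value(text: str) -> Any:
    """Parse a value with ast.literal_eval(), which only accepts literals."""
    return ast.literal_eval(text)
```

Calling parse_value("[1, 2, 3]") returns a list, whereas passing an expression such as "__import__('os')" raises a ValueError instead of running it.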

Type Hints Checking

Type hinting is an optional feature introduced in Python 3.5 that adds typing information to the code. Let’s consider the following example:

# Without type hints
def print_str(text, repeat=1):
    print(text * repeat)

# With type hints
def print_str(text: str, repeat: int = 1) -> None:
    print(text * repeat)

These functions produce the same result, but the second provides extra information about the arguments as well as the return type. Using type hints has become standard practice for producing quality Python code as it makes the code easier to understand.

We can go even further by using a tool such as Mypy.
Mypy is an optional static type checker for Python. It checks that the type hints are consistent with their usage throughout the code. Using such a tool allows you to detect potential bugs or inconsistencies during development.

To use Mypy, run:

mypy mymodule/

Same logic here: it will provide a report listing the potential typing errors.
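
Here is a small sketch of the kind of inconsistency Mypy catches. The repeat_text function is a hypothetical example. The second call passes a string where an int is expected: Mypy reports an incompatible argument type without running the code, whereas plain Python only fails once the line is executed.

```python
def repeat_text(text: str, repeat: int = 1) -> str:
    """Return `text` repeated `repeat` times."""
    return text * repeat


print(repeat_text("ab", 2))  # prints "abab"

# Mypy flags the call below before the code ever runs, because "2" is a str
# where an int is expected. Without the check, it only fails at runtime.
try:
    repeat_text("ab", "2")  # wrong type on purpose
except TypeError as error:
    print(f"Runtime failure: {error}")
```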

Running the tools in VSCode

You are probably thinking:
“Will I have to run all these commands each time I need to commit? 🤕”
Don’t worry! 😎 If you are using VSCode, the integration of these tools is seamless.

VSCode will automatically recognize the .vscode/settings.json at the root of the template.
It will run all the tools out of the box upon saving a file.
The options to enable them are:

"python.linting.pylintEnabled": true,
"python.linting.enabled": true,
"python.formatting.provider": "black",
"python.languageServer": "Pylance",
"python.linting.mypyEnabled": true,
"python.linting.mypyArgs": [
    "--disallow-untyped-defs",
    "--ignore-missing-imports",
    "--no-implicit-optional"
]

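Mypy’s default behavior does not always fit a project, so we pass extra options: --ignore-missing-imports silences errors about imported packages that ship no type information, while --disallow-untyped-defs and --no-implicit-optional tighten the checks on your own code. More information about these options here.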
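
To illustrate what two of these Mypy options change in practice, here is a short sketch with made-up functions. The commented-out definitions are the forms Mypy rejects when the corresponding option is enabled, and the definitions below them are accepted.

```python
from typing import Optional

# Rejected with --disallow-untyped-defs: the function has no annotations.
#
#   def scale(value, factor=2.0):
#       return value * factor


def scale(value: float, factor: float = 2.0) -> float:
    """Fully annotated version, accepted by Mypy."""
    return value * factor


# Rejected with --no-implicit-optional: the default is None but the
# annotation says `str`.
#
#   def greet(name: str = None) -> str: ...


def greet(name: Optional[str] = None) -> str:
    """The None default is declared explicitly with Optional."""
    return f"Hello, {name or 'world'}!"
```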

GitHub Actions CI

So far so good: we now understand each tool we use and are able to run them in our IDE.
What about the CI now? 🫡

The template uses GitHub Actions, a CI/CD tool that makes it easy to automate the testing process. The CI’s configuration can be found here: .github/workflows/main-ci.yml.
If you need to modify the behavior of the CI, review the GitHub Actions Documentation.
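
If you want to reproduce locally what such a CI job does, the sketch below runs the commands from the previous sections one after another and reports the ones that fail. This is not the template's workflow file: the script, the CHECKS list and the run_checks and main names are assumptions made for illustration, and the paths assume the mymodule package.

```python
"""Run the checks from this article one after another, as a CI job would."""
import subprocess
import sys
from typing import List, Sequence

# Commands taken from the sections above; adjust the paths to your package.
CHECKS: List[List[str]] = [
    ["pytest"],
    ["pylint", "mymodule"],
    ["black", "--check", "--verbose", "mymodule"],
    ["bandit", "-r", "."],
    ["mypy", "mymodule/"],
]


def run_checks(commands: Sequence[Sequence[str]]) -> List[str]:
    """Run every command and return those that exited with a non-zero status."""
    failed = []
    for command in commands:
        print(f"$ {' '.join(command)}")
        try:
            result = subprocess.run(list(command), check=False)
            success = result.returncode == 0
        except FileNotFoundError:
            # The tool is not installed in the current environment.
            success = False
        if not success:
            failed.append(" ".join(command))
    return failed


def main() -> int:
    """Return 0 when every check passes and 1 otherwise."""
    failures = run_checks(CHECKS)
    if failures:
        print("Failed checks: " + ", ".join(failures))
    return 1 if failures else 0
```

Ending the script with sys.exit(main()) gives it the same contract as a CI job: a non-zero exit status means at least one check failed. Every check runs even after a failure, so a single pass lists everything that needs fixing.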

Project Configuration

Unfortunately, GitHub’s repository templating feature does not propagate branch policies to the newly created repository. Go to your project’s main page, then Settings > Branches > Add branch protection rule.

Set Branch name pattern to main. Tick the following boxes:

  • ✅ Require a pull request before merging
    • ❎ Require approvals
    • ✅ Dismiss stale pull request approvals when new commits are pushed
  • ✅ Require status checks to pass before merging
    • ✅ Require branches to be up to date before merging
  • ✅ Do not allow bypassing the above settings

If you are the only maintainer on the project, untick Require approvals. In the search box under Require branches to be up to date before merging, add the check-code-quality job.

Make sure you hit the Save changes button! 🤣

This ensures that no one is able to push directly to the main branch. PRs will not be merged until the CI passes.

Continuous Integration Workflow

To trigger the CI manually, go to Actions > main-ci > Run workflow.

Now let’s trigger the CI by adding a feature to our brand new project.
We will follow the workflow discussed in the Git Branching Strategy section.
Go to the root of your project and create a new branch:

git checkout -b feature_branch
git push --set-upstream origin feature_branch

Rename the mymodule package to the name you want to use for the core package of your repository. Let’s assume we rename it to newmodule for the examples.

Edit the packages argument in the setup.py file to include newmodule:

setup(
    name="New module",
    version="0.0.1",
    author="My first and last name",
    author_email="myemail@gmail.com",
    description="New description",
    license="BSD",
    packages=['newmodule'],
    long_description=read('README.md'),
    classifiers=[
        "Development Status :: 3 - Alpha",
        "Topic :: Utilities",
        "License :: OSI Approved :: BSD License",
    ],
)

In newmodule/tests/test_helloworld.py, change the import accordingly:

#from mymodule.helloword import helloword
from newmodule.helloword import helloword
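
After a rename, it is easy to miss an import that still points to the old package. The helper below is a minimal sketch (find_stale_imports is a hypothetical name, not part of the template) that scans the Python files under a directory for import statements that still use the old name. Commented-out imports are ignored.

```python
"""Find leftover imports of the old package name after a rename."""
import re
from pathlib import Path
from typing import List, Tuple


def find_stale_imports(root: Path, old_name: str) -> List[Tuple[Path, int, str]]:
    """Return (file, line number, line) for each import still using `old_name`."""
    # Matches "import old_name", "import old_name.sub" and "from old_name... import".
    pattern = re.compile(rf"^\s*(from|import)\s+{re.escape(old_name)}(\.|\s|$)")
    matches = []
    for path in sorted(root.rglob("*.py")):
        lines = path.read_text(encoding="utf-8").splitlines()
        for number, line in enumerate(lines, start=1):
            if pattern.match(line):
                matches.append((path, number, line.strip()))
    return matches
```

For example, find_stale_imports(Path("."), "mymodule") returns an empty list once every import has been updated to newmodule.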

You can then add any functions and tests in the package as you’d like.
Now add your changes, commit and push:

git add -u
git add newmodule/\*.py
git commit -m "change package name"
git push

Now we need to create a Pull Request to merge our code back to the main branch.
Go to Pull requests > New pull request > compare: feature_branch > Create pull request > Create pull request.

You should see the CI checks trigger on the pull request.

To see the details of the CI run, go to Actions and click on the latest run under All workflows. You should see a graph of the jobs in the CI.

If all jobs have a green icon, you’re good to go 🎯
Otherwise, if one of the jobs has a red icon, click on its name and you will see the log of the failed job. Fix the error in your code and push the changes. The CI should run again.

Finally, you can hit the Merge pull request button to merge your changes into the main branch. Good job! 🥇

Wait, where are you going? 👀 We’re not done yet. Clean up by removing your branch if necessary:

git checkout main
git pull

git branch -D feature_branch
git push --delete origin feature_branch

Conclusion

Congratulations! 🍾
You learned the whole process of using a CI pipeline and the associated Feature Branch workflow.

An effective workflow is the key to success in developing large-scale collaborative projects.
I hope this article will help you along your programming journey.

Remember to hit the Use this template button when starting a new Python project.
Do not hesitate to reach out if you have any questions or suggestions 🙃


Stay in touch

I hope you enjoyed this article as much as I enjoyed writing it!
Feel free to support my work by interacting with me on LinkedIn 👀



About the author

Axel Mendoza

Senior MLOps Engineer

I'm a Senior MLOps Engineer with 5+ years of experience in building end-to-end Machine Learning products. From my industry experience, I write long-form articles on MLOps to help you build real-world AI systems.