Continuous Integration

Yesterday’s post was a whirlwind tour of using Pytest, assert, and mocks to add unit tests to a Python program.

At the end of this process, you could run pytest on your laptop in order to run all the tests in all the test files.

You could also run pytest-watch to start a process that runs pytest automatically whenever a file changes.

This post continues that journey to run tests in the cloud, on a server that watches the repo for pushes. It’s like a team-level version of pytest-watch.

This is called Continuous Integration (CI), and the server that does this job is a Continuous Integration server (CI server).[1]

Some advantages of CI include:

  • Even if you (or your team members) forget to run the unit tests before you push, the CI server will.
  • The CI can annotate GitHub pull requests with information as to whether they pass the test suite. The more you put in the test suite, the less you have to manually test or review.
  • The output of CI (whether the tests pass or fail) can be used as the input to Continuous Delivery. We’ll cover this later.
  • You can configure a CI server to perform additional tasks. Some common uses are: checking code style and formatting (“linting”); computing test coverage; running and recording performance benchmarks; creating build artifacts (such as documentation, or downloadable executables); deploying builds that pass the test suite.[2]
  • A CI server keeps track of the state and artifacts of past builds. You can browse them via its dashboard, to see when a test started failing, or when a performance benchmark changed.
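
At its core, the job a CI server performs on each push can be sketched in a few lines of Python. This is only an illustration, not how Travis or any other CI server is implemented; `run_step`, `PYTEST_COMMAND`, and the fields of the result are hypothetical names. It relies on the convention that a command such as pytest exits with status 0 when everything passes, and with a non-zero status otherwise.

```python
import subprocess
import sys

# Assumes pytest is installed in the environment that runs this script.
PYTEST_COMMAND = [sys.executable, "-m", "pytest"]


def run_step(command):
    """Run one build step and report whether it passed.

    Hypothetical helper: a real CI server also checks out the repo,
    installs dependencies, and stores the result for its dashboard.
    """
    completed = subprocess.run(command, capture_output=True, text=True)
    return {
        "command": command,
        "passed": completed.returncode == 0,  # exit status 0 means success
        "output": completed.stdout + completed.stderr,
    }
```

Calling `run_step(PYTEST_COMMAND)` from the root of the repo would run the test suite and return a result whose `passed` field could drive a pull request annotation or a later deployment step.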

Some common CI servers include:

  • Travis. This is a hosted service that’s free for open source projects, where it’s also the market leader.
  • CircleCI. Another hosted service. I’ve happily used this too.
  • Jenkins. The go-to open-source CI server if you want to host your own.
  • TeamCity. This is a commercial offering from JetBrains, which makes developer tools such as IntelliJ and PyCharm.
  • Bamboo. This is a commercial offering from Atlassian, which makes JIRA and recently acquired Trello.

Today we’ll use Travis.

Commit #c35067f contains all the code changes necessary to take our unit tests from yesterday and run them on Travis in the cloud. It consists of:

  • A .travis.yml file, copied from here.
  • A change to the README, to add a build status badge.[3] This is optional.

The build status badge displays a different image depending on whether the latest build succeeded. You can see it in action on the home page of the repo. It also links to the Travis build dashboard for this project.
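
As a rough sketch of what the README change amounts to, the snippet below builds the Markdown for a badge that shows the build status image and links to the build page. It assumes the README is written in Markdown, and the URL pattern is an assumption based on the format Travis has used for open source repos; the owner and repo names are placeholders. The safest source for the real snippet is the one your own Travis dashboard offers.

```python
def badge_markdown(owner, repo, branch="master"):
    """Return Markdown for a build status badge that links to the build page.

    The URL pattern is an assumption; copy the exact snippet from your
    own Travis dashboard rather than relying on this one.
    """
    build_page = f"https://travis-ci.org/{owner}/{repo}"
    image = f"{build_page}.svg?branch={branch}"
    # An image wrapped in a link: ![alt](image) nested inside [...](target)
    return f"[![Build Status]({image})]({build_page})"


# Hypothetical owner and repo names, for illustration only.
print(badge_markdown("example-user", "example-repo"))
```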

The only other thing necessary to set up Travis as a CI server is to sign in to Travis and add your GitHub repo. Travis (like the other hosted CI servers) installs a repository webhook, which lets it know when the repository is pushed to.
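
To make the webhook idea concrete, here is a minimal sketch of what the receiving side might do with a delivery from GitHub. It is not Travis’s implementation: `build_request_from_push` and the shape of the dict it returns are hypothetical. The sketch assumes the event name arrives in the `X-GitHub-Event` header and that the JSON body of a push event carries `ref`, `after`, and `repository.full_name`; the sample payload is made up.

```python
def build_request_from_push(event_type, payload):
    """Turn a GitHub webhook delivery into a build request, or None.

    event_type is the value of the X-GitHub-Event header; payload is the
    parsed JSON body. The returned dict is a hypothetical format.
    """
    if event_type != "push":
        return None  # ignore other events, such as "ping"
    return {
        "repo": payload["repository"]["full_name"],
        "ref": payload["ref"],       # e.g. "refs/heads/master"
        "commit": payload["after"],  # SHA of the head commit after the push
    }


# Made-up payload containing only the fields this sketch reads.
sample_payload = {
    "ref": "refs/heads/master",
    "after": "0123456789abcdef0123456789abcdef01234567",
    "repository": {"full_name": "example-user/example-repo"},
}
print(build_request_from_push("push", sample_payload))
```

A real server would also verify that the delivery genuinely came from GitHub before queuing a build.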

A later post will show how to use this to set up Continuous Delivery (CD).

One more thing: The Skillz repo demonstrates unit testing and CI for a JavaScript (Node.js back end, React front end) project. This is considerably more complicated for two reasons that have to do with the distinction between Python and JavaScript: (1) this repo, for reasons I’ll explain later, contains two different subprojects – the front end and the back end – each with its own package dependencies and tests; and (2) the code in this repo requires a version of PostgreSQL that Travis doesn’t support.

Addenda

The first version of this essay referred to CD as “Continuous Deployment”. I’ve changed it to “Continuous Delivery”. This article describes the differences.

  1. There’s a distinction between unit tests, which test a single function or class, and integration tests, which validate that several units of a program work together. CI is called continuous integration because a team that works in separate branches would run unit tests on their laptops, but might not discover integration problems until a larger, less frequent build that integrates the work from different branches. You don’t need to understand this distinction in order to use CI, and it no longer holds for many of the team development workflows in use today.

  2. This is Continuous Delivery (CD). In this use case, the CI server is also managing CD. 

  3. You’ve probably seen these badges on other repos. This is the first of several badges we’ll run across over the course of the semester.