Versioning is important. I do not have to tell you this. Yet, I see it done poorly over and over again.
The deficiencies I usually encounter are a lack of unambiguous developer and release processes and poor tooling support. If your developers sit there wondering how to do certain tasks, then the process is broken. They should know how to release a new version or how to hotfix production while the trunk has moved on. And your CI/CD processes should support these scenarios too!
Many rely on their CI/CD tool to determine the next version of their application. I am guilty of doing this too but at some point, I realised this is a decision that only the developer can make. Our tools are not yet smart enough to look at code changes and tell me if it's a feature, a fix or something else altogether.
This gave me the burning desire to change the way we do things. My goals were to have an unambiguous process that covers all those use cases, is developer-friendly and is supported in the variety of CI/CD tools we tend to use across projects.
This post chronicles this approach.
Semantic Versioning
We are going to use Semantic Versioning (semver) here. This is possibly the most prominent versioning scheme used in software today.
Most of the applications and libraries we build tend to expose an API, be it a REST API, an interface, etc. Semver is all about versioning this API.
The syntax is universally known:
<MAJOR>.<MINOR>.<PATCH>
Component | When to bump | Example |
MAJOR | Introduce a new backward-incompatible change | 1.0.0 → 2.0.0 |
MINOR | Introduce a new backward-compatible change | 1.0.0 → 1.1.0 |
PATCH | Fix a bug while maintaining backward compatibility | 1.0.0 → 1.0.1 |
As the table indicates, backward compatibility is the big differentiator when it comes to version bumps.
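To make the bump rules concrete, here is a minimal Python sketch of the arithmetic. The function names `parse` and `bump` are my own for illustration and do not come from any particular tool; the sketch also ignores pre-release and build metadata suffixes.

```python
from typing import Tuple


def parse(version: str) -> Tuple[int, int, int]:
    """Split a plain MAJOR.MINOR.PATCH string into three integers."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def bump(version: str, level: str) -> str:
    """Return the next version for a 'major', 'minor' or 'patch' bump."""
    major, minor, patch = parse(version)
    if level == "major":
        # A breaking change resets both MINOR and PATCH.
        return f"{major + 1}.0.0"
    if level == "minor":
        # A backward-compatible feature resets PATCH only.
        return f"{major}.{minor + 1}.0"
    if level == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown bump level: {level}")
```

Note that the lower components reset to zero whenever a higher one is bumped, which is the detail people most often forget.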
Some people still get this wrong so here I am going to use a JSON REST API to illustrate what constitutes a patch, a feature or a breaking change.
Let's assume a competent developer has implemented this REST API and it successfully follows Postel's law:
Major |
|
Minor |
|
Patch |
|
Conventional Commits
As good as tools are these days, they cannot identify the nature of a code change yet. The day will come when, through Machine Learning, this will be possible but for now, we have to rely on good old-fashioned human intelligence.
Conventional commits provide the mechanism to communicate the nature of changes in a commit between the developers and the CI/CD tools.
In a nutshell, the developer provides a commit message that unambiguously identifies the nature of the change. Then a CI/CD tool can scan all the commit messages since the last version and determine how to bump the version.
In addition to this automation, this approach provides clear communication of changes to other team members and even lets us automatically generate release notes and changelogs.
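The scanning step described above can be sketched in a few lines of Python. This is an illustration under assumptions, not how any specific tool does it: the names `bump_for` and `bump_for_all` are hypothetical, and the mapping is only the basic one from the Conventional Commits specification (fix → patch, feat → minor, a `!` or a `BREAKING CHANGE:` footer → major). Real tools usually let you configure extra rules.

```python
import re
from typing import Iterable, Optional

# Header shape: type, optional (scope), optional "!", then ": description".
HEADER = re.compile(r"^(?P<type>[a-z]+)(\((?P<scope>[^)]+)\))?(?P<bang>!)?: .+")
# Footer that marks a breaking change anywhere in the message body.
BREAKING_FOOTER = re.compile(r"^BREAKING[ -]CHANGE: ", re.MULTILINE)
# Relative weight of each bump so the largest one wins.
ORDER = {None: 0, "patch": 1, "minor": 2, "major": 3}


def bump_for(message: str) -> Optional[str]:
    """Return the bump implied by one commit message, or None."""
    header = message.splitlines()[0] if message else ""
    match = HEADER.match(header)
    if match is None:
        return None  # not a conventional commit; it triggers no release
    if match.group("bang") or BREAKING_FOOTER.search(message):
        return "major"
    if match.group("type") == "feat":
        return "minor"
    if match.group("type") == "fix":
        return "patch"
    return None  # docs, chore, ci and friends do not change the version


def bump_for_all(messages: Iterable[str]) -> Optional[str]:
    """Return the largest bump implied by all commits since the last tag."""
    result: Optional[str] = None
    for message in messages:
        candidate = bump_for(message)
        if ORDER[candidate] > ORDER[result]:
            result = candidate
    return result
```

Feeding it every commit message since the last version tag gives a single answer: the biggest bump wins, and `None` means there is nothing to release.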
The official summary and examples for Conventional Commits are concise enough that there is no point in me repeating them here. Go ahead and have a look. I will wait for you here.
Here are some sample commits for our JSON REST API:
Major |
|
Minor |
|
Patch |
|
N/A |
|
If your team is a bit more "fun", you can always give these alternate signals a shot: ✨ (feat), 🐛 (fix), 📚 (docs), 💎 (style), ♻️ (refactor), 🚀 (perf), ✅ (test), 📦 (build), 👷 (ci), 🔧 (chore).
What are you building?
If you have spent any time in software development, you already know that people who advocate for "one size fits all" need to be shown the door. So, in this post, I would like to examine two vastly different approaches to software release and how this technique can be applied to both.
Description | Benefits | Encountered in |
Release Every Merge |
||
|
|
|
Pre-release then Release |
||
|
|
|
I have seen many variations of these two approaches in the wild, so what we will discuss here should be applicable to those as well. Though I should mention that I tend to avoid styles that break version immutability, such as SNAPSHOT versions in Maven or @next distribution channels in NPM.
Pull Requests
Regardless of how you do your releases, I am hoping you are introducing new features via pull requests (sometimes called merge requests). If you are not, we have bigger problems than just versioning.
Each pull request should ideally contain a single feature or fix. As part of the pull request review, the developer may need to commit more changes to address review comments. But these additional commits are not new features or fixes to code on the trunk.
My solution is to use a squash merge strategy. This way, the developers can do whatever they like with their commit messages on the feature/fix branch. Those commits will all disappear and the developer can provide a conventional commit message for the entire pull request at the point of merge.
Most decent Git repository hosts also let you use your pull request name as your squash commit message. This is nice if you like to see consistent pull request names from your developers, and it even lets reviewers review the squash commit message before approving.
Branching Strategy for Release-Every-Merge
Regardless of the release style, I tend to lean towards trunk-based (mainline) branching. I avoid Gitflow because I care enough about my developers so as not to make them spend every day resolving merge conflicts. Not to mention, rebuilding the same version of an application just because you merged from develop to master flies in the face of the "build once, deploy many times" CI/CD practice.
Now that I have reached my per-post quota for ranting about Gitflow, let's talk about how we do branching when we release on every merge.
This is quite straightforward: create a feature/fix branch and follow the pull request process above.
As a CI/CD process designer, one of your primary goals should be: whatever developers do most often should be easiest to do. I feel the above meets this criterion.
The less common scenarios are not that much more difficult either. Let's say the developers are building a new major version of the application but there has been a production defect for the previous major version that needs to be hot-fixed. This is the playbook to do this hotfix:
- Find out the minor version in production. Let's say v1.3.
- Create v1.3.x branch from the latest patch version for that minor version, e.g. v1.3.6.
- Create a fix branch and pull request the fix back into v1.3.x
- Build, deploy and promote to production from v1.3.x branch
- Port the fix to mainline by merging the v1.3.x branch into master
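The first two steps of the playbook boil down to finding the latest patch tag for the minor version in production. Here is a small Python sketch of that lookup. It assumes tags follow the `vMAJOR.MINOR.PATCH` shape used in this post, and the function name `hotfix_plan` is hypothetical.

```python
import re
from typing import Iterable, Tuple

# Assumed tag format: v1.3.6 (no pre-release suffix).
TAG = re.compile(r"^v(\d+)\.(\d+)\.(\d+)$")


def hotfix_plan(tags: Iterable[str], major: int, minor: int) -> Tuple[str, str]:
    """Return (tag to branch from, maintenance branch name)."""
    patches = []
    for tag in tags:
        match = TAG.match(tag)
        if match and (int(match.group(1)), int(match.group(2))) == (major, minor):
            patches.append(int(match.group(3)))
    if not patches:
        raise ValueError(f"no release tags found for v{major}.{minor}")
    # Compare patch numbers as integers so that 10 sorts after 9.
    start_tag = f"v{major}.{minor}.{max(patches)}"
    branch = f"v{major}.{minor}.x"
    return start_tag, branch
```

With the tags from the example, this yields `v1.3.6` and `v1.3.x`, so the branch can be created with `git checkout -b v1.3.x v1.3.6`.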
Branching Strategy for PreRelease-then-Release
In this method, you introduce the breaking changes on a pre-release branch. You can use whatever zany name you like for these but I will stick to the traditional alpha/beta terminology.
Essentially, you work on one or more pre-release branches until you are ready to release the new version to your consumers; at which point, you simply merge the pre-release branch into the trunk.
This diagram demonstrates this by releasing 2 alpha and 1 beta versions prior to the next canonical version bump.
The above diagram also demonstrates a hotfix to production during all this pre-release work.
It goes without saying, you can have fewer or more pre-release branches and even merge back-and-forth between them as you desire. It is really up to your personal release style.
Implementation in Tools
Let's have a look at how our approach to versioning and branching can be applied to various source code management and CI/CD platforms.
While you can leave it up to the developers to observe the Conventional Commits conventions, that would require a monk-like level of discipline that I hardly see in our profession. So it is much wiser to enforce the commit message format on the relevant branches. Most SCMs provide this feature as a server-side hook.
For some, realising their commit messages are incorrect on the server is too late. Fortunately, there are tools, such as Commitizen, commitlint or even a simple pre-commit that can warn users as early as possible.
Though if you follow our squash merge strategy, you are absolved from caring about your local commit messages.
We, however, are not absolved from caring about the commits on the long-living branches. So I recommend having the build process validate the commit history on those branches since the last build, confirming that all commit messages are compliant.
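A build-time check like this does not need much code. Below is a minimal Python sketch, assuming the list of allowed types is the one from the emoji list earlier in this post; the name `non_compliant` is hypothetical, and in a real pipeline the messages would come from the Git history since the last build.

```python
import re
from typing import Iterable, List

# Assumed set of allowed types; adjust to whatever your team agrees on.
ALLOWED_TYPES = (
    "feat", "fix", "docs", "style", "refactor",
    "perf", "test", "build", "ci", "chore",
)
# type, optional (scope), optional "!", then ": " and a non-empty description.
HEADER = re.compile(r"^(%s)(\([^)]+\))?!?: \S.*$" % "|".join(ALLOWED_TYPES))


def non_compliant(messages: Iterable[str]) -> List[str]:
    """Return the header line of every message that breaks the format."""
    bad = []
    for message in messages:
        lines = message.splitlines()
        header = lines[0] if lines else ""
        if not HEADER.match(header):
            bad.append(header)
    return bad
```

The build can then fail whenever the returned list is not empty, printing the offending headers so the developer knows exactly what to fix.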
Typically, we have at least the following in place in our CI/CD processes:
- On each pull request build, validate all commit messages since the last build to ensure they follow conventional commits
- On a trunk build, in addition to usual testing:
  - Validate the commit messages
  - Determine the next version by analysing commit messages and previous tags
  - Tag the current commit with the new version
  - Optionally generate release notes
  - Publish the artifacts
Tagging a commit to identify it as the source for an application version is a valuable practice. It makes the Git repository a self-contained source-of-truth and removes some of the over-reliance on the CI/CD tool. More than once I have seen teams having to restart their versioning when they were using CI/CD as the source-of-truth but then they had to migrate to a different tool or there was an irrecoverable failure of the tool.
Now that we know what needs to be done, we need to figure out how to do it. This seems like a lot of functionality to implement. Fortunately, there are tools out there that already do all of this for us. My current preferred tool is semantic-release, which provides all of the above and more.
semantic-release
semantic-release provides a command-line interface (CLI) that can be invoked by any CI/CD tool. The prerequisites are Node.js and Git.
One of the strengths of semantic-release is that it can be extended using plugins. The official plugins let you create release notes and changelog files, publish releases to GitHub and GitLab, publish packages to NPM and APM, etc. There are plenty of community plugins as well.
I will use GitLab to illustrate how to configure and use this tool but this approach can as easily be translated to other CI/CD tools.
I am going to use GitLab's Docker executor to run my builds in containers so first, I need to create a Docker image that has semantic-release and all the plugins I need.
FROM node:14.3.0
LABEL maintainer="sohrab"
RUN npm install --global \
    semantic-release@17.0.8 \
    @semantic-release/exec@5.0.0 \
    @semantic-release/gitlab@6.0.4
(Yes, I version-pin everything. Like a pro. "Repeatable, reliable builds" is another CI/CD principle. If you are re-using this, please check npmjs.com for the latest versions.)
Next, I need to configure semantic-release for my repository. There are a few ways to do this but here I drop a .releaserc.json file at the root of my repository with the following content:
{
  "plugins": [
    "@semantic-release/commit-analyzer",
    "@semantic-release/release-notes-generator",
    "@semantic-release/gitlab",
    [
      "@semantic-release/exec",
      {
        "successCmd": "echo \"VERSION=${nextRelease.version}\" >> vars.env"
      }
    ]
  ]
}
This will enable only the plugins that I want to use, in this case:
- commit-analyzer determines the next version by analysing the commit history of the repo
- release-notes-generator generates release notes in the conventional-changelog format
- gitlab publishes the release notes as a GitLab Release
- exec writes the release version to a dot-env file so it is available in the subsequent stages
Finally, we need to configure the CI/CD itself. Here it is shown in GitLab YAML. Even if you have never used GitLab before, this should be fairly self-explanatory and translatable into other CI/CD tools:
stages:
  - version
  - build

version:
  image: semantic-release
  script:
    - semantic-release
  artifacts:
    reports:
      dotenv: vars.env
  rules:
    - if: '$CI_MERGE_REQUEST_TARGET_BRANCH_NAME == "master"'
      when: on_success

build:
  image: ...
  script:
    # run all tests, build, package and publish the artifact
    - ...
  rules:
    - if: '$VERSION'
      when: on_success
The VERSION environment variable in the vars.env file, produced by the version stage, can then be used by the build stage to version the artifact. It is worthwhile to note that we skip build and publish if no new version has been produced.
The last bit of configuration is to ensure that semantic-release can push tags into your repository. For this, you need to provide the tool with the Git authentication details. In my use case, it is a matter of setting the GITLAB_TOKEN environment variable.
Tip: If you are using a self-hosted GitLab instance, you also need to configure GITLAB_URL to point to your instance. This is not required if you are using gitlab.com.
I should note that in case of a build failure, the version stage should not be run again since it has already tagged the commit with the version. So the above sequence is suitable if your CI/CD tool lets you resume failed pipelines from the build stage. If this support doesn't exist, then you need to either:
- manually clean up the tags before re-running the pipeline, or
- change the pipeline to perform a semantic-release with --dry-run flag to get the new version, run the build and finally run semantic-release for real.
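For the dry-run option, the pipeline needs to fish the next version out of the tool's output. Here is a rough Python sketch of that idea. Treat the log wording as an assumption: I am matching on a line containing "next release version is", which is what I have seen semantic-release print, but verify this against the version you have pinned. The function names are hypothetical.

```python
import re
import subprocess
from typing import Optional

# ASSUMPTION: the dry run logs a line containing "next release version is X.Y.Z".
NEXT_VERSION = re.compile(
    r"next release version is (\d+\.\d+\.\d+(?:-[0-9A-Za-z.-]+)?)"
)


def next_version_from_log(log_text: str) -> Optional[str]:
    """Return the version announced in the log, or None if no release is due."""
    match = NEXT_VERSION.search(log_text)
    return match.group(1) if match else None


def dry_run_version() -> Optional[str]:
    """Run semantic-release in dry-run mode and extract the next version."""
    completed = subprocess.run(
        ["semantic-release", "--dry-run"],
        capture_output=True,
        text=True,
        check=True,  # fail the pipeline step if the tool itself fails
    )
    return next_version_from_log(completed.stdout)
```

If `dry_run_version()` returns a version, run the build with it and only then run semantic-release for real, so the tag is created after the build has succeeded.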
That's it!
We have used this approach, especially the release-every-merge style, on projects with relative success.
I have to be honest with you: if you don't have good tooling, you are going to need good developer discipline. If you have neither, then this may not be for you. But if you can use these techniques, then you will never have to give versioning much thought past what you are delivering in your commits and pull requests.