Automatic semantic versioning in CI/CD pipelines

30 November, 2023 |  Vladimir Djurovic 

Automatic semantic versioning refers to the practice of applying versions to software artifacts automatically in CI/CD pipelines. It speeds up software deployment and reduces the possibility of human error.

Automatic semantic versioning is based on Git, semantic versioning and Conventional Commits.


Why do we need to automate artifact versioning?

In the “ancient” times of software development, releases were performed once every few weeks or months. Usually, someone would manually run a script or click a button while entering some data, and the release would be done. However, this manual process is error-prone.

Let’s look at the Maven Release plugin as an example of a release process. To release a new version with the Maven Release plugin, you can simply run mvn release:prepare release:perform and the plugin takes care of the rest. By default, it takes the version specified in the pom.xml file, which is expected to be a snapshot, and releases it as a stable version. So, version 1.2.3-SNAPSHOT will be released as 1.2.3.

What if you want to release a different version? You can specify it as a property or run the command interactively. For example, you can run mvn -DreleaseVersion=1.3.0 release:prepare release:perform.

Can you spot the problem here? In all of these approaches, you need to know the release version in advance, so you can specify it. When releasing manually from time to time, this might not be a big problem. But when you run an automated deployment pipeline and release multiple times per day, you need a way to automate it.

What is semantic versioning?

In a nutshell, semantic versioning is the practice of versioning software with numbers in the format major.minor.micro. Each of these parts designates a semantic change in the software:

  • micro - sometimes also called the patch number, is a small change that does not affect backward compatibility. This could be a bug fix, a change in test cases, a documentation update and similar.
  • minor - designates a significant change in software that does not break backward compatibility. For example, when you add a new feature, but don’t break existing clients.
  • major - a major change in software that might break backward compatibility. An example could be changing the underlying file format, so it is no longer compatible with existing files.

The semantic versioning specification allows more flexibility in defining versions, such as adding build metadata, pre-release designations etc. But, in the vast majority of cases, the major.minor.micro format is sufficient. That is why we will focus on it in this post.
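To make the three parts concrete, here is a minimal sketch that splits a version string into its numbers. The function name split_semver is made up for this illustration, and it only accepts the plain major.minor.micro form discussed in this post.

```shell
#!/usr/bin/env bash
# Hypothetical helper: validates a plain major.minor.micro string
# and prints its three parts separated by spaces.
split_semver() {
  local version="$1"
  # Reject anything that is not three dot-separated groups of digits.
  if ! printf '%s\n' "$version" | grep -Eq '^[0-9]+\.[0-9]+\.[0-9]+$'; then
    echo "not a major.minor.micro version: $version" >&2
    return 1
  fi
  local major="${version%%.*}"   # text before the first dot
  local rest="${version#*.}"     # text after the first dot
  local minor="${rest%%.*}"
  local micro="${rest#*.}"
  echo "$major $minor $micro"
}

split_semver 1.2.3    # prints: 1 2 3
```

Anything with a pre-release suffix, such as 1.2.3-SNAPSHOT, is rejected by this sketch on purpose.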

How does Git fit in?

In this process, Git plays a central part as the persistence mechanism for versions. Every time a new version is released, the pipeline should create a new tag with that version. This brings two main benefits:

  • we have a history of releases, and a snapshot of each release’s content
  • the latest tag can be used as a source to determine the current version of the software
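As a rough illustration of the second point, the sketch below picks the highest version out of a list of tag names. It assumes tags are named vMAJOR.MINOR.MICRO, and the function name latest_version_tag is hypothetical. Note that it selects the highest version number, which is a different rule from "the most recently tagged commit" used by the version.sh script later in this post.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads tag names on stdin (one per line, as printed by
# `git tag --list`) and prints the highest one of the form vMAJOR.MINOR.MICRO.
latest_version_tag() {
  grep -E '^v[0-9]+\.[0-9]+\.[0-9]+$' |      # keep only version tags
    sed 's/^v//' |                           # drop the leading "v"
    sort -t . -k 1,1n -k 2,2n -k 3,3n |      # numeric sort, field by field
    tail -n 1 |                              # highest version is last
    sed 's/^/v/'                             # put the "v" back
}

# Inside a Git repository this would be used as:
#   git tag --list | latest_version_tag
printf '%s\n' v1.2.0 v1.10.0 v1.9.3 nightly | latest_version_tag   # prints: v1.10.0
```

The numeric sort matters here: a plain text sort would put v1.9.3 after v1.10.0.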

Automate semantic versioning with git and conventional commits

So, we’ve decided to use semantic versioning for our software. Now, the question is, how do we know which part of the version string to bump when we release a new version? Conventional Commits to the rescue!

Conventional Commits is a practice that originated in the Angular project. It describes a way of formatting Git commit messages that allows readers to infer the intention of the change. It could be a new feature, a bug fix, a code maintenance task or something else.

Commit messages conforming to the convention are basically formatted like this:

feat: some descriptive message

This says that the commit represents a new feature, which is described by the rest of the message. I won’t go into the details of the Conventional Commits specification, but the essence is that we can use the type of the change to infer which part of the semantic version string should be bumped. We can use the following simplified policy:

  • micro number - bumped for the following types: fix, build, chore, ci, docs, style, refactor, perf, test
  • minor number - bumped for the feat type
  • major number - bumped for the BREAKING CHANGE type (the specification itself marks breaking changes with a BREAKING CHANGE: footer or a ! after the type; treating it as a type keeps the scripts in this post simple)
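The policy above can be sketched as a small shell function. The name bump_kind is hypothetical, and rejecting unrecognised types with an error is an assumption of this sketch, not something the policy itself prescribes.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints which part of the version a commit subject
# should bump, following the policy listed above.
bump_kind() {
  local subject="$1"
  case "$subject" in
    *:*) ;;   # a type must be followed by a colon
    *) echo "not a conventional commit: $subject" >&2; return 1 ;;
  esac
  local commit_type="${subject%%:*}"   # text before the first colon
  commit_type="${commit_type%%(*}"     # drop an optional "(scope)"
  case "$commit_type" in
    "BREAKING CHANGE") echo major ;;
    feat)              echo minor ;;
    fix|build|chore|ci|docs|style|refactor|perf|test) echo micro ;;
    # Assumption of this sketch: unknown types are an error.
    *) echo "unrecognised commit type: $commit_type" >&2; return 1 ;;
  esac
}

bump_kind "feat(parser): support arrays"   # prints: minor
bump_kind "docs: fix typo in README"       # prints: micro
```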

Armed with this knowledge, we can now automate semantic versioning in our CI/CD pipeline.

Bash scripts to automate semantic versioning

In order to implement this automation, we will create two Bash scripts:

  • semver-bump.sh - this script will bump requested number in semantic version string
  • version.sh - it will infer which number to bump, based on commit message

Bumping semantic version number

Let’s start with the semver-bump script. The source code is shown below:

#!/usr/bin/env bash

# Print usage message and exit.

usage() {
  echo "usage: $(basename "$0") [-Mmp] major.minor.patch" >&2
  exit 1
}

# Parse command line options.

while getopts ":Mmp" Option
do
  case $Option in
    M ) major=true;;
    m ) minor=true;;
    p ) patch=true;;
    * ) usage;;
  esac
done

shift $((OPTIND - 1))

version=$1

# If version string is missing or is not three dot-separated numbers, show usage message.

if [[ ! $version =~ ^[0-9]+\.[0-9]+\.[0-9]+$ ]]
then
  usage
fi

# Build array from version string.

IFS=. read -r -a a <<< "$version"

# Increment version numbers as requested, resetting the lower ones.

if [ -n "$major" ]
then
  ((a[0]++))
  a[1]=0
  a[2]=0
fi

if [ -n "$minor" ]
then
  ((a[1]++))
  a[2]=0
fi

if [ -n "$patch" ]
then
  ((a[2]++))
fi

echo "${a[0]}.${a[1]}.${a[2]}"

This script takes a semantic version string as an argument and an option that says which part to bump. Valid options are:

  • p - bump patch version
  • m - bump minor version
  • M - bump major version

For example, it can be invoked as:

./semver-bump.sh -m 1.2.3
1.3.0

The script has bumped the minor version, as requested, so the next version is 1.3.0.

Infer current version and which part to bump

The next script we will create is version.sh. Its job has 3 parts:

  • determine current version using Git tags
  • parse the commit message and infer type of the change using conventional commits approach
  • invoke semver-bump script to calculate new version

Script content is the following:

#!/usr/bin/env bash

# [[:space:]] is used instead of \s, which is a GNU extension and may not work on every platform
PATCH_REGEX='^(build|chore|ci|docs|fix|perf|refactor|revert|style|test)[[:space:]]?(\(.+\))?[[:space:]]?:[[:space:]]*(.+)'
MINOR_REGEX='^(feat)[[:space:]]*(\(.+\))?[[:space:]]?:[[:space:]]*(.+)'
MAJOR_REGEX='^(BREAKING CHANGE)[[:space:]]*(\(.+\))?[[:space:]]?:[[:space:]]*(.+)'

# get the latest tag
LATEST_TAG=$(git describe --tags "$(git rev-list --tags --max-count=1)" 2> /dev/null)
if [ -z "$LATEST_TAG" ]; then
  # no tags yet, so there are no previous releases
  echo "1.0.0"
  exit 0
fi

# extract the version number from the end of the tag name
LATEST_VERSION=""
if [[ $LATEST_TAG =~ [0-9]+\.[0-9]+\.[0-9]+$ ]]; then
    LATEST_VERSION=${BASH_REMATCH[0]}
else
    echo "Failed to extract current version" >&2
    exit 1
fi

SCRIPT_DIR=$(dirname -- "$0")
# process last commit message
MESSAGE=$(git log -1 --pretty=%B)
if [[ "$MESSAGE" =~ $PATCH_REGEX ]]; then
  "$SCRIPT_DIR"/semver-bump.sh -p "$LATEST_VERSION"
elif [[ "$MESSAGE" =~ $MINOR_REGEX ]]; then
  "$SCRIPT_DIR"/semver-bump.sh -m "$LATEST_VERSION"
elif [[ "$MESSAGE" =~ $MAJOR_REGEX ]]; then
  "$SCRIPT_DIR"/semver-bump.sh -M "$LATEST_VERSION"
else
  # message does not follow the convention, bump patch by default
  "$SCRIPT_DIR"/semver-bump.sh -p "$LATEST_VERSION"
fi

In this script, we first define a few regular expressions that help us parse a conventional commit message. Then we find the latest tag in the Git repository (note: for this command to work, it must be run inside a Git repository). If the repository contains no tags, we assume that there are no previous releases and return the initial version of 1.0.0. This can be altered to be any version you want, but 1.0.0 is a good starting point.

The second part of the script attempts to extract the latest tag’s semantic version and store it in the LATEST_VERSION variable. If the version cannot be determined, the script returns an error.
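To see which tag names this extraction step accepts, here is a small stand-alone sketch of the same idea. It uses grep instead of the Bash regex match from the script, and the function name tag_to_version is made up for the example.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints the major.minor.micro version found at the
# end of a tag name, or fails if the tag does not end with one.
tag_to_version() {
  printf '%s\n' "$1" | grep -oE '[0-9]+\.[0-9]+\.[0-9]+$'
}

tag_to_version v1.2.3                             # prints: 1.2.3
tag_to_version release-2.0.1                      # prints: 2.0.1
tag_to_version v1.2 || echo "no version in tag"   # prints: no version in tag
```

Because the pattern is only anchored at the end, any prefix in front of the version (v, release- and so on) is ignored.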

The last part of the script parses the last commit message and matches it against the conventional commit types. When a match is found, the script invokes the semver-bump script to bump the correct part of the semver string. If the commit message does not conform to the Conventional Commits format, the patch number is bumped by default.

See it in action

Armed with these two scripts, we can now use them in a real-life scenario. If we go back to our Maven Release plugin example, we’ve seen that we can specify the new version as a command line property. We can now infer the intended new version with this simple approach:

export VERSION=$(./version.sh)
mvn -DreleaseVersion="$VERSION" release:prepare ...

This will invoke the Maven Release plugin with the requested version, based on the current Git tags and commit message.
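For builds that are not released through the Maven Release plugin, the pipeline has to create the Git tag itself. Below is a minimal sketch of such a step. It assumes tags are named vMAJOR.MINOR.MICRO and that the remote is called origin; the run and tag_release helpers and the DRY_RUN variable are hypothetical names introduced for this example.

```shell
#!/usr/bin/env bash
# Hypothetical helper: with DRY_RUN=1 it only prints the command,
# otherwise it executes it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Tags the current commit with the given version and pushes the tag.
# Assumes tags are named vMAJOR.MINOR.MICRO and the remote is "origin".
tag_release() {
  local version="$1"
  run git tag -a "v$version" -m "Release $version"
  run git push origin "v$version"
}

# In a pipeline: tag_release "$(./version.sh)"
DRY_RUN=1 tag_release 1.3.0
# prints:
# + git tag -a v1.3.0 -m Release 1.3.0
# + git push origin v1.3.0
```

The dry-run switch is there so the step can be tried out safely before it is allowed to push tags from the pipeline.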

In another post, this approach is used to develop a custom GitHub Action for automatic versioning.

Pros and cons of automatic semantic versioning

Let’s go over some good and bad sides of this approach to versioning. We’ll start with the good ones:

  • Full automation - this is a (relatively) fire-and-forget approach. Once it is set up, you no longer need to worry about it
  • Standard conformance - using specifications like Semantic Versioning and Conventional Commits ties your software development cycle to widely adopted industry standards

The bad sides:

  • Requires discipline - all team members must adopt the Conventional Commits format for their commit messages, otherwise there is no point in doing it. This must be enforced at the organizational level
  • Another moving part in the pipeline - although this is not something complicated, it is another part of your software development platform that you need to worry about.
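One way to help with the discipline part is a Git commit-msg hook that rejects messages which do not follow the convention. The sketch below is only an illustration: the is_conventional function is a hypothetical name, and the list of accepted types simply mirrors the policy used in this post.

```shell
#!/usr/bin/env bash
# Sketch of a commit-msg hook. Git runs this hook with the path to the file
# holding the proposed commit message as its first argument.

# Succeeds if the given subject line looks like "type(scope): description".
# The list of types mirrors the policy used in this post.
is_conventional() {
  printf '%s\n' "$1" |
    grep -Eq '^(BREAKING CHANGE|build|chore|ci|docs|feat|fix|perf|refactor|style|test)(\([^)]+\))?: .+'
}

# Only act when called as a hook, that is, with a message file as argument.
if [ -n "${1:-}" ] && [ -f "$1" ]; then
  subject="$(head -n 1 "$1")"
  if ! is_conventional "$subject"; then
    echo "Commit message must look like 'type(scope): description'" >&2
    exit 1
  fi
fi
```

To try it, save it as .git/hooks/commit-msg in a repository and make it executable. Hooks are local to each clone, so a check like this complements organizational enforcement and does not replace it.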

Conclusion

I hope this was a good read. I tried to convey my experience with this kind of problem and a possible solution. I would love to hear some feedback and thoughts on this subject.

Please feel free to comment on any of this in the comments section below.