Solving the Constraints that Arise when using AWS Native CI/CD Tools
Overcoming CodePipeline’s Limitations Related to Feature Branch Builds
AWS offers a number of native development tools which can be used together to create a complete continuous integration/continuous delivery (CI/CD) pipeline. These include:
- CodeCommit — a source control service that hosts secure Git-based repositories.
- CodeBuild — a fully managed continuous integration service, and
- CodePipeline — a fully managed continuous delivery service.
There are a number of benefits to using these AWS native development tools. They integrate easily with the rest of AWS, they keep all of your development tools in one place, and, being AWS native, they are designed to help you build software the way Amazon does.
There is, however, one major drawback to using these tools. This blog will outline this issue and also provide a step-by-step guide on how to overcome it.
The Problem
The issue with using the AWS native tools lies with CodePipeline. There is a requirement when creating a pipeline with CodePipeline to supply the branch name. This, of course, is not an issue for master branch builds, as the “master” branch name is static and unchanging. The issue arises when generating a CodePipeline for a feature branch, as feature branch names vary. This blog outlines a solution to this issue, which is made up of two parts:
- Dynamically generating pipelines for feature branch builds upon creation of a new feature branch.
- Automatically cleaning up these feature branch pipelines once the feature branch is deleted or merged to master *
* This becomes important to avoid having a large number of pipelines which are unused — belonging to feature branches which no longer exist.
The Solution
The solution is to execute a process which creates and deletes pipelines automatically. AWS Lambda is a natural fit here: it scales automatically with the number of invocations and integrates with CloudWatch for tracing and monitoring function executions. The Lambda function is triggered on the creation, merging, or deletion of a feature branch in CodeCommit. The function then handles the trigger as required by either creating or deleting a pipeline for that feature branch in CodePipeline.
The overall infrastructure for this is fairly straightforward: a trigger on the CodeCommit repository invokes a Lambda function, which in turn creates or deletes the corresponding pipeline in CodePipeline.
Prerequisites
There are a couple of quick things to get set up before we actually start writing our Lambda.
First, we need to find where we defined the CodeCommit repository that we are setting this trigger up for, and define a trigger for it as shown below:
resource "aws_codecommit_trigger" "repo_trigger" {
  repository_name = aws_codecommit_repository.repo.repository_name

  trigger {
    name            = "Repo Trigger"
    events          = ["all"]
    destination_arn = <ARN of Lambda Function>
  }
}
Additionally, we need to ensure that CodeCommit has permission to invoke our Lambda function:
resource "aws_lambda_permission" "allow_codecommit_invocation" {
  statement_id  = "AllowExecutionFromCodeCommit"
  action        = "lambda:InvokeFunction"
  function_name = aws_lambda_function.<LAMBDA_FUNCTION>.function_name
  principal     = "codecommit.amazonaws.com"
  source_arn    = "arn:aws:codecommit:<REGION>:<ACCOUNT_NUMBER>:*"
}
For source_arn, I allowed all CodeCommit repos in the account; however, these permissions can be reduced to a single CodeCommit repo if required.
Lambda Function
Now we can write our Lambda Function. A step-by-step breakdown of the Lambda function can be found below.
- Define imports and environment variables
- Writing a function to create a pipeline
- Writing a function to delete a pipeline
- Putting it all together
1. Define imports and environment variables
First, we need to import some libraries, and set some environment variables.
import boto3
import logging
import os

from botocore.exceptions import ClientError

logger = logging.getLogger()
logger.setLevel(logging.INFO)

CODEPIPELINE_CLIENT = boto3.client('codepipeline')

ARTIFACT_BUCKET = os.environ['artifact_bucket_name']
IAM_ROLE_ARN = os.environ['codepipeline_iam_role_arn']
PROJECT_NAME = os.environ['codebuild_project_name']
We have imported the basic libraries we'll need: boto3, logging, and os. We also set up logging and create the CodePipeline client.
ARTIFACT_BUCKET is the name of the S3 Bucket where artifacts are stored.
IAM_ROLE_ARN is the ARN of the IAM role that the generated CodePipeline will use.
PROJECT_NAME is the name of the CodeBuild project (or projects) that will comprise one or more stages in the generated CodePipeline.
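Since the Lambda fails at import time if any of these variables are missing, it can be useful to read them a little more defensively. A minimal sketch (the `require_env` helper is my own illustration, not part of the original function):

```python
import os

def require_env(name):
    """Return the value of an environment variable, failing loudly if unset."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"Required environment variable `{name}` is not set")
    return value

# Example value, for illustration only
os.environ["artifact_bucket_name"] = "my-artifact-bucket"
print(require_env("artifact_bucket_name"))  # my-artifact-bucket
```

This turns a bare KeyError into an error message that names the missing variable, which is easier to spot in CloudWatch logs.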
2. Writing a function to create a pipeline
Next, we need to define a function to create a CodePipeline. This function will be called when a new feature branch is created. A CodePipeline is made up of a number of "stages". The first stage is always the same: it pulls the source code from CodeCommit. The second (and, in this case, last) stage runs the CodeBuild project set by the PROJECT_NAME variable above.
These two stages come together to define a CodePipeline as shown in the function below:
def create_codepipeline(branch_name, codecommit_repo_name, pipeline_name, project_name):
    try:
        CODEPIPELINE_CLIENT.create_pipeline(
            pipeline={
                'name': pipeline_name,
                'roleArn': IAM_ROLE_ARN,
                'artifactStore': {
                    'type': 'S3',
                    'location': ARTIFACT_BUCKET,
                },
                'stages': [{
                    'name': 'Source',
                    'actions': [{
                        'name': 'Source',
                        'actionTypeId': {
                            'category': 'Source',
                            'owner': 'AWS',
                            'provider': 'CodeCommit',
                            'version': '1'
                        },
                        'outputArtifacts': [{
                            'name': 'SourceArtifact'
                        }],
                        'configuration': {
                            'RepositoryName': codecommit_repo_name,
                            'BranchName': branch_name
                        }
                    }]
                }, {
                    'name': 'Terraform_Plan',
                    'actions': [{
                        'name': 'Terraform_Plan',
                        'actionTypeId': {
                            'category': 'Build',
                            'owner': 'AWS',
                            'provider': 'CodeBuild',
                            'version': '1'
                        },
                        'inputArtifacts': [{
                            'name': 'SourceArtifact'
                        }],
                        'configuration': {
                            'ProjectName': project_name,
                        }
                    }]
                }]
            }
        )
    except ClientError as e:
        logger.error("Error Creating Pipeline: %s" % e)
We run the first stage to pull the code from CodeCommit. We then have a second stage which is our “Terraform Plan” stage. The configuration for what to do during this stage is set by the “project_name” variable, which references a project we have already defined in CodeBuild.
A number of variables need to be passed to this function in order to create the CodePipeline for the feature branch. All of these variables (except for project_name) can be parsed from the event which triggers the Lambda, and the process to do so will be shown in step 4 “Putting it all together”.
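To make that parsing concrete ahead of step 4, here is a sketch against a trimmed-down example event (the ARN, account number, repo, and branch names are made-up values; real CodeCommit trigger events carry additional fields such as commit IDs):

```python
# A trimmed-down example of the event CodeCommit sends to the Lambda.
sample_event = {
    "Records": [{
        "eventName": "ReferenceChanges",
        "eventSourceARN": "arn:aws:codecommit:eu-west-1:123456789012:my-demo-repo",
        "codecommit": {
            "references": [{"ref": "refs/heads/feature/add-login"}]
        }
    }]
}

record = sample_event["Records"][0]
# The repo name is the last segment of the event source ARN
codecommit_repo_name = record["eventSourceARN"].split(":")[-1]
reference = record["codecommit"]["references"][0]
# Strip the Git ref prefix to get the plain branch name
branch_name = reference["ref"].replace("refs/heads/", "")

print(codecommit_repo_name)  # my-demo-repo
print(branch_name)           # feature/add-login
```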
3. Writing a function to delete a pipeline
We now need to define a function that will be triggered when a feature branch is merged to master or deleted. This function will delete the CodePipeline which was generated when the feature branch was created and is shown below:
def delete_codepipeline(pipeline_name):
    try:
        CODEPIPELINE_CLIENT.delete_pipeline(name=pipeline_name)
    except ClientError as e:
        logger.error("Error Deleting Pipeline: %s" % e)
This function is very simple and straightforward. The name of the pipeline to delete needs to be passed to the function, which will then delete the pipeline.
4. Putting it all together
Now we need to put all of this together. We need to handle the event and decide which function we need to call. We also need to parse some required values from the event to pass to the required function. For example, the name of the feature branch, the name of the CodeCommit repo, etc.
def lambda_handler(event, context):
    record = event['Records'][0]
    event_name = record['eventName']
    codecommit_repo_name = record['eventSourceARN'].split(':')[-1]
    reference = record['codecommit']['references'][0]

    if "tags" not in reference['ref']:
        branch_name = reference['ref'].replace('refs/heads/', '')

        if branch_name != "master":
            # Check if a pipeline exists for this branch
            pipeline_exists = False
            pipelines = None

            # Here, I abbreviate the repo name; you may need to change how
            # this is done to match your naming convention
            repo_name_abbrv = (codecommit_repo_name.split('-')[-1]).capitalize()
            pipeline_name = f"{repo_name_abbrv}_{branch_name}"
            pipeline_name = pipeline_name.replace('/', '_')
First, we parse the name of the CodeCommit repo in question from the Event which triggered the Lambda. We can also parse the name of the feature branch which has been created, merged, or deleted from the Event.
Next, we do a quick check to ensure the branch is not master. If it is, we have nothing to do, as a static CodePipeline already exists for the master branch. If it is not, however, then it is a feature branch that has triggered the lambda, and we will need to call one of our functions that we created above.
We now need to check if a pipeline already exists for this branch. We start with this set to “false” and with pipelines set to “None”. These values will be updated later.
We then need to define a naming convention for the name of the generated CodePipeline. It is important to have a naming convention so that we can check for existing pipelines. I have chosen to abbreviate the name of the CodeCommit repo, and append the branch name. But you can define this naming convention whichever way suits you. This variable is then set as pipeline_name.
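As a standalone sketch, the convention above (repo abbreviation plus branch name, with `/` replaced by `_`) can be pulled out into a small helper; the repo and branch names below are made-up examples:

```python
def build_pipeline_name(codecommit_repo_name, branch_name):
    """Derive a CodePipeline name from a repo name and a branch name.

    Abbreviates the repo name to its last hyphen-separated segment,
    capitalised, then appends the branch name with `/` replaced by `_`
    (slashes are not valid in pipeline names). Adjust to suit your
    own naming convention.
    """
    repo_name_abbrv = codecommit_repo_name.split("-")[-1].capitalize()
    return f"{repo_name_abbrv}_{branch_name}".replace("/", "_")

print(build_pipeline_name("team-infra-networking", "feature/login"))
# Networking_feature_login
```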
Now, let’s check if the pipeline already exists:
            try:
                pipelines = CODEPIPELINE_CLIENT.list_pipelines()
            except ClientError as e:
                logger.error("Error Listing Pipelines: %s" % e)

            if pipeline_name in [pipeline['name'] for pipeline in pipelines['pipelines']]:
                logger.info(f"Pipeline `{pipeline_name}` exists")
                pipeline_exists = True
If it does, set pipeline_exists to “True”.
Now we have a decision to make:
- If the Event that triggered the Lambda is that a feature branch was merged or deleted, and a pipeline existed for that feature branch, let’s delete that CodePipeline.
- If the Event that triggered the Lambda is that a feature branch was created, and a pipeline does NOT exist for that feature branch, let’s create that CodePipeline.
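The two bullets above boil down to a small pure function (`decide_action` is a hypothetical helper for illustration only; the actual handler works directly on the event):

```python
def decide_action(branch_deleted, event_name, pipeline_exists):
    """Return 'delete', 'create', or None based on the trigger event."""
    if branch_deleted and pipeline_exists:
        # Feature branch merged or deleted: remove its pipeline
        return "delete"
    if not branch_deleted and event_name == "ReferenceChanges" and not pipeline_exists:
        # New feature branch: provision a pipeline for it
        return "create"
    return None

print(decide_action(True, "ReferenceChanges", True))    # delete
print(decide_action(False, "ReferenceChanges", False))  # create
```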
This is done as follows:
            # If branch deleted and pipeline exists, delete pipeline
            if reference.get('deleted', False):
                logger.info(f"Branch `{branch_name}` deleted")
                if pipeline_exists:
                    logger.info(f"Deleting Pipeline `{pipeline_name}`")
                    delete_codepipeline(pipeline_name)

            # If new commit on branch and no pipeline exists, create pipeline
            elif event_name in ['ReferenceChanges']:
                logger.info(f"New commit on branch `{branch_name}`")
                if not pipeline_exists:
                    logger.info(f"Creating Terraform Pipeline for `{branch_name}` called `{pipeline_name}`")
                    create_codepipeline(branch_name, codecommit_repo_name, pipeline_name, PROJECT_NAME)
And that’s it. You now have a Lambda function which is automatically triggered when a new feature branch is created, merged or deleted in CodeCommit, which then dynamically provisions or deletes CodePipelines as required.
The source code for this Lambda Function can be found here: