In today’s dynamic software development world, delivering a seamless user experience while ensuring software security can be a daunting task. Amidst the pressure to rapidly release new features, developers need strategies that not only ensure functional accuracy but also provide a security blanket against unforeseen vulnerabilities.
Canary deployments have emerged as a powerful tool for this dual purpose. By offering a controlled way to release software updates to a subset of users, they let developers test the waters and address any issues before a full-scale release. This article delves into the nuances of canary deployments and highlights their significance in security testing.
What is a Canary Deployment?
Canary deployment is a release risk mitigation strategy for software developers. It allows developers to limit the harm caused by releasing a buggy or unstable software update, and to quickly and safely roll back an update without compromising the entire software project.
Canary deployment allows developers to incrementally update or roll out software functionality. The team then monitors the new version and analyzes end-user data to determine whether to fully deploy or roll back features. A canary version of the software contains all the necessary application code and dependencies and is used to test new features and upgrades and evaluate how they work in production environments.
Canary deployments are an alternative to blue-green deployments, in which you maintain two environments in parallel, the live environment (blue) and the new version of the software (green), and then switch over traffic from blue to green.
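One way to picture the difference is as a traffic split. The sketch below is purely illustrative: the function names and the baseline/canary labels are assumptions made for this example, not part of any load balancer's API.

```python
def blue_green_weights(switched: bool) -> dict[str, int]:
    """Blue-green: all traffic goes to exactly one environment at a time."""
    if switched:
        return {"blue": 0, "green": 100}
    return {"blue": 100, "green": 0}


def canary_weights(canary_percent: int) -> dict[str, int]:
    """Canary: a small, adjustable share of traffic goes to the new version."""
    if not 0 <= canary_percent <= 100:
        raise ValueError("canary_percent must be between 0 and 100")
    return {"baseline": 100 - canary_percent, "canary": canary_percent}


print(blue_green_weights(switched=True))  # {'blue': 0, 'green': 100}
print(canary_weights(5))                  # {'baseline': 95, 'canary': 5}
```

With blue-green the only states are "all old" and "all new", while a canary split can be raised step by step as confidence grows.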
Why Is Canary Testing Effective?
Canary releases are a great way to make incremental changes to your code, whether those changes involve adding new features or modifying existing ones. Because the code is released to real users in production, the development team can quickly evaluate whether changes are working as expected.
Canary deployments also allow developers to migrate small groups of users to new features in new releases. Because only a fraction of the overall user base is exposed to new code, developers can roll back a buggy version before it reaches the entire user base, which keeps the impact of bugs or stability issues much lower.
Canary testing makes it easy to identify the impact of changes to existing applications. You can closely monitor the performance of your code before releasing it to a large user base. Because canaries are only deployed to a small number of users, the risk of general performance degradation and poor user experience is greatly reduced. Additionally, if changes are found to degrade application performance, contain bugs, or generate negative user feedback, you can immediately reverse them.
Canary testing is a great way to select real users and set up a beta program to gather valuable feedback before the main code release. Sometimes the QA team is the first group to test new features. You can have them test the application in the same environment as your end users and find bugs in production that might not be detected or identified during staging.
Canary Deployment and Security Testing
Security testing is the process of evaluating whether hardware, software, networks, or any other IT resource meets security requirements and is sufficiently protected against cyber attacks. Security testing can provide evidence that a system is really secure, or conversely, discover security weaknesses and provide actionable recommendations for remediating them.
When DevOps teams refactor systems to improve their security, they need to test them in a realistic production environment to confirm they are secure and identify the attack surface. However, this raises a risk, because if security measures are improperly implemented, production systems could be attacked. Canary deployments offer a compromise—instead of deploying a security update to all users, it can be released to only a small subset of users.
This means the chance of an attacker seeing the change and exploiting it is relatively small. At the same time, it makes it possible to conduct security testing and even a full penetration test, to confirm that security changes to the system are effective.
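As one narrow example of a security check that could run against canary nodes, the sketch below verifies that a response carries a set of expected security headers. The required-header list and the sample response are assumptions chosen for illustration; a real security test plan, let alone a penetration test, would cover far more.

```python
# Headers this hypothetical security update is expected to add.
REQUIRED_HEADERS = (
    "Strict-Transport-Security",
    "Content-Security-Policy",
    "X-Content-Type-Options",
)


def missing_security_headers(
    headers: dict[str, str],
    required: tuple[str, ...] = REQUIRED_HEADERS,
) -> list[str]:
    """Return the required headers that are absent (names compared case-insensitively)."""
    present = {name.lower() for name in headers}
    return [name for name in required if name.lower() not in present]


# Sample headers as they might be captured from a canary node (made-up data).
canary_response_headers = {
    "content-type": "text/html",
    "strict-transport-security": "max-age=31536000",
    "x-content-type-options": "nosniff",
}

print(missing_security_headers(canary_response_headers))
# ['Content-Security-Policy']
```

Running the same check against baseline and canary responses would show whether the security change actually took effect on the canary.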
4 Phases of Canary Testing
The process for canary testing and development is simple and has only four stages:
Plan
Planning is essential in the first stage of canary testing. The goal is to identify the expected output of the test. Knowing what to look for can help you figure out how to deploy canaries to test new features.
You need to pay attention to key performance indicators (KPIs) that can indicate success or failure of the release. Some examples of what to monitor include CPU and memory utilization, latency, and internal error counts.
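A plan like this can be written down as data so that it is checked the same way every time. The following is a minimal sketch; the KPI names and limits are hypothetical and would come from your own monitoring setup.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KpiLimit:
    name: str
    max_value: float  # the canary fails this KPI if the observed value is higher


def breached_kpis(observed: dict[str, float], limits: list[KpiLimit]) -> list[str]:
    """Return the names of KPIs that exceed their limit.

    A KPI with no observed value is treated as breached, because a
    silent metric should not count as a healthy one.
    """
    breached = []
    for limit in limits:
        value = observed.get(limit.name)
        if value is None or value > limit.max_value:
            breached.append(limit.name)
    return breached


# Hypothetical limits agreed on during planning.
limits = [
    KpiLimit("cpu_percent", 80.0),
    KpiLimit("p95_latency_ms", 300.0),
    KpiLimit("internal_error_count", 5.0),
]

observed = {"cpu_percent": 65.0, "p95_latency_ms": 420.0, "internal_error_count": 2.0}
print(breached_kpis(observed, limits))  # ['p95_latency_ms']
```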
Another important consideration is thresholds. In order to canary-test a new feature, you need to identify a random subset of users. Do you want to route 5% or 10% of your user base to the canary?
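One common way to pick such a subset is to hash each user ID into a bucket, so the choice is effectively random across users but stable for any one user. This is a sketch under assumptions: the function name, the salt value, and the user ID format are all hypothetical.

```python
import hashlib


def in_canary(user_id: str, canary_percent: float, salt: str = "release-1") -> bool:
    """Deterministically assign a user to the canary group.

    The same user always gets the same answer for a given salt, and
    raising canary_percent only ever adds users to the group.
    """
    digest = hashlib.sha256(f"{salt}:{user_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 10_000  # 0..9999
    return bucket < canary_percent * 100


users = [f"user-{i}" for i in range(10_000)]
share = sum(in_canary(u, 5) for u in users) / len(users)
print(f"{share:.1%} of users routed to the canary")  # roughly 5%
```

Changing the salt for each release reshuffles the buckets, so the same users are not always the first to receive new code.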
Deploy
After you define how the canary test should segment your user base, you’ll need to deploy the canary to a staging server. You will need to prepare the following:
- Deployment checklist
- Configuration files and build artifacts for each canary
- Testing scripts to route traffic to canaries
You then need to create canary nodes and place them behind a load balancer, which splits traffic between the canary and the existing nodes. Clone your production environment to create infrastructure similar to the already-active software environment. One of the clones is the original, or baseline; if the new code doesn’t work, you’ll roll back to this clone. There should be at least two versions of the production application, but you can clone more, depending on how many features you want to test.
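The traffic split between baseline and canary nodes can be illustrated with a weighted random choice, which is one simple load-balancing policy. In practice this is normally configured in the load balancer itself rather than written by hand; the backend names and weights below are assumptions for the example.

```python
import random
from collections import Counter


def pick_backend(weights: dict[str, float], rng: random.Random) -> str:
    """Choose a backend for one request, in proportion to its weight."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


rng = random.Random(42)  # seeded so the simulation is repeatable
weights = {"baseline": 95, "canary": 5}

# Simulate 10,000 requests and count where they land.
hits = Counter(pick_backend(weights, rng) for _ in range(10_000))
print(hits)  # roughly 95% baseline, 5% canary
```

Note that this splits individual requests, so one user may see both versions. Hashing on a user ID instead keeps each user on a single version.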
During the planning phase, you should also set the boundaries for the test period. Canary tests typically run from minutes to hours, so close monitoring is essential.
Analyze
Once the new code has been routed to the selected user base, traffic flows to both the baseline and canary nodes. During this phase, the team tests the new version. Collect data for the metrics you specified in the first step, and use these metrics to confirm that the canary is running and to check its health. Obtain data on latency, memory usage, number of errors, and capacity. Logs can give you details on bottlenecks or bugs encountered by users.
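Two of these metrics, error rate and tail latency, can be computed from raw samples with a few lines of code. The sketch below assumes you can export HTTP status codes and request latencies per node group; the sample numbers are made up.

```python
def error_rate(status_codes: list[int]) -> float:
    """Share of responses that are server errors (HTTP 5xx)."""
    if not status_codes:
        return 0.0
    errors = sum(1 for code in status_codes if 500 <= code <= 599)
    return errors / len(status_codes)


def p95(latencies_ms: list[float]) -> float:
    """95th percentile latency using the nearest-rank method."""
    if not latencies_ms:
        raise ValueError("need at least one latency sample")
    ordered = sorted(latencies_ms)
    rank = (95 * len(ordered) + 99) // 100  # integer ceil(0.95 * n)
    return ordered[rank - 1]


# Made-up samples for the two node groups.
baseline = {"codes": [200] * 98 + [500] * 2, "latency": [120.0] * 95 + [300.0] * 5}
canary = {"codes": [200] * 90 + [503] * 10, "latency": [130.0] * 90 + [900.0] * 10}

for name, sample in (("baseline", baseline), ("canary", canary)):
    print(name, error_rate(sample["codes"]), p95(sample["latency"]))
```

Comparing the canary against a baseline measured over the same time window helps separate problems caused by the new code from problems caused by unusual traffic.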
Roll
The information provided by monitoring can help you make an informed decision on what to do next. When issues are discovered, monitoring will give you information that can help the team fix them. Once the team is comfortable with the process, you can roll out releases against different criteria or repeat the test with a different subset of users.
There are a few options for proceeding:
- Release: the test was successful and you can now release the code to all users across your infrastructure.
- Increase threshold: the test was successful, but you need more information, so you run another canary test and expose the new version to a higher percentage of users.
- Roll back: the test failed and you decide to roll back or revert to a previous version. It is essential to fix the issues discovered before running any other tests.
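The three options above can be expressed as a small decision function. This is a simplified sketch: the inputs (a single health flag and a minimum sample size) are assumptions, and a real rollout decision would usually weigh more signals.

```python
from enum import Enum


class Decision(Enum):
    RELEASE = "release"
    INCREASE_THRESHOLD = "increase threshold"
    ROLL_BACK = "roll back"


def decide(healthy: bool, sample_size: int, min_sample_size: int) -> Decision:
    """Map canary results to one of the three options.

    healthy: True if no KPI exceeded its limit during the test.
    sample_size: number of users (or requests) the canary actually served.
    min_sample_size: hypothetical minimum needed to trust the result.
    """
    if not healthy:
        return Decision.ROLL_BACK
    if sample_size < min_sample_size:
        return Decision.INCREASE_THRESHOLD
    return Decision.RELEASE


print(decide(healthy=True, sample_size=200, min_sample_size=1000).value)
# increase threshold
```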
Conclusion
Canary deployments have revolutionized the way developers approach software releases, adding an extra layer of security and assurance. By rolling out updates to a select group of users, developers can get real-time feedback, making it easier to catch and rectify bugs or vulnerabilities.
The phased approach of planning, deploying, analyzing, and deciding on the rollout helps the software reach the wider audience in its best form. As cyber threats become increasingly sophisticated, adopting strategies like canary deployments for security testing will become indispensable for organizations aiming for both functional excellence and robust security.