August 2, 2021

Apps + Coffee Connection: Elevate your releases with progressive delivery

This blog is part of a series that delivers insights from the monthly meetups of Bugsnag’s Apps + Coffee Connection (ACC) community.

In June, ACC members gathered online for a roundtable discussion about progressive delivery. Moderated by James Smith, Bugsnag founder and SVP of Products, the conversation focused on how to elevate your releases with progressive delivery, a topic Bugsnag also highlighted at the Game Developers Conference (GDC) in July.

What exactly is progressive delivery? 

The hot new term “progressive delivery” encompasses multiple approaches to turning code on and off after it has been delivered to customers. If the term sounds familiar, that’s probably because backend teams have long used related approaches such as blue-green deployments. Traditionally, progressive delivery has also described strategies like alpha testing, beta testing, and “dogfooding.”

These days, excitement around progressive delivery centers on newer techniques, including feature flagging, toggling, experimentation, and A/B testing. These approaches are popular because they let engineers turn code paths that already exist in the application on or off at run time, based on rules such as which customer segment a user belongs to or what percentage of traffic should see the change.

If you’re uncertain whether you’re using progressive delivery, ask yourself: Do all of your customers receive all of the code and features you release, or only a subset? If it’s a subset, then you’re using progressive delivery.
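To make the idea concrete, a percentage-based rollout can be as simple as hashing a user ID into a stable bucket. This is a minimal sketch, not any particular vendor’s implementation; the flag name, user ID, and 10% figure are hypothetical:

```python
import hashlib

def rollout_flag(flag_name: str, user_id: str, percent: int) -> bool:
    """Deterministically decide whether a user gets a feature.

    Hashing the flag name plus the user ID gives each user a stable
    bucket from 0-99; the flag is on for users whose bucket falls
    below `percent`.
    """
    digest = hashlib.sha256(f"{flag_name}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 100
    return bucket < percent

# The same user always gets the same answer for the same flag,
# so a 10% rollout stays consistent across sessions.
enabled = rollout_flag("new-checkout", "user-42", 10)
```

Because the decision is a pure function of the flag and the user, growing the rollout from 10% to 50% only adds users; nobody who already had the feature loses it.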

Innovate? Yes. Change processes? Pass. 

Do you know what’s considered the hardest part about introducing progressive delivery to an organization? Getting buy-in from engineers. (Remarkably, this exact sentiment was expressed at the May ACC when discussing the introduction of new team structures like pods!) As one participant explained, “Even though software engineers like to innovate with software, they are very conservative with processes. They like to stick with what works.” 

If you want to benefit from progressive delivery but are up against engineers who are focused on potential downsides, it’s important to emphasize the long-term positive impacts. Progressive delivery is widely viewed as a great way to ship code faster while using controls that minimize risk to the broader customer experience. 

However, clarity must be built into the process. Participants shared some sage advice for adopting progressive delivery techniques, including: 

  1. Have a lot of conversations with engineering, product, and the broader team to prepare for the impact of adopting progressive delivery, including a potential slowdown on releasing features in the short term; 
  2. Take it slow and ease into new processes to make sure they work for everyone; and 
  3. Define clear rules and guidelines to keep teams and individuals aligned on progressive delivery and what outcomes to expect. 

Who controls progressive delivery?

Not surprisingly, every organization has a slightly different way of doing things in order to match company dynamics and culture. Answers vary for who’s in control of running experiments or setting feature flags, but it’s important to have clear ownership.

One company’s approach is to have everything customer-facing owned by the product team. As this participant explained, “At our company, product teams are 100% owners of feature flags and decide which customers receive a feature, what percentage of customers and when, and whether to turn something off if complaints start to come in. Engineers only have access to feature flags for technical emergencies.”

To help keep things running smoothly, feature-management products like LaunchDarkly and Split are now being used for the permission models built into their dashboards. Because some companies adopted progressive delivery before these off-the-shelf solutions were available, homegrown tools are also common.

The double-edged customer sword 

The intent behind experiments and A/B testing is to measure performance of new code or features on a subset of customers. Naturally, the question arises: If you tell customers that they are part of an experiment or A/B testing group, can you receive honest feedback? 

As with everything, techniques differ, as do the pros and cons. Some participants advocate for keeping customers in the dark in order to receive unbiased reactions. As one stated, “In order to get fair statistics on usage, it’s better not to tell customers and make them use it. People might not use a new feature if they know it’s being tested and might have a short shelf life.”

Others have had success with hand-picking a selection of beta testers who provide feedback to help shape the product. As one participant explained, “We use beta testers when we know we need to do something, but we are not sure what’s the best way. We talk to testers and say we want to launch and test this feature, and we want you to be involved.” 

How do you measure success? 

Two main categories of metrics should be considered for measuring the health and success of a phased rollout. The first is the expected outcome: Is it doing what we thought it would do? This product metric speaks to whether the new feature increases or decreases the metric you care about. 

The second category is around reliability and asks the question: Did it break anything? This shared metric can point to one or more parts of reliability, which encompasses performance, availability, and stability.

As one participant stated, “We always define success metrics with the product and performance metrics, so we pay attention to whether we get complaints and whether it has a negative impact on our system performance.”
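The two metric categories can be checked side by side during a rollout. Here’s a minimal sketch comparing a treatment cohort against control on a product metric (conversions) and a reliability guardrail (error rate); the field names and the threshold are hypothetical, chosen only for illustration:

```python
def evaluate_rollout(control: dict, treatment: dict,
                     max_error_rate_increase: float = 0.001) -> dict:
    """Compare a treatment cohort against control on both metric categories.

    Each dict carries 'conversions', 'users', 'errors', and 'requests'.
    Returns the product-metric lift, the change in error rate, and
    whether the rollout stays within the reliability guardrail.
    """
    def rate(numerator: int, denominator: int) -> float:
        return numerator / denominator if denominator else 0.0

    product_lift = (rate(treatment["conversions"], treatment["users"])
                    - rate(control["conversions"], control["users"]))
    error_delta = (rate(treatment["errors"], treatment["requests"])
                   - rate(control["errors"], control["requests"]))
    return {
        "product_lift": product_lift,       # did it do what we thought?
        "error_delta": error_delta,         # did it break anything?
        "healthy": error_delta <= max_error_rate_increase,
    }

result = evaluate_rollout(
    control={"conversions": 100, "users": 1000, "errors": 10, "requests": 10000},
    treatment={"conversions": 120, "users": 1000, "errors": 12, "requests": 10000},
)
```

A positive `product_lift` with `healthy` still true mirrors the participant’s practice: watch the product metric and the system’s performance at the same time.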

Number of active experiments is company-dependent 

Progressive delivery raises one final question that participants tackled: How many experiments is too many experiments? 

And, as you may have guessed by now, the answers vary. Some companies commit to only conducting one A/B test at a time and never do more than two variants (only A or B) because it’s too hard to measure success otherwise. Smaller organizations tend to stick to one or several experiments at a time, often including different customers in each experiment.

As one participant explained, “We decided a long time ago that we never do three variants, such as A, B, and C. If we have more than two variants, it’s much harder to measure. In fact, we try to never do more than one experiment at a time because that can also skew the results.”
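The discipline this participant describes, at most one experiment per customer and never more than two variants, can be enforced with deterministic bucketing. This is a sketch under the assumption of hash-based assignment; the experiment names are hypothetical:

```python
import hashlib

# Hypothetical experiment names; order determines bucket assignment.
EXPERIMENTS = ["new-onboarding", "pricing-page"]

def _bucket(key: str, n: int) -> int:
    """Map a key to a stable bucket in [0, n)."""
    return int(hashlib.sha256(key.encode()).hexdigest(), 16) % n

def assign(user_id: str):
    """Place a user in at most one experiment, with only variants A and B.

    The first hash partitions users across the experiments plus one
    holdout group that sees no experiment at all; the second hash
    picks variant A or B within the chosen experiment.
    """
    slot = _bucket(f"experiment:{user_id}", len(EXPERIMENTS) + 1)
    if slot == len(EXPERIMENTS):
        return None  # holdout: this user is in no experiment
    experiment = EXPERIMENTS[slot]
    variant = "A" if _bucket(f"{experiment}:{user_id}", 2) == 0 else "B"
    return experiment, variant
```

Because the experiment slots are mutually exclusive, no customer’s behavior in one experiment can skew the results of another, which is exactly the concern the participant raised.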

Alternatively, large companies like Pinterest are renowned for running hundreds of experiments on their customer base at one time, and the same customer may be included in multiple concurrent experiments. The latter is possible with a high volume of customers and a data science team.

Curious to learn more about the ACC community? Register here to join our vibrant community and monthly discussions.

Bugsnag helps you prioritize and fix software bugs while improving your application stability
Request a demo