How stability targets keep your apps on the mark

James Smith

Should applications be released when we don’t know how stable the user experience is? The obvious answer is the right one: of course not.

Application stability is the most important metric that software companies can use. Yet many organizations don’t measure stability at all. Or, if they track errors, this information isn’t used by engineering and product organizations to make informed development decisions.

Stability is the metric everyone should know and use

Today, many DevOps teams track error rates, which are the inverse of stability. And that’s great! It’s helpful to know what your error rates are and how often your apps are failing.

However, this data is often not visible to other parts of the organization, which is why other teams may view it as something outside their purview. Nothing could be further from the truth.

Stability is a metric that everyone—and I do mean everyone—should care about. From top to bottom, stability should be discussed on a regular basis and used as a common language between teams to answer the all-important question: do we build new features or fix bugs?

In short, stability bridges the gap between product and engineering teams, but it also consolidates business decisions across all teams in the organization. Win-win.

If your goal is to release high-quality apps that customers will use and enjoy (obviously), then it’s time to add stability scores and targets to your arsenal. Here’s how.

Step 1: Address the missing link in your metric stack.

To measure stability, you must collect and analyze data. If you can’t use your data, there’s no point in having it. That means your first step is to ensure measurement tools are in place. By adopting a stability management and error monitoring tool like Bugsnag, you can quantify and analyze your stability with ease.

In case you’re wondering, there is precedent for measuring stability. You already talk about uptime and availability of applications, right? Stability follows the same logic. Think about the metric stack in the context of building a new car:

  • Does my car have wheels? (Is my infrastructure available?)
  • Do I have any flat tires? (Is my application working?)
  • Is my car fast? (Is my application performant?)

Stability answers the second question and demonstrates where the rubber hits the road. Quite frankly, none of these questions represents something that only DevOps should care about. Developers care. Product owners care. And we all know customers will care if their tire blows out during their first test run.

Step 2: Align behavior and decisions with stability targets.

With a stability monitoring tool in place, you can easily see your application’s stability scores, which are calculated using real-time error rates and sessions data. These scores give you the percentage of successful app interactions in each release.
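As a rough illustration of what such a score represents, here is a minimal sketch that assumes stability is measured per session: the share of sessions in a release that ended without an error. The function name and its parameters are hypothetical and are not taken from any monitoring tool’s API; a real product may define and weight the score differently.

```python
def stability_score(total_sessions: int, errored_sessions: int) -> float:
    """Return the percentage of sessions that completed without an error.

    Assumption: one session counts as one app interaction, and a session
    is either error-free or not. Real tools may define this differently.
    """
    if total_sessions <= 0:
        raise ValueError("total_sessions must be positive")
    if not 0 <= errored_sessions <= total_sessions:
        raise ValueError("errored_sessions must be between 0 and total_sessions")
    return 100.0 * (total_sessions - errored_sessions) / total_sessions


# Example: 1,000 sessions in a release, 10 of which hit an error.
score = stability_score(total_sessions=1000, errored_sessions=10)
print(f"Release stability: {score:.2f}%")
```

Under this definition, the error rate is simply 100 minus the stability score, which is why the two are described as inverses.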

Then the fun begins. You’re in the driver’s seat with tools like Bugsnag that allow you to set your own goals and targets for stability.

The best way to consider stability targets is to think about the behavior you want to encourage and how to best enable your teams to agree on what action needs to be taken next. That means setting two numbers: critical stability and target stability.

Critical stability: drop everything and fix

Let’s go back to our car analogy. If you produce one thousand vehicles, and one percent of your customers immediately get flat tires, you’ll have ten angry customers on your hands. Do you think that’s too many customers to potentially lose?

The same principles apply in software. We know there will always be bugs (and that’s okay, by the way), so the real question is: how many bugs is too many?

The answer to that question will help you determine your critical stability, which serves as your team’s SLA. For example, if you set 99 percent as your critical stability, that means a full one percent of your customer base is experiencing problems when you hit this lower threshold. Since there’s a one-to-one mapping between stability and customer happiness, you can assume these customers are having a bad experience and will likely drop your app.

This critical stability target should be an easy number to rally around.

It’s the point where your whole team agrees that the product’s on fire and needs attention immediately.

Every engineer stops what they are doing, halts their work on building new features, and fixes bugs.

Target stability: balance when to fix bugs vs build features

If critical stability is your “sh*t is on fire” moment, then target stability is your aspirational goal. It’s your SLO, a metric that many organizations communicate externally to set appropriate user expectations.

Realistically, we all know it’s impossible to achieve 100 percent stability, especially if you want to move quickly. Instead, companies should strive to accomplish a delicate balance between developing new features to stay competitive and maintaining a stable and crash-free app.

That sweet spot can only be reached through a constant trade-off between speed of innovation and stability. If you aim for perfection, you harm innovation because bugs are always an inevitable outcome of new development. If you move too fast, you risk your stability and may hurt your customer base.

Ask yourself: what’s our real goal? Do we want to innovate faster? Can we spend less time fixing bugs? By agreeing upon a target stability where everyone believes you’re doing great, you can balance the need to fix bugs with the drive to build new features.
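To make the two numbers concrete, the decision they encode can be sketched as a small function. This is only an illustration: the names `Action` and `next_action` are hypothetical, the example thresholds are arbitrary, and treating a score exactly equal to a threshold as meeting it is an assumption your team may choose differently.

```python
from enum import Enum


class Action(Enum):
    FIX_NOW = "drop everything and fix bugs"
    BALANCE = "weigh bug fixes against feature work"
    BUILD = "on target: keep building features"


def next_action(score: float, critical: float, target: float) -> Action:
    """Map a release's stability score to a suggested team behavior.

    Assumption: a score equal to a threshold counts as meeting it.
    """
    if critical > target:
        raise ValueError("critical stability must not exceed target stability")
    if score < critical:
        return Action.FIX_NOW   # below the SLA: the product is on fire
    if score < target:
        return Action.BALANCE   # above the SLA but short of the SLO
    return Action.BUILD         # at or above the aspirational goal


# Example with arbitrary thresholds: critical 99.0, target 99.9.
for score in (98.5, 99.5, 99.95):
    print(score, "->", next_action(score, critical=99.0, target=99.9).value)
```

The middle band is where the negotiation between product and engineering happens; the two outer bands are meant to be easy calls.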

Step 3: Use a collaborative process to set stability targets.

Right now, you’re probably thinking to yourself: how does one come up with these target stability and critical stability numbers? Great question.

First of all, not to worry: stability targets aren’t something any one person is going to know how to set precisely from the get-go, and everyone in your organization will likely think about it in a slightly different way. Here are some tips:

  1. If you’ve been using Bugsnag for a while, you can use historical data as a guide. Look at your successful (and less so) releases and what those stability scores look like. Bugsnag also provides smart defaults that are based on a 30-day average of your stability scores.
  2. If you don’t have any data, here’s the good news: you can simply pick your targets! It really doesn’t matter what you set them at today. Select something that makes sense as a starting point and begin to measure against it.
  3. Adjust your stability targets over time. You’ll start to get a feel for what the right goals are for your apps, for your teams, and for your customers, which will help determine when to take more risks versus fix bugs.
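If you want a starting number from your own history, one simple approach is to average recent daily scores. The sketch below assumes you already have one stability score per day, oldest first; the function name `suggested_target` is hypothetical, and this is not a description of how Bugsnag computes its defaults.

```python
from statistics import mean


def suggested_target(daily_scores: list[float], window: int = 30) -> float:
    """Average the most recent `window` daily stability scores.

    Assumption: `daily_scores` is ordered oldest to newest, one value per day.
    If fewer than `window` days exist, all available days are used.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    if not daily_scores:
        raise ValueError("need at least one daily score")
    recent = daily_scores[-window:]
    return mean(recent)


# Example: a short history of daily scores (made-up numbers for illustration).
history = [99.0, 100.0, 99.5, 99.5]
print(f"Suggested starting target: {suggested_target(history):.2f}%")
```

Treat the result as a conversation starter rather than a verdict; the point of step 3 is that the number gets revisited as you learn more.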

Stability targets give engineering leaders and product owners common ground for a conversation about goals. These discussions will almost always be a negotiation, and stability targets will evolve. The point of adjusting your targets is to understand where your stability stands right now while keeping an eye on how and when to move faster.

Step 4: Establish ownership that promotes organization-wide commitment.

You might also be wondering who should own your stability targets. There is no right answer.

Some argue that product leads should take ownership since they understand the revenue impact of bugs and crashes and the importance of building new features. However, product teams often need buy-in from development. And, typically, products like Bugsnag are brought in by engineering teams that understand the business and have high-functioning relationships with product teams.

In an ideal world, you want to achieve some kind of middle ground because stability is important and touches on pain points for multiple teams. With stability as a shared metric, product teams will better understand technical debt, and development teams will have a stronger recognition of how stability impacts the product roadmap. Any engineer who is tempted to think, “I’m just working on the next feature,” will instead start to consider why these features are being built and the impact they’ll have on the business.

Therefore, while one person may lead the charge, the goal is to have both teams contribute to the conversation and commit to the goals. When everyone is aligned with metrics and targets, the outcome is bound to be stronger stability and better customer experiences.

If the goal is quality, the measurement is stability.

When we talk about stability, what’s really being discussed is code quality and product quality. Everyone knows quality is a good thing, but historically, it’s been challenging to measure it with any degree of accuracy.

Stability scores change all that. You now have a tactical method to measure and talk about quality. Coupled with stability targets, your product and engineering teams can decide when and how to move that slider between developing features and fixing bugs, and both teams do so with a complete understanding of the impact these decisions have on stability.

And that’s what we call a stable relationship.