Does the emergence of feature flags affect the interpretation and utility of DORA metrics?

On this week’s episode of Dev Interrupted, host Dan Lines and Ariel Perez, VP of Engineering at Split.io, discuss the state of DORA metrics and whether they need reimagining in a world of feature flags. Listen as Ariel explains why he believes feature flags are more than a tool and have begun to reshape our understanding of software development and the metrics we use to measure it.

Dan and Ariel also touch on how feature flags can drastically reduce lead time and mean time to recover, and conclude their chat with an intriguing look at the granular nature of control in the modern software engineering landscape, where the unit of control has shifted from the application as a whole to individual features.

Episode Highlights:

  • (2:05) Introduction
  • (7:15) Pursuing productivity at the expense of effectiveness
  • (15:50) Project allocation discussion
  • (24:55) Reimagining DORA metrics
  • (31:30) Radical ideas for DORA
  • (41:25) How teams should invest in shipping faster

Episode Transcript:

(Disclaimer: may contain unintentionally confusing, inaccurate and/or amusing transcription errors)

Dan Lines: Hey, what's up everyone? Welcome to Dev Interrupted. This is your host, Dan Lines, LinearB co-founder and COO. Today, we're joined by Ariel Perez, a VP of Engineering at Split.

Ariel, welcome to the show.

Ariel Perez: Thanks for having me, Dan. I'm very happy to be here.

Dan Lines: Yeah. Super awesome to have you on today.

Really been looking forward to having this conversation. You're really passionate about engineering effectiveness, which is something I care a lot about and that we care a lot about at LinearB. I know you also have thoughts on DevOps and DORA metrics and, of course, Split's feature flags and how they play into these topics.

But before we jump into our main topics today, can you give the audience a background about yourself and how you got into engineering?

Ariel Perez: Definitely. Thank you for that. I was born and raised in New York. I got into engineering as far back as high school, dating myself a bit.

Those were the days when DHTML was a thing, so it got me really into playing with CSS and JavaScript and making things move dynamically on a page, and that really brought out my passion for programming. So I went to college for programming; I was a computer science major. At that time, technology jobs in New York City were primarily in the financial services space.

Facebook had just started. Google had only just been around, so people didn't really go to them just yet. Amazon was fairly new as well. So I worked in financial services, did that for many years, and grew through many roles: individual contributor, managing teams, lead engineer. I got to travel around the world quite a bit, working with international teams.

Then I decided to change it up and go back to startups, smaller companies. So I bounced around a few startups, built my own startup for a while, did some consulting, then came back to a big company. I was working at JPMorgan Chase on the Chase side. That's what got me really passionate about feature flagging, cuz I got the opportunity to build my own feature flagging platform there for use at scale.

Fast forward a few years, I got to take that to the UK and helped build Chase in the UK. Awesome experience, building a digitally native, mobile-native bank. Then I landed here at Split, really putting together my ability to understand feature flags, having built my own, and a passion particularly for measurement and learning.

Dan Lines: That's amazing. One thing that I really like about that background: oftentimes we'll talk to people more on the West Coast with that startup vibe of maybe jumping from company to company until you hopefully get a big hit. But you were in the New York City area, so more of those financial firms and larger organizations. What did you learn about the difference between a startup and a much larger company?

And the reason I'm asking is that now, of course, you're one of the VPs of engineering at Split, and a lot of people want to progress their career to get to that type of opportunity. So any tips on how we can do that based on your background?

Ariel Perez: Definitely. We can speak in broad strokes, but in general, one of the key differences I saw in my first move from big-company financial services to a small startup is that there's only you, meaning there aren't more people coming, there aren't more people available.

So you tend to do everything. You become a jack of all trades, which can be challenging at first and maybe daunting, but you will also learn faster than you've ever learned before, versus generally being in a larger company where your role is a lot more clearly defined. You're in your particular area and they have experts in different areas.

Lots of specialization. So that's the first big thing. The second big thing I tend to see is that bigger places tend to move a lot slower. Decision processes take longer, approvals take longer, getting projects defined takes longer. They might not be as nimble or agile in how they define projects and get them developed, while in a startup it's, cool, can you ship it tomorrow? And you figure out every possible way to ship it tomorrow.

A lot more ability to move fast and remove a lot of process while also maintaining a lot of rigor in your decision making. And if I were to say one last thing: large companies really care about efficiency. Productivity and efficiency are the things that really stand out to them.

So they'll start thinking about how to invest in tools that help make engineers much more efficient, cuz they're so expensive and they really care about the bottom line. Startups maybe don't care as much, or at least they didn't; in this climate now, they definitely do. This climate of not growth at all costs, but very paced growth and measured growth.

Efficiency and cost are becoming a lot more important for smaller companies as well.

Dan Lines: Yeah, that's actually the perfect lead-in to our first topic, and thank you for sharing some of those insights from your background. I couldn't agree more. If you're scaling your engineering team, growing your engineering team, or you're working at a large organization, certainly the business is going to ask of you, especially in this economic climate, what are you doing for efficiency?

Where could we cut costs? How do we make people more productive? Because the more engineers you have on the team, that efficiency can multiply by a hundred x, 500x, a thousand x. I think that's a wonderful point. One of the things we were chatting about in our pre-production meeting before we got on here: you said something like, teams are often pursuing productivity and efficiency at the expense of effectiveness.

Can you unpack what you meant by that or that type of mindset?

Ariel Perez: Absolutely. And actually it ties very closely to what you just said, where at a bigger company you might say, let's hire more people, add more engineers, get more value out of that. There's a big aspect of large companies especially, but it's also bled into every other company.

As we think about software development, one thing that has ruled the day for the last, I don't know, almost a hundred years is this idea of Taylorism. Taylorism came out of factory work, right? How do we get very efficient at squeezing the most value out of every single part of this factory that is generating things, building things? That worked in a world where everybody was building widgets, right?

And every widget is exactly the same. I cast a die once and I ship the exact widget over and over. 

Dan Lines: This is manufacturing, right? I need to manufacture a widget for a billion people. What do I do?

Ariel Perez: Exactly. So a lot of those ideas came into the management space: how we think about managers, how we think about people, and people as resources.

So how does that translate to the engineering world? I need more engineers, ideally at a lower cost, and I need every engineer working individually so I can maximize how much I get out of each engineer, because they're an expensive resource. Now, here's the key thing that misses. Yeah, you might be very efficient doing that, maybe.

And we can talk about that, and your product is great at highlighting where you're not efficient, but you're definitely gonna be productive. You're gonna be busy. Let's keep them busy, let's keep them maximally utilized. Now, some challenges occur when you do that. Anyone working in engineering, you'll know what happens when your machine and your CPU are at max utilization.

You know what happens when your RAM is at max utilization? Things fall apart very fast. So the first key thing is that software engineering, and in general knowledge work, is not like manufacturing work. You cannot reproduce the same exact widget over and over again. You could get the same people, the same problem, the same timeframe.

Try to build the same thing and you're not gonna get the same result, because the context is different. It's always different. Software engineering is complex. That's the key piece that we fail to understand. Manufacturing is simple, maybe complicated, but software engineering is complex. And what complex means is that you don't know how the parts work together and how they interact with one another. You change one thing.

You don't know what's gonna happen. You add one thing, you don't know what's gonna happen. You can try to guess, but you will not know. So with that mindset, saying let me max out utilization for every engineer and let me try to reproduce the exact same thing doesn't work. What kinds of things do you run into when you try to do that?

Let's maximize work in progress, right? Five engineers take five stories in a sprint, assuming Scrum, and they all work individually. They're maximally utilized. But then what happens? Engineer one opens a PR. Now engineer one has to wait for engineer two or three or four or five to go review that PR.

So they sit and wait. What do you do when you sit and wait? You go start a new task, right? Because you wanna be productive, you wanna be utilized. The other engineers have to stop what they're doing to go review that PR. So now I stop, I go review a PR. In the best case, the PR closes and we ship. In most cases, you go back and forth a few times.

So imagine how many times we're interrupting each other and waiting for each other. There's a lot of wait time in there when you have the maximum amount of things in progress. So maybe reduce work in progress.

Dan Lines: Yeah, just a comment there, because honestly you're hitting on a problem that's near and dear to me, so I have to make a comment here before you go on. The other thing about the PR and the situation that you're talking about is every pull request is different. It's not like you're getting the same change set every single time.

Like in a factory, for example, the thing that's being passed from machine A to machine B is exactly the same. Every PR is completely different: the risk of the PR, the size of the PR, who needs to review it, what does it impact? And that's why I put a bunch of my passion into this gitStream workflow automation tool I've been working on, because of that exact problem.

And that's probably where you would say there's a science to engineering, to software development, and there's an art. Absolutely. And the art is identifying, okay, everything is actually different. What can we do for each one of these different change sets? So I definitely had to comment there. Back to you, to where you were going.

Ariel Perez: Yeah, thank you. And actually I'll continue on that just for one moment. Let's talk about gitStream, right? One of the things gitStream is aiming to do is define how simple, how complicated, or how complex this particular PR might be, and help you optimize for that, right?

Now, here's the thing I'll say is a drawback, not of gitStream, but of this whole PR-based flow: it still generally requires asynchronous review of code. One engineer works on this code; a separate engineer, when I'm done, later reviews my code, and we go back and forth. So what do many tools try to do to optimize that and make it more efficient?

Give you rules, make the PR smaller, reduce the scope. There are all these things about making things smaller so that asynchronous, out-of-context review is easier, which is great. Let's improve that, cuz you know what? 99% of teams, this is what they're doing. So let's help them do that.

But then that gets me to the second thing, beyond work in progress: it's actually a lot more effective, and I'll talk about why effective, to have engineers working together. So one approach might be, can two engineers review the PR together in real time? That's often much more effective than I open my PR,

I walk away, you come review my PR, and you have asynchronous communication. With asynchronous communication there's wait time, we miss each other, we might not catch what each other is saying. If we just sit and talk through this thing together, it's so much more effective. You close that PR a lot faster. But let's take that even a step further.

Why have PRs at all? Now, that might sound like blasphemy to some teams, but how can I potentially remove PRs while keeping the things that we want from PRs: quality control, sharing best practices, learning? What if two engineers work on a story together? That sounds like a radical idea to the Taylorists, because, no,

hold on, I have two engineers, I pay them a lot. If they're both working on one story, I'm getting one story out of them. What they're thinking about is the very short-term transactional cost of two engineers working on one story. What they're not seeing is that in software engineering, you pay the cost at the beginning and then you have maintenance costs for the rest of the life of that software, which will be changed by someone else again.

And if you have two engineers working on it, the likelihood is that the lifetime cost of that code drastically goes down. Cuz here's, I think, something great I've seen in your product: the rework rate. You'll probably find that rework rate drastically reduces when you have more people working on that story at once.

So you might pair, you might even ensemble or mob program on that thing: three to five engineers. Oh my god, super blasphemous, right? Five engineers working on one story, that cannot be efficient, that cannot be productive. That's right, it's not. But it's effective. When they ship that story, odds are they'll never touch it again

from a bug perspective, from an issue perspective. So when it's done, it's shipped, it's out, onto the next thing. No PR needed, merged straight to trunk, cuz you have that level of quality and teaching across the board.

Dan Lines: You know what that reminded me of? Something like pair programming or extreme programming. There were some of these movements early on in Agile to say, hey, let's get some of these engineers working together, let's put the cost more upfront instead of later on in the life cycle.

Yeah. And it sounds great, right? It honestly sounds great, but at a large engineering organization that's not used to doing that, right? I'm not used to doing that. I'm working at, like you say, a large New York City financial company or something like that.

And by the way, I don't know if you do or you don't, but getting to a situation of taking more of that upfront cost earlier on, have you dealt with that at all? Or is it a transformation? Is it a mindset shift?

Ariel Perez: All of the above. Yeah, it's not easy, right? There are a few things that we've done in this industry, in engineering, on top of the fact that most management in any company is also Taylorist. As engineers, we tend to stick to this trope that the engineer is that person who likes sitting by themselves and doesn't like talking to people, right?

Like, I work as an individual contributor. You've got this trope of the 10x engineer who comes in and builds everything by themselves: I'm amazing and I save the company. A lot of these things push people to a culture of heroes, individual heroes, of people who don't like working with one another.

And we kind of self-select and hire those kinds of people. So that's one challenge you run into with your teams: I am an engineer cuz I like working by myself. That's a trope, and you self-select for that. So that's one challenge to work through: how to get people to work together, different ways to work together, where you can define working together as truly collaborating as opposed to cooperating.

That's one thing you have to contend with. You definitely have to contend with management, fighting for that idea that, hey, five engineers working on one thing is better than five engineers working on five things. That's a hard one to sell.

Dan Lines: Five things sounds better to me. I'm a CEO. I'm doing five things. Now you're telling me to do one thing?

Ariel Perez: I'm gonna do one thing, exactly. Yeah. With all these expensive engineers, right? It's this fallacy that seeing progress is better than seeing completion: I'm doing things, I'm busy, versus I got things shipped, I'm actually having impact.

So in general, for anything like this, whether it's to management, to engineers, or to the whole organization, the thing that really helps is: this is an experiment. We're gonna try something out. Let's test an idea. Just like any team that has retros, the way you improve is you continuously introduce new ideas as something to try out.

And the key thing about something to try out is you define key success criteria. What do I wanna see out of this at the end to say it's successful or we want to continue it? And when's the stopping date? Generally, everybody can do something they don't like if they know they can see an end in sight.

So you couch it as an experiment: let us try this out, and then what are the evaluation criteria to continue? And you make those experiments small. You'll find that you can start making those changes. But these are radical changes to both the organization and to engineers, so you introduce 'em in small steps and keep trying, iterating on different things.

Dan Lines: While you were talking there, a few things came to mind for me, if you're someone who's relating to these concepts and wants to try it out. And I'm thinking about the business, cuz one of the things all of us engineering leaders deal with is the business.

These are very smart people, non-technical people. Oftentimes you're talking to the CEO, or you're talking to the head of sales, and you're talking about new value and that type of stuff. And you did mention that those types of people usually think of their company in terms of resources, or the allocation of resources.

So a few tips that I could give the audience here, or what came to mind when I'm thinking about resource allocation. One, that's something that you as an engineering leader should be having a conversation about with your CEO: how are our people allocated?

And a common mistake that I'm seeing within the industry is, once you have that allocation report, to your point, Ariel, oftentimes I see, okay, we're working on 10 different projects in parallel with a very small number of engineers, and therefore each project is progressing at a very slow or, I guess we would say, ineffective pace.

And so what you can do is show, okay, we're really spread thin, and propose a change here. What are the top three? I'd like to allocate more engineers to those top three, then take one of them and introduce the concept that you're talking about as an experiment. Because, hey, business, this is a project that is most important to you.

I wanna deliver it as effectively as possible. The other thing that came to mind for me on the allocation side is that most engineering organizations today are thinking of their allocation as, okay, there's keeping the lights on, there's new value delivery, there's enhancements to the thing I already released.

And then there's internal productivity, like working on your dev workflow. If you see that your enhancement percentage is very high, that means, hey, we thought we released this thing, or we did release this thing, and we're still actually enhancing it, working on it six months later, a year later.

That's also, I think, an argument for your case of, hey, we're now paying this long-term cost instead of doing it more upfront. So a few tips. I don't know if any of those resonate with you.

Ariel Perez: Oh, absolutely they do. And on the first one, one thing I found that becomes very easy, that everyone understands: forget for a second about speaking in terms of 10 projects and 10 engineers.

The simplest layman's-terms version of this, that even my grandmother can understand, is what happens when I need 10 houses. I can start building 10 houses all at once right now. What happens to my risk? They'll all progress, and if I run out of money in a year, I might have 10 incomplete houses.

Instead, maybe I focus on building three houses right now; in a few months I'll have three houses, and in a few more months I'll have maybe six houses. If I run out of money at the end of the year, I'll have six full houses instead of 10 half-built houses. It's the same reality in any project allocation: work in batches.

Work on smaller things, get more people focused on them, get them done thoroughly and effectively. That's the same way to explain that same concept.

Dan Lines: Oftentimes the way to present an allocation of resources is by type of work. And if you're seeing that enhancement is very high, it means that you're paying a long-term cost instead of the upfront cost that you were talking about.

Ariel Perez: Absolutely. And the way I think about that, there are two things. One, as engineering leaders we're pushed to talk in these terms: how much are you allocating here, how much there. A lot of that often comes from finance, right? Things that engineering leaders learn about are amortization and cost, and when you talk to your accounting and finance people about taxes, you break it down that way.

What's lights-on versus what's innovation? But in reality, if you start thinking about how you organize your team, it's: get them together to solve the most important problems. And yes, when you talk about that long-term maintenance, it's what do you wanna invest upfront? Do you wanna focus on quality and value, or do you wanna focus on efficiency and cost?

Focusing on efficiency and cost too much early on guarantees that your costs go up over time. Guarantees it, cuz you're gonna get the cheapest resources and spend the minimum amount of time possible, rather than spend the time to get the quality and value and reduce your cost of change for the rest of the life of that software.

If you invest upfront to reduce the cost of change, every incremental change, and it will come, will be much cheaper over the long run.

Dan Lines: It's almost like there's an education process here. If you are a VP of engineering, usually you need to justify this way of working

to a CEO or to your senior leadership team a little bit, because when you did the house example, it was amazing, it made sense. But if I just come to you and say, and I'm reporting to you, Ariel, you're the CEO: instead of doing 10 projects, I'm only gonna do two or three.

That doesn't sound good. So there's also how you present it; come with the approach and some of that data, I think that helps a lot. I'm excited to transition us, cuz our second topic has a really interesting title. It's titled Reimagining DORA Metrics and Leveraging Feature Flags.

What does that mean to you? I think a lot of our audience knows about the DORA metrics, right? We have cycle time, we have deployment frequency, we have change failure rate, we have MTTR, mean time to restore. What does it mean to reimagine these DORA metrics?

Ariel Perez: Got it. So I have two versions of that.

One is the more straightforward one, one is the more radical one. I'll leave the radical one for later when we get to it. 

Dan Lines: I like the radical one. We gotta get to the radical one.

Ariel Perez: But the first one is this idea, DORA Metrics came in a world where, we were just trying to ship faster, right?

And in a world where deployment and release were the exact same thing. In order for you to ship a change to a customer, it was a release and a deployment, the same exact thing: I had to put the software in production. And the world has changed a lot since the DORA metrics were initially published. Again, DORA metrics are amazing at trying to figure out which teams are very efficient at shipping software that doesn't break, shipping software that's high quality.

How good is an engineering organization at building stuff and shipping it? The key thing is what happened with the world changing over the last 10 to 15 years, and really over the last 10 years in the industry: feature flags as a term, as a concept for building my applications, right?

And not assuming that everyone knows exactly what a feature flag is, although it's a lot more ubiquitous now: it's a toggle, it's a switch, it's a feature flag. It's basically, at its core, an if statement around your code. If this feature flag is on, this code path gets executed; otherwise, it's off. Sounds very simple, but the radical aspect of it is that now I can actually ship code off and, at a separate time, turn it on.

You have different ways to do it. There are so many: there are UIs, there are database implementations, config implementations. But that's the key idea, that I can just ship the code and keep it off. Now, why would I do that? I need the code to be on. Well, there are so many things that you can test and validate while the code is off, before any customers see that code.
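To make that concrete, here is a minimal sketch of the if-statement idea, using a hypothetical `FlagClient` interface rather than any particular vendor's SDK; the flag name and render functions are illustrative assumptions, not Split's API.

```typescript
// Hypothetical flag client; real SDKs (Split, LaunchDarkly, etc.) expose a
// similar lookup keyed by flag name and user/context.
interface FlagClient {
  isOn(flagName: string, userId: string): boolean;
}

function renderCheckout(flags: FlagClient, userId: string): string {
  // The new experience is deployed but stays dark until the flag is flipped on.
  if (flags.isOn("new-checkout-flow", userId)) {
    return renderNewCheckout(userId);   // new code path, shipped off by default
  }
  return renderLegacyCheckout(userId);  // existing behavior keeps running
}

// Stand-ins so the sketch is self-contained.
function renderNewCheckout(userId: string): string {
  return `new checkout for ${userId}`;
}
function renderLegacyCheckout(userId: string): string {
  return `legacy checkout for ${userId}`;
}
```

The point of the sketch is the deploy/release split Ariel describes: the new path can sit in production, fully deployed, while the flag decides who, if anyone, actually executes it.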

You can imagine how many times you think something is perfect in staging, and then you put it in production and it explodes. You missed some config, you missed some issue. Feature flagging it off allows you to ship it out without it breaking. The other thing that feature flagging really enables is trunk-based development and branch by abstraction.

You can just commit your code and not wait on these long-lived branches that just carry risk because they're not being merged. So now, how does that change the DORA metrics? The idea is that in a feature flag world, it's not enough to think about how often you're deploying. The idea is how often are you releasing?

Cuz you can deploy over and over again with the code off. One thing feature flags have helped you do is deploy more often. Yes, that's great, because if you're feature flagged off, you can merge more often, you can merge to trunk and just ship the code. So your deployment frequency skyrockets, but you're not delivering anything to users yet.

So with feature flags now, you might turn different features on and off at different times. So then how do you define deployment frequency now? Is it change frequency? Is it change enablement frequency? There's something to think about there. If I ship one piece of code, and then I flip a flag next week, and I flip another flag the following week, and I flip another flag

the week after that, did I have one deployment or did I have five deployments? Is it five changes? Is it change frequency now? So that's one thing that comes to mind. The other thing that comes to mind is that lead time drastically goes down when I can feature flag everything off and merge it.

But the other major impact is mean time to recover. For elite engineering teams, mean time to recover is primarily constrained by how quickly you can ship a fix. But in a feature flag world, it's how quickly can I flip a flag and turn it off. Your mean time to recover can go down to seconds on the best feature flagging platforms.

So that's another massive impact on that DORA metric.
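A hedged sketch of that recovery point: instead of shipping a fix, an operator (or an automated guardrail) flips the flag off, and the next evaluation falls back to the old path within seconds. The in-memory `FlagStore` below is a simplification; real platforms push flag state to every running instance via streaming or polling.

```typescript
// Simplified in-memory flag store standing in for a distributed flag service.
class FlagStore {
  private state = new Map<string, boolean>();

  set(flagName: string, on: boolean): void {
    this.state.set(flagName, on);
  }

  isOn(flagName: string): boolean {
    return this.state.get(flagName) ?? false; // unknown flags default to off (safe)
  }
}

const flags = new FlagStore();
flags.set("new-checkout-flow", true); // feature released to users

// Incident: errors spike on the new path. Recovery is a flag flip,
// not a revert + rebuild + redeploy, so the clock on MTTR stops in seconds.
flags.set("new-checkout-flow", false);

console.log(flags.isOn("new-checkout-flow")); // false -> traffic is back on the old path
```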

Dan Lines: Yeah, and these are the non-radical thoughts, right?

Ariel Perez: This is non-radical, at least in my mind, right? It might be.

Dan Lines: We're gonna get there, I gotta hear what the radical ones are, but I think it's really cool. First of all, feature flagging, I agree, changes the paradigm a bit.

I think the DORA metrics, obviously they're great, but they try to describe something at the highest level possible. You talked, for example, about what I think you called change frequency, and in the cycle time process there's almost a frequency for every stage. So there's a merge frequency.

You could say, okay, there is a deployment frequency, but that's with the feature flag off. Now there's a frequency of how quickly we're entering the stage of that feature being turned on. Exactly. And what percentage of your customer base is it turned on for? And how much time does it take to enable it for your entire customer base?

So for example, I'm honestly thinking about this all the time. At LinearB we're releasing things to, let's say, beta, which means, okay, it's behind a feature flag and it starts out with 0% of the customer base having it, and then it eventually increases to everyone.

But I'm saying to myself, how long does it take to go from that first release, where there's actually no value for the customer cuz everything's off, to full value for the customer base? Exactly. So I think we can start putting these stages into cycle time, we can start staging out deployment frequency. And honestly the control, and therefore the metrics around it, that feature flags give, because I can tell how many are on or off for my customer base, right?

That's amazing. That's great. Let's just bake that into cycle time: more stages in cycle time, more stages of deployment.
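As a rough illustration of the extra stages Dan is describing, one could derive a "time to full rollout" number from a flag's change history. The event shape below is hypothetical; the point is only that the audit trail a flag platform keeps makes this stage measurable.

```typescript
// Hypothetical audit-log entry for a change to a flag's rollout percentage.
interface FlagChangeEvent {
  flagName: string;
  timestamp: Date;
  rolloutPercent: number; // share of the customer base exposed, 0..100
}

// Hours from the first non-zero exposure to the first 100% exposure.
// Returns null if the flag never reached full rollout.
function timeToFullRolloutHours(events: FlagChangeEvent[]): number | null {
  const sorted = [...events].sort(
    (a, b) => a.timestamp.getTime() - b.timestamp.getTime()
  );
  const first = sorted.find((e) => e.rolloutPercent > 0);
  const full = sorted.find((e) => e.rolloutPercent >= 100);
  if (!first || !full) return null;
  return (full.timestamp.getTime() - first.timestamp.getTime()) / 3_600_000;
}

// Example: beta at 5% on June 1, everyone on June 15 -> 336 hours.
```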

Ariel Perez: Exactly. It's just that the world has become much more granular, right?

Dan Lines: Yeah. It's not black and white. It's gray.

Ariel Perez: Yeah, the unit is a gradient. There you go.

The unit of control was the application. Now the unit of control is the feature. That's the key thing.

Dan Lines: Yeah, I love it, I love it. And what I love about it is that it's more about value. It's more about value, like the feature in itself usually has value to it.

And so you can measure that progression of value.

Ariel Perez: Absolutely. And actually, I think that's a perfect segue into my radical idea, right? First you said it's a measurement of value, and then you said a feature, and you quickly caught yourself: usually has value. I think that's a thing you can talk about for a very long time, right?

We tend to assume that every feature has value. It definitely has value: we've done research, we've talked to customers, we've validated this, and we're like, great, this thing has a certain amount of value that I'm expecting when I ship it. And then we put it in prod and reality hits you.

If you're willing to accept reality, that is. A lot of people say, look, I'm just gonna ship it, we spent the time doing it. The reality is actually, and I've gotta find the exact stat, but this is in the ballpark, that about 70% of everything you put in production has either little or negative value, right?

70%. Imagine, that's crazy. 70% of everything you ship has little, no, or negative value. So only 30% of what you're doing actually has value for the customer. It's unbelievable. So how do you think about that idea of value? The DORA metrics, and even the reimagined world of DORA metrics with feature flags and that granularity, help you ship faster.

So if you take the idea of value, it helps you ship that 30% of value faster. There's actually a win there, there is actual benefit to that, and we should still chase it. As engineering organizations, let's get really good at shipping. Why? Because odds are we're gonna be wrong.

So the more we ship, the more hits. If I ship a thousand changes at 30% versus a hundred changes at 30%, odds are I get more hits. I can just get more value out there.

Dan Lines: And I learn something every time I ship. I increase my percentage chance of getting to value every time.

Ariel Perez: That's it. Yeah. So that's the key piece I'm gonna get to in terms of the more radical ideas. Teams often just think about the mechanical aspects of shipping, but if you're not measuring, it's really hard to learn. So the idea is, how do we measure the impact of every single feature that I ship?

You talked about one way of thinking about the telemetry you get from feature flags: how many users have it on? But that is just a measure of what part of my population is at risk. It doesn't tell you anything about what the risk is, how they're behaving, how they're acting.

What's happening? Should I roll back? Should I roll forward? Maybe I'm watching my dashboards and monitoring tools to help me understand whether I should roll back. So I've gotten really good at shipping changes and rolling back changes. If I can put it one way, we've gotten really good at building the thing right.

Building the thing right is: I can ship it fast, it doesn't break things, it is scalable, it is resilient, it's performant. All the engineering metrics you would look at about is this good software, does the software work. But it tells me nothing about whether we built the right thing. So what I care about, in terms of that radical idea, is how do we get really good as teams at baking in

actual measurement of the impact of every change, whether it made things better, to increase that hit rate from 30%, maybe to 40 or 50, and accelerate that learning. If I'm not learning on every change, I'm just shipping more crap out. If I can learn, I can truly improve my hit rate. And those things are like interest.

And to get back to financial services, it's compounding interest. If I learn at a 1% rate every single day, the compounding of that learning is massive by the end of the month. So we can really learn about our customers. So then, what does that look like in a feature flag world?

When I think about deployment frequency, and let's call it change frequency now, right, new terms: how do I know I'm ready to go from 1% to 5%, or from alpha to beta? Today, I'm watching. Maybe I'm looking at some dashboards and nothing's blown up, no customers have complained, so I feel safe, I can roll forward.

You said something very interesting though: how do I get from one to a hundred as fast as possible? It's really hard to do that without measuring, and this is where causal analysis comes into play. You can feed telemetry into a feature flagging platform, an experimentation and measurement and learning platform, that tells you whether this feature flag is having the intended impact that you wanted or an unintended impact. Here's what happens.

So let's say you roll out a feature to 1% of customers. The APM tool your engineers are looking at will tell you nothing's wrong: no explosions, no problems, I feel safe. Here's what causal analysis can tell you: for that 1% of users, there's an 80% increase in errors. That's the key thing. So it'll tell you, don't go ahead, don't go forward.

As a matter of fact, roll it back. You can make that decision much faster, because with causal analysis under the hood, you know immediately that this feature is having an issue. If you've rolled out a hundred features, you know that one feature is the one causing problems for that 1% of the population. You've mitigated that and quickly rolled it back, rather than waiting around for your APM to tell you, cuz you might have to go to 20% or 50% before you start seeing something in your APM.

Cuz that's just correlational data. So that's one of the things that says this will help me roll out faster. But I'll tell you one next step and then I'll stop talking for a moment, right, so we can go back and forth. What if your goal with this particular change was to improve performance, to improve the load time on your screen?

Or let's say you actually want to increase checkout conversions from a business perspective. What if your feature flag is measuring that and says, hey, for that 5% of people, performance actually improved: it dropped from a three-second load time to a one-second load time. Do you wanna wait before you roll that out?

No. You tell the system full steam ahead: go roll that out to 5%, roll that out to 10%, roll faster. So as that data comes in, if it can tell you that for that percentage of people performance is much faster, or cart checkouts went up, go ahead, roll it out. You can get from zero to a hundred much faster, with the machine learning along the way and the causal analysis leading the way for you.
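A sketch of the rollout loop Ariel is describing, under stated assumptions: `readMetrics` and `setRollout` are hypothetical stand-ins for the causal/guardrail analysis and targeting controls a platform like Split provides, not a real API, and the thresholds are arbitrary illustrations.

```typescript
// Hypothetical metric readings for the cohort exposed to the flag, compared
// against users still on the old code path (the control group).
interface CohortMetrics {
  errorRateLiftPct: number;  // e.g. +80 means 80% more errors than control
  conversionLiftPct: number; // e.g. +12 means checkout conversions up 12%
}

type MetricsReader = (flagName: string, percent: number) => Promise<CohortMetrics>;
type RolloutSetter = (flagName: string, percent: number) => Promise<void>;

// Expand exposure step by step; roll back if the guardrail trips,
// and accelerate to 100% when the intended impact shows up early.
async function progressiveRollout(
  flagName: string,
  steps: number[], // e.g. [1, 5, 10, 25, 50, 100]
  readMetrics: MetricsReader,
  setRollout: RolloutSetter
): Promise<void> {
  for (const percent of steps) {
    await setRollout(flagName, percent);
    const metrics = await readMetrics(flagName, percent);

    if (metrics.errorRateLiftPct > 10) {
      // Guardrail: the exposed cohort is degrading relative to control.
      await setRollout(flagName, 0); // roll back with one call, no redeploy
      throw new Error(
        `${flagName} rolled back at ${percent}%: error lift ${metrics.errorRateLiftPct}%`
      );
    }
    if (metrics.conversionLiftPct > 5) {
      // Intended impact confirmed early: no reason to crawl the rest of the way.
      await setRollout(flagName, 100);
      return;
    }
  }
}
```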

Dan Lines: I think about it in two ways. One is, for the customer base or the service base that I rolled this out to, what is the change within that scope? Exactly. Is it more performant or less performant, whatever you're monitoring? And then the other thing that I was thinking about is there's, again, kind of a customer value

art to this. Yes. And I'll explain what I mean by that. Some of the time I can just draw on my own experience. At LinearB, when we're utilizing a feature flag, we put that feature flag in the hands of our sales team, we put it in the hands of our customer success team, and we say, hey, at your discretion you can choose to turn this on or off.

For X amount of customers; you each have five tickets to turn it on or off for. And it comes with a caveat, an expectation setting: there could be bugs, there could be a performance issue, it's yours to choose. And what I notice is, because you said earlier, I think, if we go with good intentions, the idea that I have, the thing that I want to ship into production, my intention is that it does something better.

Exactly. If we put malice aside, like I'm trying to ruin the company, no, most people aren't doing that. I have an intention that it does something better. And I see, at least for us, that the feature flags that get turned on the most rapidly by sales and CS, or that the customer base is asking for,

yes, please enable this for me immediately, usually correlate to more value. Once in a while, full transparency, we will come out with something in product that we think is really cool. We did our validation, we thought it's the right thing. We go and create this feature flag.

We tell sales and CS all about it, and then it's turned on for 5% of our customer base. We're saying, hey, why is this not rolling out? We have a mismatch in value. We ask the customers, hey, do you want me to enable this for you? And they're like, no, it doesn't really matter.

Yeah, they don't care. So I think the cool thing about it, before we move on to our last topic here, is that I do think the speed of rollout can correlate to internal value from your sales and CS department and to your customer value. I think there's absolutely some type of correlation there.

Ariel Perez: There's absolutely a strong correlation, right? For the first part of it, when you're rolling it out to some customers through sales and customer success, I'd call that more of a qualitative stage, right? It's alphas and betas, and you're trying to get early feedback from a trusted group, and that feedback helps you continue rolling forward or say, oh no, we need to shut this off.

Nobody cares. At some point, when you're ready to expand it beyond that initial group, you're gathering telemetry on: are people engaging with this feature? Are people using this feature? Is this driving our business metrics? That's the thing that helps you roll it out past that first beta group to 25%, to 50%. But the other side will be, if nobody cares, if nobody's using it, roll that back down to zero.

Because, going back to the long-term cost conversation, why am I gonna keep that feature in prod if nobody uses it? With the long-term maintenance, I'm gonna keep paying for this thing for the life of this feature, even though nobody's using it. I've got other things to spend my money on. Let me spend it on better hits.

Let me increase that 30%.

Dan Lines: Interesting stuff. To close it out here, we have maybe a summary topic around achieving desired outcomes, somewhat around the debate of should teams invest in shipping more or shipping faster, and is it worth it. I have a paraphrased quote from you here saying many engineering teams are not near where their metrics should or could be.

What are your thoughts on achieving some of these efficiency and effectiveness metrics within the engineering community?

Ariel Perez: I'll try to sum up all these pieces of the conversation. I'll start with: we ideally want every engineering org to be outcome driven.

And to really drive toward actually having the expected outcomes, drive impact, improve the lives of our customers, and have engineers embedded in that conversation, beyond just our world of infrastructure and performance and latency and throughput. Is anybody using this feature?

Are your customers getting value? So we wanna go in that direction, right? We can only make better decisions and truly move the needle for the industry, for our companies, for our customers, working in that world. But what does it take to get there? At a minimum, you have two sides of this coin.

One of them is we've gotta be elite engineering organizations. We've gotta get really good at shipping fast, shipping often, because we're gonna fail often. So let's get really good at fixing fast, and let's make sure we balance that with quality while we're building it. That's DORA reimagined.

Let's get really good at shipping and iterating. The other side of it is, let's get much better at being effective engineering teams, and to be effective we've gotta work together, we've gotta collaborate, we've gotta think through the problems together to come up with better solutions. So I think those are the key things.

It's moving to outcome-driven as opposed to output-driven organizations, and using the ability to get really good at shipping quality faster, cuz that's critical to getting value in front of customers and getting better things out there. And let's work together while we do it.

Dan Lines: I think that's a good summary, and obviously you've got some really cool stuff going on at Split.

I'll just say what I like the most about feature flags. We did talk about some of the more futuristic stuff, but at the end of the day, I think it lowers my barrier to go experiment and get something out there, and it shortens my feedback cycle. It makes me confident that I can try something.

My learning increases, cycle time decreases, deployment and change frequency increase. These are all shown to be positive things. And Ariel, I wanna thank you for coming on the show and enabling the community to unlock these DORA metrics. We love giving our guests the opportunity to bring awareness to something they would like the audience to hear about. Is there anything interesting going on at Split that everyone should know about?

Ariel Perez: Yes, definitely. Thank you for the opportunity, Dan. We always have many different events and many different things going on at Split, but for right now, if you wanna stay tuned and hear about the best content in feature flagging, experimentation, and measurement, and how you can really get the most value out of feature flags and measurement in your engineering practices.

Definitely try www.split.io/blog. We update that blog regularly. Follow us on LinkedIn. We're on different social media channels, on Instagram; I think we have a TikTok channel too. And if you really wanna interact and connect with us, we've built a very robust community on Slack. That's split community.slack.com. Join us there and just talk about feature flags and see how we can share knowledge with each other.

Dan Lines: Sounds great. So everyone, if you're interested in this feature flag stuff, definitely check out Split and also their Slack community.

And everyone, thank you for listening. We're gonna catch everyone next week. I also want everyone to be sure to sign up for the Dev Interrupted Substack. Each week you'll get the latest episode of Dev Interrupted right in your inbox, as well as articles from some of the best engineering leaders in the industry, upcoming events, and all your favorite DI episodes past and present.

All the insights, none of the fuss. Check it out at devinterrupted.substack.com. And then, one last time: Ariel, thank you for coming on the show.

Ariel Perez: Thank you very much, Dan. It was great having a chat with you. I look forward to potentially doing this again. I think we have a lot to bounce off of each other.

Again, thanks a lot. It was an honor and a privilege. I'd love to be on the show again.


A 3-part Summer Workshop Series for Engineering Executives

Engineering executives, register now for LinearB's 3-part workshop series designed to improve your team's business outcomes. Learn the three essential steps used by elite software engineering organizations to decrease cycle time by 47% on average and deliver better results: Benchmark, Automate, and Improve.

Don't miss this opportunity to take your team to the next level - save your seat today.