Flow can mean many things but when it comes to workflow it usually refers to that feeling, discussed by Mihaly Csikszentmihalyi, when you enter a state of intense focus and lose yourself in an activity. 

Video games are a great example. They take advantage of this feeling to keep you immersed, which is why it’s so easy for gamers to “lose time” and just get wrapped up. The same feeling usually drives your most productive and best work.

When you manage developers, their workflow should be treasured and valued. That’s why, to improve developer focus, it’s vital to avoid weighing them down with minor interruptions or non-urgent pings. 

“Flow is characterized as this experience where the task that you're doing is perfectly matched to the skills that you have.” -Katie Wilde on the Dev Interrupted Podcast at 7:51

1. Acknowledge that it take 23 minutes for devs just to get into flow

Did you know that it takes 23 minutes to get into a flow state? For some people it takes even longer. That means that for every question, disruption, email, and interruption that you or your coworkers are subjected to, it could be half an hour of productivity down the drain. We talked to Katie Wilde, VP of Engineering at Ambassador Labs, about how she manages workflow

“Say you got a Slack ping, and you're like, “oh, I'll just ask a question.” How long does it take you to find the thread again? What's that total interrupt time? It's 23 minutes…that's been measured.” -on the Dev Interrupted Podcast at 11:11

2. Defrag dev calendars

Some interruptions are unavoidable but many of them aren’t. Planning your calendar in a way that works around the needs and workflows of your team is necessary to maximize everyone's productivity. 

For instance, scheduling meetings on days when weekly meetings already occur can help preserve focus time by not disrupting other working days. 

Devs need to communicate with their managers on what times they have available away from normal workflow and then it’s up to engineering leaders to plan around those schedules. As a dev leader, you have to look at your devs’ calendars, not your own, and react accordingly. 

“If you're a manager, when you're scheduling, don't look at your calendar, and then find a time and then see where you can slot the engineer in…look at the engineer's calendar and see, where can you tack the meeting on that it is after another meeting, or it is maybe at the start of the day, the end of the day… and ask them!” -Katie Wilde on the Dev Interrupted Podcast at 12:31

3. Suck it up - schedule your work around focus time

When managing large numbers of devs, it can seem like a chore to work around many different schedules or attempting to get meetings done only on specific days. We asked Katie what her trick to juggling so many different calendars and meetings was, and she had one thing to say: “Suck it up.”

Devs are the backbone of software production and it’s important to prioritize their productivity whenever possible. To help them stay on task and be able to really focus on their work, they need to have meetings planned around their day - not yours.

Providing consistency for your devs - meeting them when they are ready, available, and focused - helps them maintain a flow state and maximize productivity. But more than that, it’s the right thing to do. Devs want to build cool stuff, not have their days ruined by their own calendars.   

Katie says it best:

“That might mean that, as the manager, you have a little bit weirder hours. I hate to say this, but kind of suck it up… There's no way to get around that.”-on the Dev Interrupted Podcast at 13:23

Watch the full interview-

If you would like to hear more about how managers can work around a developers schedule and other great insight from Katie Wilde, check out the full podcast on your favorite podcasting application, Apple Podcasts, Spotify, Stitcher, YouTube

Starved for top-level software engineering content? Need some good tips on how to manage your team? This article is inspired by Dev Interrupted - the go-to podcast for engineering leaders.

Dev Interrupted features expert guests from around the world to explore strategy and day-to-day topics ranging from dev team metrics to accelerating delivery. With new guests every week from Google to small startups, the Dev Interrupted Podcast is a fresh look at the world of software engineering and engineering management.

Listen and subscribe on your streaming service of choice today.

Discover Our Most Popular Podcasts
Join the Dev Interrupted discord

In a typical manufacturing company, a supply chain is the chain of companies that you rely on to make your product. For example, a mobile phone manufacturer buys processor chips from a supplier. That supplier needs to buy a part from another manufacturer. And that manufacturer relies on yet another company for the raw metal.

But what is the software supply chain? And how do you keep it secure? We spoke with Kim Lewandowski, co-founder and head of product at Chainguard, to explain the details.

Your software supply chain is more complex than you think

The software supply chain can be complicated. Mainly because it’s difficult to know how far it reaches. Take a simple example: If you use Salesforce to keep track of your customers, you store your customers’ data on Salesforce’s servers. Not a problem, surely? But Salesforce could have a breach. And what about the servers themselves? Those servers might run on Windows. If that has a security bug, hackers have another way in. How about the software that Salesforce uses to host its website? If that is hacked, you have yet another breach.

 

“When I think of the software supply chain, it’s all the code and all the mechanics and the processes that went into delivering that core piece of software at the end,” Kim explained. “It’s all the bits and pieces that go into making these things.” -On the Dev Interrupted Podcast at 11:28

Keeping the software supply chain secure involves checking who has keys

The important part of keeping your supply chain secure is making sure that you track down what you’re using. And checking that they’re secure and reliable. Every new third party can be a potential problem. If you don’t do your due diligence, you won’t know what risks you’re taking.

As Kim explained, a favorite analogy of hers is thinking about doing construction work on your own home.

“You have a contractor. Well, they need keys. They have subcontractors. You give the keys out to all their subcontractors. Who are they? Where are they from? What materials are they bringing into your house?” -On the Dev Interrupted Podcast at 12:09

The more third party tools you use, the more out of control it can become

It all comes down to accountability. It can easily start spreading rapidly. One third-party tool that you use to create your software might rely on five separate third parties. And you don’t know what code they’ve got hidden under the hood. Your keys are suddenly all over the place.

The only way to keep it under control is to remind yourself to check and to do regular audits of the services you use. Kim believes it’s helpful to think of every new tool as a package coming to your home.

“How is your package getting to your house?” Kim said. “What truck is it riding on and who is driving those trucks?” -On the Dev Interrupted Podcast at 12:44

Get the full conversation

If you’d like to learn more about the software supply chain, and how to make sure that yours is secure, you can listen to the full conversation with Kim over on our podcast.

Starved for top-level software engineering content? Need some good tips on how to manage your team? This article is inspired by Dev Interrupted - the go-to podcast for engineering leaders.

Dev Interrupted features expert guests from around the world to explore strategy and day-to-day topics ranging from dev team metrics to accelerating delivery. With new guests every week from Google to small startups, the Dev Interrupted Podcast is a fresh look at the world of software engineering and engineering management.

Listen and subscribe on your streaming service of choice today.

Discover Our Most Popular Podcasts
Join the Dev Interrupted discord

Over the last ten years, technology has become more sophisticated. Faster. Smaller. More powerful. But it isn’t just our technology that’s evolving at a rapid pace. Our culture, attitudes and politics are all changing, too.

So what could the next ten years look like? How might businesses change to keep up with technology? We spoke with Jason Warner, managing director at Redpoint Ventures, to get his thoughts on the matter.

“Ten years is an interestingly long, but also short time horizon,” Jason explained. “It’s likely we’ll see a complete company cycle, maybe two macroeconomic cycles.” -On the Dev Interrupted Podcast at 10:29

1. Organizations will invest more in compliance and security

There have been a lot of large changes in recent years. People are working from home. Political tensions are high. And almost every device collects data about us. In all these cases, security is important. Securing our businesses, our national secrets, and our private lives.

It all leads to an inevitable conclusion. Jason believes that chief compliance officers will become commonplace, even in small companies. Protecting data is going to become a primary concern, for governments, businesses and people. Because, as the world gets more digital, we’re going to see more and more cyber attacks.

“Trends that I see happening are an increased awareness and investment in things like compliance and security. I think that if companies don’t have a chief compliance officer now, they likely will in the future,” Jason said. “I think it’s interesting when you see the geopolitical environment of how we might have to invest in more sophisticated tooling for national security. But more than that, it’s like understanding that we’re no longer a single micro-geo unit called the United States.” - On the Dev Interrupted Podcast at 11:03

2. Companies will focus on loyalty and subscriptions over one-off sales

The standard business model is outdated. In the past, technology companies sold software, they gave customers the software and that was the end of the transaction. But now, it’s more about building communities and regular interaction with your customers. It’s about subscriptions, regular payments or even donation models, seen on popular platforms like Twitch. Software isn’t a product any more. It’s a service.

But almost every company these days is a technology company. Just look at what’s happened to the taxi industry. The model has completely changed, simply because the technology has evolved. The old model won’t completely disappear, but we’ll see more and more industries move into a subscription model as new technology takes over.

“Selling is about adoption first and selling second. Someone’s got to reach for you first,” Jason explained. “Then, they’re going to find a value problem, then they’re going to want to give you money if they’re finding utility out of you.” On the Dev Interrupted Podcast at 11:21

3. Hardware is, and always will be, just as important as software

With every new innovation, we place more demands on the hardware we’re using. The more advanced our software becomes, the more powerful our hardware must be. But right now, most  companies rely on international trade to build key components. With tensions rising, it’s likely that we’ll see companies begin to bring these resources closer to home, securing their supply chain in the process.

“There’s interestingly a lot more emphasis on investing in hardware again,” Jason said. “And America in particular owning its hardware manufacturing, which I think is obviously good.” -On the Dev Interrupted Podcast 11:41

Watch the full interview

If you’re interested in what else Jason had to say about the next ten years, and what challenges society faces, you can watch the full podcast on our site.

Starved for top-level software engineering content? Need some good tips on how to manage your team? This article is inspired by Dev Interrupted - the go-to podcast for engineering leaders.

Dev Interrupted features expert guests from around the world to explore strategy and day-to-day topics ranging from dev team metrics to accelerating delivery. With new guests every week from Google to small startups, the Dev Interrupted Podcast is a fresh look at the world of software engineering and engineering management.

Listen and subscribe on your streaming service of choice today.

At Netflix, we don’t just think about productivity - we engineer it. There’s an entire team within Netflix dedicated to productivity. I lead the Develop Domain along with my Delivery and Observability Domain peers, and together, we make up Productivity Engineering.

I recently sat down with the Dev Interrupted podcast to discuss all things productivity, how I run my team, and how other managers should view employee success. Here’s how we think about it at Netflix:

Can productivity be engineered?

In short, yes! Productivity is not a generic term for team performance or a perfunctory buzzword used during team meetings. The productivity team is an actual organization. The work we do is foundational to Netflix’s development teams. Productivity Engineering lives within the broader, central Platform organization.

The role of the Productivity Engineering team is simple: we exist to make the lives of Netflix developers easier. Abstracting away the various “Netflix-isms” around development, delivery, and observability, productivity allows devs more time to focus on their domain of expertise. 

“We are sort of like the nerds’ nerds, if you will, enabling them to use our platforms and tools so that the work that they're doing is focused on studio and streaming, without thinking about everything that's under the hood.” - On the Dev Interrupted Podcast at 2:31

With the recent addition of Gaming to the list of Netflix’s pursuits, the resulting focus becomes even more important.

Practically speaking, it’s the role of Productivity Engineering to help with things like coding, testing, debugging, dependency management, deployment, alerting, monitoring, performance, incident response, to name a bunch. Netflix utilizes the concept of a “paved road,” the frameworks, platforms, apps, and tools we build and support to keep our devs rolling. The idea is to keep workflows streamlined and enable developers to operate as efficiently and effectively as possible. If the road ahead is cleared of obstacles, you’re going to get to where you need to go faster and with support along the way. 

It’s also about helping developers enjoy the ride. To abuse another metaphor, a sound engineering experience should be like dining at a fine restaurant. If done right, you rarely remember the waitstaff, have a hard time finding something you like, or worry about how they prepared the food; you simply enjoy the experience. If Productivity Engineering is doing their job, they act as the restaurant and waitstaff with developers as the customer, providing nothing short of a beautiful end-to-end experience. 

Measuring Outcomes vs. Output

Measuring all of that productivity can be hard, and there’s no one unicorn measurement to rule them all. Hence, developer productivity teams should focus on impact and outcomes. Above all, Netflix focuses on customer satisfaction. Our philosophy is that while how something is delivered is important, the impact of what’s delivered is ultimately of greater importance. 

"If you're running around a track super-fast, but you're on the wrong track, does it matter? So really, what are you delivering? How you're delivering is important. But if that thing that you're delivering is ultimately doing what you want it to do, that's the most important thing." - On the Dev Interrupted Podcast at 5:05

In this model, the outcome always wins over output or activity. For instance, standard productivity deployment metrics (DORA) as applied to our customers become an important proxy for measuring our success. Key Performance Indicators (KPIs) for productivity are viewed as a reflection of a team’s performance as it relates to customer satisfaction.

I’m a big fan of the SPACE framework, developed by Nicole Forsgren, for precisely this reason. How are our customers doing in terms of Satisfaction, Performance, Activity, Communication, and Efficiency? The answer to those questions reflects how we’re doing as a Productivity organization.

"This is our strategy, these are our hypotheses around, how we're going to improve our customers' productivity. Are those things paying off? And if you can't measure them in some way, who knows? Right? So yeah, we're getting a little more hardcore about this." - On the Dev Interrupted Podcast at 24:17

Key metrics provide productivity teams with a holistic view of performance by establishing benchmarks. Understanding that everything needs to be viewed within the proper context, it’s difficult to improve as an organization if nothing is measured or tracked. 

Comparing Productivity 

Comparing developers’ productivity across teams is a thorny subject at best and downright dangerous for team morale at worst. As the old saying goes, “Comparison is the thief of joy” or what I typically say, “comparisons lead to unhappiness”, or with my kids “eyes on your own paper!”. 

The productivity teams at Netflix take a contextualized view of dev teams rather than relying solely on raw data. Every project is different, the customer base is different, the use case is different, personas are different, and where a team is within the software development life cycle is different.

It’s a basic understanding that comparing apples to oranges is not good math. A team that is just starting out and building something new, is going to look very different than a team with a mature product. By recognizing this, it becomes almost impossible to rank teams against each other because very rarely, if ever, will teams be doing the same thing, in the same space, the same way, with the same people. 

Even a measurement of an outcome pertaining to customer satisfaction (CSAT) is not straightforward. At Netflix and across the industry, we’ve found that satisfaction for internal teams skews lower than satisfaction for customer-facing teams.

The reason? Teams within Netflix are their own harshest critics. When attempting to gauge the performance of an internal team vs a customer-facing team, it’s understood that the internal team is almost always going to score lower on satisfaction, even if both teams are equally effective. 

Context is everything. Measuring productivity means being mindful of context. 

Pushing Productivity 

Any company that wants to be successful must understand how to measure its success. Productivity doesn’t count for much if an organization is not moving towards desired outcomes. 

By viewing productivity as more than just a concept or a raw set of data, the hard-working teams at Netflix have turned productivity into an actual apparatus. It is a living, breathing team of human beings whose devotion to empathetic efficiency improves customer satisfaction and dev team quality of life. I am incredibly proud to lead these teams, and I sincerely hope the work we do inspires other organizations to improve their developers’ experience.

And if you want to be as productive as Netflix, remember that metrics are only as good as their context! 


If you enjoyed this article and you would like to learn more about the work that I do at Netflix, I invite you to come join me at INTERACT on April 7th

This will be the second time that I have sat down for a panel discussion hosted by Dev Interrupted. I love being a member of the Dev Interrupted community because they are such an amazing resource. If you are a team lead, engineering manager, VP or CTO looking to improve your team, come to INTERACT and check out the community - I promise you will learn something.

Pretend you are watching your favorite show on Netflix: Sit back, relax & watch as I share the stage with other amazing engineering leaders from places like Slack, Stack Overflow, American Express, Outsystems, Drata & many more.

>Register Here<

Chaos Engineering might sound like a buzzword - but take it from someone who used to joke his job title was Chief Chaos Engineer (more on that later) it is much more than buzz or a passing fad - it’s a practice. 

The world can be a scary place and more and more companies are beginning to turn to Chaos Engineering to proactively poke and prod their systems and in doing so are improving their reliability and guarding against unexpected failures in production and unplanned downtime. 

During my career I dealt with my fair share of outages, including one that caught me mid-song during a bout of karaoke and far too many that woke me up at 02:00. As the co-founder and CTO at Gremlin, I do my best to make sure no other engineers have to suffer sleepless nights worrying about their product. 

But the question remains, what is Chaos Engineering and where did it come from?

A Short History

The spiritual predecessor to Chaos Engineering is often called by a much more widely recognized name - disaster recovery. The focus when this practice was introduced is much the same as today: proactively suss out production problems by injecting failure. 

Netflix’s Chaos Monkey is probably the most well publicized Chaos Engineering tool as it arguably kickstarted the adoption of Chaos Engineering outside of large companies, but this has led to the erroneous belief that Netflix invented the practice. In fact, the practice was already widely in use amongst the titans of technology. 

Over a decade ago during my time as a Lead Software Engineer at Amazon, we implemented several crude practices designed to inject failure into our systems. The most rudimentary of which was employed by a man called Jesse Robbins, who earned the nickname “Master of Disaster” by running through data centers pulling out cables. 

Let’s just say the practice has evolved a lot since those early days and your data center cables are much safer these days.

What is Chaos Engineering?

“What Chaos Engineering really is, is the art, if you want to call it that, of introducing controlled chaos.” - 2:16 on the Dev Interrupted podcast

At its core, Chaos Engineering is a disciplined approach of identifying potential failures before they have an opportunity to become customer facing outages. 

It is a practice that lets you safely test your assumption about how your systems will behave under duress by actually exercising resilient mechanisms in a controlled fashion. You literally "break things on purpose" to validate and build resiliency. The end goal of Chaos Engineering is not to inject arbitrary failure into a system, but rather to strategically inject turbulence to enhance the stability and resiliency of your systems.

How Chaotic is Chaos Engineering?

I always tell people that Chaos Engineering is a bit of a misnomer because it’s actually as far from chaotic as you can get. When performed correctly everything is in control of the operator. That mentality is the reason our core product principles at Gremlin are: safety, simplicity and security. True chaos can be daunting and can cause harm. But controlled chaos fosters confidence in the resilience of systems and allows for operators to sleep a little easier knowing they’ve tested their assumptions. After all, the laws of entropy guarantee the world will consistently keep throwing randomness at you and your systems. You shouldn’t have to help with that.

How do I Start?

One of the most common questions I receive is: “I want to get started with Chaos Engineering, where do I begin?” There is no one size fits all answer unfortunately. You could start by validating your observability tooling, ensuring auto-scaling works, testing failover conditions, or one of a myriad of other use cases. The one thing that does apply across all of these use cases is start slow, but do not be slow to start.

What I mean by this is to start testing across just a few nodes versus impacting your entire fleet. We refer to the impacted area as the “blast radius” and we highly recommend starting with a small blast radius (the number of systems impacted) and increasing it over time.

By starting small you allow yourself to gain confidence in both the experiments you are running and your systems. Of course your risk tolerance is also a factor of how large a blast radius your organization will use. 

For instance, a large banking institution with millions of customers has a much lower risk tolerance than a tech startup with a couple hundred customers. In that case, they would want to run experiments in a programmatic way and would need to be very explicit about communicating to the rest of the organization what tests are going to be run and when to avoid any unplanned 2am or 3am disasters. 

Eventually you want to get to the point where all of this is automated, a process we refer to as “continuous chaos.” Starting small with automation could be something as simple as taking out a single node; then taking out five nodes; then ten; and so on. Eventually you automate the process at a level you are comfortable with.  

“Ultimately you want to be able to handle any of this random chaos being thrown at you, because that's what the world is, it's entropy, it's degradation” - 7:35 on the Dev Interrupted podcast

No Tolerance for Downtime

When I founded Gremlin, it was just myself and my co-founder developing the first iteration of the product. The business looked very different then and I jokingly referred to myself as the “Chief Chaos Engineer” responsible for implementing code that was mostly used by enterprise companies. Many of these companies came to us because they had reliance thrust upon them by the US government or they had top-down reliability standards and they wanted a tool to help them shore up their systems. 

As the company began to evolve, so did the customer base. These days it’s not just Fortune 500 companies that care about reliability, it’s everybody. Planned downtime is a relic of days gone by. It is no longer acceptable to espouse planned maintenance windows as part of development lifecycles and customers don’t have the patience for products they rely upon to spend any time unavailable. Companies recognize this dynamic - and it’s not a hard one to miss. 

Seemingly our appetite for technology has gone up exponentially while our ability to stomach downtime has drastically decreased. Customers expect that your product is always working, always running. If your product is down because of outages then there are ten other similar products waiting in the wings to take their money. 

Making Lives Better

Visibility is high these days and companies don’t need the publicity that comes with making any unforced errors, let alone to be subject to errors not of their making. No one wants to be blown up on Twitter because their product isn’t working or because one of their downstream dependencies or their cloud provider had an unexpected outage. 

By preparing for the worst, we can be at our best as an industry and can be prepared when disaster eventually comes knocking. That’s why when an unexpected outage occurs or there is a production failure customers will never even know it happened. 

I often joke that we are the engineers’ engineers because many of us know that feeling of being jolted from a dream at 03:00 by our pagers, groggily wiping our eyes and whipping out the laptop to go dig through a sea of monitoring dashboards and logs. It’s not fun and it’s exactly why I founded Gremlin. Because there is a better way to approach operations than merely sitting back on our haunches and waiting for the next outage. Chaos Engineering not only helps to protect against the randomness of the world, but also teaches people how to build more reliable software. And if enough people build more reliable software, we build a more reliable internet.

_____________________

Starved for top-level software engineering content? Need some good tips on how to manage your team? This article is inspired by Dev Interrupted - the go-to podcast for engineering leaders.

Dev Interrupted features expert guests from around the world to explore strategy and day-to-day topics ranging from dev team metrics to accelerating delivery. With new guests every week from Google to small startups, the Dev Interrupted Podcast is a fresh look at the world of software engineering and engineering management.

Listen and subscribe on your streaming service of choice today.

A good SRE engineer will tell you your service is never down. A great SRE engineer will tell you that’s not what you should be measuring. In fact, they’ll tell you their job is customer service. 

Site Reliability Engineering (SRE) has grown immensely popular with many of the world’s largest tech companies, like Netflix, LinkedIn and Airbnb employing SRE teams to keep their systems reliable and scalable.

Along the way, SRE engineers have become one of the most sought after engineering roles in tech. 

The role is traditionally understood as ensuring that services are reliable and unbroken, but reliability and uptime aren’t perfect metrics. Perhaps what organizations should be asking themselves is what their customers think of their service. 

Wandering down to your engineering department and asking your SRE team about customer satisfaction is a good place to start. 

Their answer just might surprise you. 

History of SRE

In practice, Site Reliability Engineering has been around for a while. In the past its functions were covered by roles that had names like production ops, disaster recovery, testing or monitoring. The rise of cloud computing facilitated a need for more engineers in production. The complexity only grew as more organizations transitioned from monolithic infrastructures to distributed microservices. 

Modern Site Reliability Engineering originated at Google in 2003 with the work of Benjamin Treynor, who is seen as the “father” of what we now simply call SRE. Treynor, who coined the term, was a software engineer placed in charge of running a production team. With the goal of making Google’s website as reliable and serviceable as possible, he asked that his team spend half their time on operations tasks so they could better understand software in production. This team would become the first-ever SRE team.

Ben Treynor said, I'm paraphrasing, ‘[SRE] is essentially like throwing a software engineer at an operations problem’, right? Because you come from that developer mindset, that design and, you know, you think about all of these things. So think about it as a developer but apply it to an operational type of problem.” - Brian Murphy on the Dev Interrupted podcast at 4:26

Why not uptime?

So why shouldn't you be too concerned about your uptime metrics? In reality SRE can mean different things to different teams but at its core, it’s about making sure your service is reliable. After all, it’s right there in the name. 

Because of this many people assume that uptime is the most valuable metric for SRE teams. That is flawed logic. 

For instance, an app can be “up” but if it’s incredibly slow or its users don’t find it to be practically useful, then the app might as well be down. Simply keeping the lights on isn’t good enough and uptime alone doesn’t take into account things like degradation or if your site’s pages aren’t loading. 

It may sound counterintuitive, but SRE teams are in the customer service business. Customer happiness is the most important metric to pay attention to. If your service is running well and your customers are happy, then your SRE team is doing a good job. If your service is up and your customers aren’t happy, then your SRE team needs to reevaluate.

A more holistic approach is to view your service in terms of health. 

The Four Golden Signals

As defined by Google, these are the four golden signals of SRE. If these can be managed effectively, then you probably have a healthy system. 

Establishing system health

“The best way to get started is just measuring stuff, you know, just getting the baseline of what's healthy, what's not healthy, what looks like health, and then you can start working from there.” - Brian Murphy on the Dev Interrupted podcast at 10:49

It can be difficult to know whether or not your organization should consider forming an SRE team, or what your next steps are if you’ve already made the decision. 

Again, think of your decision in terms of a holistic approach, not just your uptime. If you have high uptime, that’s fantastic, but what you should be establishing is a benchmark. 

Using the four golden signals to guide you, establish what you think a healthy system should look like and set your benchmark. Keep measuring over time and you will begin to see the areas that are good or require more work. 

These measures will help inform all of your future decisions. Perhaps your organization is ready to roll out new features or make choices around expanding your service. 

Critically, the health you establish provides insights into customer happiness. If things look good you probably have happy customers. 

Internal customers

When done right SREs aren’t just making customers happy, they’re making the lives of developers easier too. Nothing is worse than having to stop because there’s a problem in production. Good SRE teams can shield dev teams by focusing on major hotspots.

If the fires are being managed before they are out of control, it allows developers to keep pushing out features. It even gives them the freedom to keep breaking things, if necessary!

When things do break, or require a slowdown, a dialogue can occur. A good SRE understands that the developer who wrote a piece of code understands it better than anyone. The model for good internal customer service is an SRE who brings in a developer, gives them ownership of the code they created, and offers to help them fix it.

Happy customers are the best customers

Whether you already have an SRE team or are thinking about forming one, remember to think beyond the engineering - think about the customer. 

Ask yourself if your customers are happy and if you would describe your service as healthy. Remember to think about your own teams as well, your developers will thank you for it. 

_____________________

Starved for top-level software engineering content? Need some good tips on how to manage your team? This article is based on an episode of Dev Interrupted - the go-to podcast for engineering leaders.

Dev Interrupted features expert guests from around the world to explore strategy and day-to-day topics ranging from dev team metrics to accelerating delivery. With new guests every week from Google to small startups, the Dev Interrupted Podcast is a fresh look at the world of software engineering and engineering management.

Listen and subscribe on your streaming service of choice today.

 

Continuous Delivery isn’t about how fast you can deliver, it’s about the outcome your delivery achieves. Bryan Finster, author of the 5-minute DevOps series and founder of the DevOps Dojo, joined our Dev Interrupted Discord community to answer your questions about outcome-based development, continuous delivery, and why failing small is better than failing fast. 

Bryan is currently a Distinguished Engineer at Defense Unicorns but has also worked for Walmart as a systems analyst and eventually became a staff software engineer for Walmart Labs. He had previously appeared on the Dev Interrupted Podcast to further talk about these subjects as well as the most common pitfalls dev teams find when trying to optimize their delivery process. Listen to the episode here:

This Community AMA took place on January 8, 2021 on the Dev Interrupted Discord.

Necco-LB: 📢📢 Community AMA📢📢   @everyone 

Topic: Outcome-based Development with @BryanF (Bryan Finster)

Bryan, thanks for joining us today!

Bryan Finster: Thanks for having me!

col: Bryan... great quote. "A developer is a business expert who solves problems with code." Thank you. Tremendous concept.

Bryan Finster: Thanks. That's who we are. We aren't Java spewing legos. If we don't understand the business, the code won't.

Rocco Seyboth: YES!! @col Love it. @oriker says "a business decision is made with every line of code"

Bryan: Exactly. How does this change improve the bottom line. Even more, how does it improve the lives of our customers?

Necco-LB: We really enjoyed having you on the podcast to talk about Outcome-based development and what continuous delivery should be trying to achieve. I was hoping you could explain to use what Outcome-based development means?

Bryan: It's just focusing on the outcomes. It's pointless to focus on how we do things if the outcomes are poor. It's also about Hypothesis Driven Development. The act of defining the expected value before we attempt to deliver it and then measuring for that value. Instrumenting the application to see how close we get so we can adjust. I frequently see people just being feature factories, pounding out changes that no one needs. That just costs money and increases support. We should be deliberate about what we do and say "no" when the value isn't obvious.

Cocco: When it comes to delivering value to the customer sooner, what things do you commonly see teams worrying about that they perhaps shouldn't (or not worry about, when they should?)

Bryan: "I can't release this! It's not feature complete!" No, get the incomplete change out there and make sure it doesn't break anything.

Necco-LB: You mentioned during the podcast that Pride is the best metric ever. Can you explain that a little bit?

Bryan: If I own the business problem, own the solution, own how to make it better, own the outcomes and see people getting value from my work, then I have pride in what I do. I want it to be good. I want it to be secure and stable and I want to continuously improve it.

Necco-LB: When you talk about outcome-based development you often talk about the things that need to happen before hands touch the keyboard. What are some of those things?

Bryan: We need to understand the value we are trying to deliver and we need to define how we expect to deliver that value at the detail level. It's not enough to write a vague user story. We need testable outcomes that we agree should deliver that value. Behavior Driven Development is the most effective tool I've found for that. We also need to make sure we aren't trying to deliver ALL of the value at once. What if we are wrong? We usually are, statistically. So, what is the smallest, highest value thing we can deliver to find out? Sometimes the right answer is to stop at that point. Invest in the outcomes, not the plan or the work.

_____________________

Read the unedited AMA and join in the discussion in the Dev Interrupted Discord here! With over 2000 members, the Dev Interrupted Discord Community is the best place for Engineering Leaders to engage in daily conversation. Join the community >>

Dev Interrupted Discord, the new faces of engineering leadership

_____________________

Cocco: What patterns/trends do you see in teams who can deliver the outcomes they want? (Are there common factors in teams you've seen that move from struggling -> successful?)

Bryan: Yes. Actual continuous delivery and product ownership. They can deliver small changes daily and they have ownership of what those changes are. They have the safety to challenge things without fear and they are not pushed so hard that there is no time to think of better ideas. Software development is a mental activity, not typing.

Necco-LB: You work with a lot of different teams at the DevOps Dojo. What are some of the most common pitfalls preventing a team from optimizing their delivery process?

Bryan: They are given the wrong problems to solve. They are asked to solve stupid problems like "how many changes did you make today?", "How many stories did you complete this sprint?", They don't know how to work as teams because they are incentivized to work in silos. So, requirements are poorly defined, testing suffers, speed suffers. They need to be solving the business problem. What is measured will change. Be careful what and how you measure.

Necco-LB: What are some first steps a team can take if they want to become more outcome focused?

Bryan: Focus on the business problem and get close to the user. Empathize with them and what value they need. This really applies to anything. If you don't respect your customer, you won't need to worry about them for very long.

Necco-LB: What is the role/responsibility of the developer in this outcome-based development model?

Bryan: On a good development team you have engineers and product ownership. Engineers ship working solutions. They know they are working because they tested them, delivered, them and observed that their tests were accurate.

Rocco Seyboth: In 5 Minute DevOps you talk about observing what high performing teams do then modeling other teams to the same process and behavior... how do you reconcile that with the belief that every team is different and should have the flexibility to do things their own way?

Bryan: Actually, I advocate against cookie cutter templating of teams in that post. We should standardize on improving outcomes.

Necco-LB: Friends, that's just about the top of the hour. Bryan has a real job that needs to get done, but feel free to keep the questions coming asynchronously throughout the day - he'll be popping in and out to answer them. Bryan - thank you so much for joining our community today and answering our questions!

Bryan: Just some contact links to leave and I want to thank everyone for the conversation. I love talking about these topics.
https://www.linkedin.com/in/bryan-finster/

https://bdfinst.medium.com/

_____________________

Starved for top-level software engineering content? Need some good tips on how to manage your team? This AMA is based on an episode of Dev Interrupted - the go-to podcast for engineering leaders.

Dev Interrupted features expert guests from around the world to explore strategy and day-to-day topics ranging from dev team metrics to accelerating delivery. With new guests every week from Google to small startups, the Dev Interrupted Podcast is a fresh look at the world of software engineering and engineering management.

Listen and subscribe on your streaming service of choice today.

Following a recent interview on the Dev Interrupted Podcast, OutSystems CEO and founder Paulo Rosado joined us to chat about his path to founding the company, advice for successful leaders, and the growing threat of technical debt. The conversation below has been edited for length and clarity. 

_____________________

Tell us about OutSystems' founding story. What inspired you to start the company?

In February 2021, OutSystems was valued at $9.5 billion dollars - but it certainly didn’t start out that way. The idea behind OutSystems was decades in the making, and its mission stems from what I observed after moving to Silicon Valley back in the mid-nineties. 

My journey in technology began when I graduated with a degree in computer engineering from Universidade Nova de Lisboa in Lisbon, Portugal and moved to the US to get my Masters in Computer Science from Stanford. Afterward, while working in Silicon Valley, I began to understand just how much of a problem technical debt was. 

While working on a very large engineering team, we were faced with tackling a gigantic project in Java and I realized the issues of releasing and maintaining code sustainably. The lack of productivity in the software development process was appalling. Fixing this problem is ultimately what motivated me to found OutSystems. 

Before founding OutSystems, there was a small company I founded and later sold, which focused on internet and intranet projects. It wasn’t a bad company, but we kept failing. Projects were never delivered on time or on budget. 

We would think to ourselves, “We’re smart. How is this possible?” Our inclination was to blame the requirements of the project, labeling the scope as incorrect and adjust from there. However, we began to realize that the companies hiring us for these projects wanted us to make changes as we were developing in response to rapidly changing environments. 

The issue we began to face was the continual accumulation of technical debt. We would reach first production and realize we had built something users didn’t want, requiring us to go back and rework the stuff we had just built. 

“We came up with this realization that the problem was not that the requirements up front were wrong. The problem was that the cost of changing wrong requirements, which are a fact of life, is very high.” - on the Dev Interrupted podcast at 6:03

 

This phenomenon was occurring in 90% of projects at the time. Things were always over budget and always late. 

Today, it’s easy to take this for granted because concepts like Agile, DevOps, CI/CD are mainstream. But at the time, you had to build software the same way you build a bridge.  

Why is technical debt a challenge for companies now? How has this problem changed?

Technical debt has become a large problem for businesses, and one that only compounds with time. Tech debt doesn’t have a singular cause - it’s the accumulation of several factors. 

Over the course of my career, I’ve seen first-hand the complexity brought about by the evolution of software development. For instance, we’ve seen an explosion of languages, paradigms and frameworks that can all be used to achieve a solution. Often these languages are dispersed with no connections between them, so tracking these dependencies requires a great deal of sophistication. 

In addition to this, turnover within the development team is a critical problem that leads to technical debt. The moment a company loses a developer, the knowledge accrued by that developer also departs the company. The hole left behind is complex, including code,  frameworks and intent behind how their systems are structured. 

It’s been my experience that a lost team member can take as much as 20% to 30% of the fundamental knowledge of a system with them. Reverse engineering their work is both time-intensive and inefficient. 

Companies have tried to corral this problem by investing in coding standards. While these constraints can help mitigate the loss of a valued developer, our research indicates turnover remains a significant problem. 

OutSystems recently released a study on the effects of technical debt. What were its findings? 

Recently, OutSystems surveyed 500 large companies around the world to examine the cost of technical debt facing businesses and uncover the challenges companies face as they confront its causes. The results from the companies surveyed were many of the same things I’ve observed throughout my career. 

It’s important to note that while the causes of technical debt have largely remained the same, the pace at which technical debt occurs has grown substantially.

And so it's a hack, right? What we call a hack at OutSystems, they did a hack to just release the software quickly. And those hacks compound into technical debt.” - on the Dev Interrupted podcast at 27:11

The survey we conducted isolated three major causes of technical debt. They are as follows: 

  1. The amount of developer frameworks. An increase in frameworks leads to an increase in technical debt. 
  2. Developer erosion. Employees leaving an organization and taking legacy knowledge with them. 
  3. Compromises in quality of architecture and code. Often caused by a shortsighted view that what needs to be done now is more important than long-term stability of the codebase.

In the past, companies believed they could buy their way out of this problem, but that strategy has proven ineffective. The reality is, the most successful companies must build the software they require to meet their business needs. 

Simply purchasing what you need doesn’t solve your problems because even purchased systems must be cobbled together, requiring unique API’s, unique UI’s, unique portals, and unique mobile applications. 

Does OutSystems play a role in helping companies cut tech debt? 

The core of what we do at OutSystems is focused on tackling those three fundamental problems. We understand that technical debt amasses slowly over time, through a myriad of decisions that appear much smaller at their onset than their totality would suggest. Once these “tiny” decisions become a major problem, they inhibit investment in current operations and future innovations. 

The increasing pressures of today’s fast-paced business environment often push companies toward decisions that spiral into technical debt. The good news is that by creating a development process that marries short-term deadlines with long-term strategic goals, it’s possible to “pay down” that debt. 

I believe that any company is capable of whittling away technical debt with the correct tools and processes, and I founded OutSystems because companies shouldn’t have to choose between building fast and building right. 

To learn more about technical debt, how to combat it, and what to expect in the future, you can download the 2021 Technical Debt Report on our website.  

_____________________

Starved for top-level software engineering content? Need some good tips on how to manage your team? This article is based on an episode of Dev Interrupted - the go-to podcast for engineering leaders.

Dev Interrupted features expert guests from around the world to explore strategy and day-to-day topics ranging from dev team metrics to accelerating delivery. With new guests every week from Google to small startups, the Dev Interrupted Podcast is a fresh look at the world of software engineering and engineering management.

Listen and subscribe on your streaming service of choice today.

Dan is the founder of Tellspin, an on-call scheduler in Slack for DevOps and developers (https://tellspin.app). Helping workspaces reduce their contact footprint, resolve incidents faster, and regain deep focus.

Code smell is a way to describe code that hasn’t aged well and has the potential for a lot of issues.

It usually is the source of a lot of hot fixes or workarounds keeping it functional. My most common reflex is to rewrite it. However, if I’m not careful, I’ll waste an entire day and not improve anything.

After a decade of programming, here are my 7 steps to reduce code smell gradually.

Step 0: Admit there is a problem

I start to recognize my code is smelly when I start saying things like “that time only took an hour.”

I’m usually doing something simple, like adding another field to a form or another schedule for a customer. I quickly add in code because it feels like the easiest thing to do and ship the feature. There are so many other things on my plate, I don’t have time for this, I’ll say to myself.

By the 5th or 6th hour I’ve hacked the same spot, I realize, had I rewritten it sooner, I would have actually saved time. 

Step 1: Identify spots to clean

Smelly code is so disorganized.

Is it really smelly or do I just not understand it? It’s very tempting to always default to a rewrite. If I write all the code, I’ll understand it. But who is to say the next person who looks at it will?

Similar to profiling code to identify the slowest spot, I work to identify the place that smells the most. Are there sections of the code that new devs are always struggling with? Are there frequent small changes that require touching lots of different files or methods?

Creating a list of smelly code helps identify which sections of code need the most attention.

Step 2: Pick the worst spot

Smelly code is like dirty dishes.

With a stack of dishes, I’ll plug my nose until I dispose of the rotting food that’s causing the stink. It was easy to blame the whole pile, but for the most part, all of the other dishes are fairly clean. They don’t need immediate attention. The rotting smell came from something I forgot to clean off when I was in a hurry.

When there is a piece of code that’s really rotten, it’s often hidden somewhere in the pile. Maybe an abstraction went too far, spreading a hundred lines of code across dozens of files.

I keep in mind that I need to fix the worst smell; most of the other code is good enough and doesn’t need my immediate attention. 

Step 3: Resist the urge to do everything

Smelly code is never-ending.

Perhaps the hardest part of improving a code base is scoping it to one thing. It’s so liberating to finally get a chance to clean up, that I can easily take it too far. I’ll think, “While I’m at it, I might as well clean up this… oh! and that other thing needs fixing too.” 

Resist! Do not do everything. 

If I try to tackle everything, I’m not going to finish. Even more likely, it’s not going to pass code review. It’s better to do one piece at a time - ya know, like eating an elephant. 

Step 4: Make sure it’s better

Smelly code has edge cases.

Inevitably, in the process of rewriting, I discover why the code was written that way in the first place. I might even stumble across a can of worms. At that point, I realize my not-so-dimwitted co-worker wasn’t as dumb as I thought (or even more likely, I discover I was the one who wrote the code originally 🤦‍♂️).

 After learning all the edge cases, I’ll be tempted to walk away.

Step 5: Don’t immediately give up

Smelly code is messy to work with.

I’m frustrated imagining how far away the current code is from a better solution. I’ve got the code in my head, I know the edge cases, and I’ve got the context. It’s important not to give up as the solution may be right around the corner.

I keep thinking about it while I go for a walk. Maybe even take a break. Solutions often come to me while I’m on walks or in the shower.

Step 6: Use the co-worker bobblehead

Smelly code needs attention.

I steal my co-worker’s bobblehead and explain aloud what I’m doing. In the process, I figure out what I've missed or overlooked. 

If a bobble head isn’t available, I resort to using my actual co-workers. (I’m checking my assumptions by walking them through what I’m thinking step by step.)

Step 7: Publish or throw in the towel

Smelly code can improve.

At the end of my steps I have a complete solution or I’m banging my head on the keyboard. If it’s the first, I push the change and take a breath of fresh air. If it’s the second, I commit it to a branch and plan to revisit another day. Sometimes we can’t have nice things.

Rinse and repeat

The depth I go into each step changes based on complexity or how critical the code is. Sometimes I can run through each of the steps in a few minutes, other times it’s spread out over a few weeks. It really depends on what I’m working on.

Running through these steps helps me gradually improve my code. There’s nothing better than finally getting a fix for some smelly code merged and into production. Sometimes we can have nice things.

Dan Willoughby is the founder of Tellspin, an on-call scheduler in Slack for DevOps and developers (https://tellspin.app). Helping workspaces reduce their contact footprint, resolve incidents faster, and regain deep focus.

_____________________

Starved for top-level software engineering content? Need some good tips on how to manage your team? This article is inspired by Dev Interrupted - the go-to podcast for engineering leaders.

Dev Interrupted features expert guests from around the world to explore strategy and day-to-day topics ranging from dev team metrics to accelerating delivery. With new guests every week from Google to small startups, the Dev Interrupted Podcast is a fresh look at the world of software engineering and engineering management.

Listen and subscribe on your streaming service of choice today.

 

 

Presenting a Dev Interrupted Community AMA - Adam Furtado - Chief of Platform at Kessel Run - Answering your questions about scaling Kessel Run

How will the wars of the future be fought, and who is heading these advancements in technology? Back in 2017, the US Air Force created a program called Kessel Run, which aids war fighters in the realms of DevOps, Agile, and UX, and the head of this project was an analyst by the name of Adam Furtado. In February of 2021, we interviewed Adam on the Dev Interrupted Podcast and shortly afterward hosted an AMA on our community Discord server.

Adam is the Chief of Platform at Kessel Run, and his story of how he almost single handedly led the US Air Force from 1970's software delivery methods to modern DevOps is one of the most incredible episodes of Dev Interrupted we've had. Adam talks about translating engineering to military officials and how he had to shift his mindset from application development to creating a system of systems. Listen to the episode here: 

This Community AMA took place on February 26, 2021 on the Dev Interrupted Discord.

Necco-LB: 📢 📢 Community AMA📢📢   @everyone

Topic: Scaling Agile & DevOps

We're getting started in 15-minutes! Adam Furtado joins us to share his experience and expertise in scaling his organization (Kessel Run) from 5 >> 200+ developers!

Necco-LB: Let's get this thing started! @here

Welcome to our little community Adam! I can honestly say your episode of Dev Interrupted this week was one of the most interesting episodes I've produced.

Adam Furtado: Thanks for having me! I'm happy to hear that.  Fighter jets are inherently cool.

Necco-LB: I don't think anyone can argue with that. To start things off, Adam can you give the community some quick context about Kessel Run? How many developers in your organization, what you’re building, etc.

Adam: Sure thing, KR is an Air Force organization proving that government-led software development will lead to better mission outcomes than outsourcing our software to companies that specialize in building airplanes (and using the same processes for their software).   We build applications for warfighters to more efficiently strategize, plan, execute, task and assess the complexities of air campaigns. We have grown to about 1300 people… I’d guess. 400 of those are developers.

luisfernandezbr: Adam. What are the top 5 tech/dev metrics that you consider important to measure on a dev team? (Not product metrics like MAU, MRR).

Adam: I think they change as an organization changes... but for the most part I love the DORA 4... I think when used properly (and together!) it can tell you quite a bit about where you need to invest in your organization.  The relationships between the metrics are what drives the value and I think often get forgotten about.

Necco-LB: Were you looking at different things (metrics or ways of visualizing work) when KR was smaller vs today?

Adam: For sure. I led most of our app development at first and we were/are an XP shop.  Our teams were always very diligent about pulling the next story from the top of the backlog etc., so we never really had a WIP problem.  When I moved over to lead our platform org, they were using a poorly-executed pseudo-scrum model and all of a sudden all of the DeGrandis/Kersten/Kim stuff I have been reading my whole career started to make a ton more sense.  In building internal services, it was amazing to be able to see why work visualization matters and SEE the constraints building up.  I'm so glad that I made the switch to build the empathy needed to be a more effective leader.

Necco-LB: Sounds like a big jump indeed. I have to say I’m wicked curious about how software development is different from within the military vs. the corporate environments most of us know.

Adam: Traditionally, the DoD was a case study in poor waterfall dev.  Years of requirement development by people very removed from the work, leading to a contract being put in place that could only feasibly be won by a big defense contractor, years of development to "deliver" the "finished" software to be tested by separate government test organizations for a year or so and then "fielded" manually by folks traveling around the world putting CDs in machines. We've proven that all that risk avoidance actually INCREASES risk and we've had it backwards all along.  To biggest thing we focused on early was how to reach Continuous Delivery with the heavy GRC requirements that we have in Defense (and rightfully so).  So we worked with a forward-leaning IT leader in the Air Force to create and pilot the first Continuous Authority to Operate in the DoD.  So instead of an approval to deploy to classified systems at the "end", we got our processes approved so everything coming out of our org was approved to go into those production environments.  That’s prob the most unique thing.

luisfernandezbr: How you measure the evolution of your dev teams? And what initiatives and practices you use to grow them (like Dojo's etc)? What content do you recommend about DeGrandis/Kersten/Kim?

Adam: Deploy Frequency, Lead Time, Mean Time to Restore and Change/Fail Rate... Accelerate is the bible on this one (Forsgren, Humble, Kim)... Phoenix and Unicorn Project for Gene Kim's take on how to transform IT to DevOps approaches in big, slow companies... Making Work Visible by Domenica DeGrandis is a fantastic book on understanding what keeps us for being as productive as possible.... Mik Kersten's Project to Product on increasing flow.  There are a ton of others, but that's a good start.

_____________________

Read the unedited AMA and join in the discussion in the Dev Interrupted Discord here! With over 2000 members, the Dev Interrupted Discord Community is the best place for Engineering Leaders to engage in daily conversation. Join the community >>

Dev Interrupted Discord, the new faces of engineering leadership

_____________________

drdwilcox: Thanks for joining us. During the podcast episode you talked about gaining momentum with some early wins. How did you keep that momentum going?

Adam: We struggled there, to be honest. The early wins were so much easier to "sell" to stakeholders.  "There wasn't an app before... now there is.  So I'm impressed". The first year we were deploying MVPs left and right and were the Belle of the ball.   However in building large scale systems, we started focusing a lot more on our infrastructure, data models, optimization, internally efficiencies... and things that were providing real value- but weren't as visible.  Those things are a lot less interesting to stakeholders. The Government has a very output-centric approach to value.  We have focused on building an outcome-driven organization, so there is always a conflict when discussing what is or isn't valuable.

drdwilcox: I don't think it's just the government, to be honest. I have the same struggles in my private company. Output as defined by Product are sexy, all the other things are not. What was effective for you in getting the stakeholders re-engaged?

Adam: It's still a work in progress, to be honest.  We constantly harp on the risk of NOT transforming in this way.  The 2018 National Defense Strategy hits on this hard and all of our Senior leaders are pushing the same message.  So that has been really helpful.  General Brown, AF Chief of Staff, has done a great job of being clear about where we need to drive, so that allows us a bit of a trump card when we come into contact with someone who is trying to hold back progress.

Necco-LB: That idea of selling to stakeholders is really interesting, especially in the military. What did you have to say or do to convince your higher-up that  the counter-intuitive dev methodologies like releasing more frequently was worth a try?

Adam: We had no support early on.  The incentive process in the military rewards people who follow the rules and work within the system.  We sort of worked quietly off to the side on a project nobody really cared about to prove the value once delivered.  Once we got that delivered… we had MASSIVE dollar savings, so we started to be loud about it.  In fact, we were told not to use the “Kessel Run” moniker by higher ups… we decided to do it anyway and started promoting pretty hard.  By the time our first FastCompany article came out, all those senior leaders changed their tune and now they will say they were supporters all along. I am constantly doing that translation/evangelism work. And in the military, people swap out of positions every year or so generally, so some new person will get dropped in with no idea what’s going on and we need to start over again.  “Nope, the cloud is a real place…” Continuous Delivery has broken every gov process.  The test community doesn’t know what to do or look for… Requirement Managers don’t understand their place.  Configuration Management is just…. Different now.  I don’t need some guy managing a spreadsheet of what versions of software are deployed where.   We are in a weird transition period right now. In a lot of ways those stakeholders are sick of hearing from me.  I'm sure they hear Charlie-Brown-teacher-voice when I try to discuss this stuff at this point.  So we have worked on finding the proper champions in higher up places to do that work for us.  The very top of the Air Force totally gets it.  It's everyone in between who need to keep their head down to keep rising in the ranks.  (Tale as old as time...)

Necco-LB: Geez, what a thing. I can't believe you have the energy to continuously fight these battles within your own organization.

Cartoon of two people "Yeah so - don
Read more about Kessel Run and smuggling DevOps into the Department of Defense

Adam: Someone once told me a story about this dad who brought his family to the beach.  They had this big, pink inflatable bunny that the kids were using in the water.  Every so often the bunny would deflate and the kids would run back up the beach to the dad and he would huff and puff and blow it back up.  Kids would be happy and go back in the water.  An hour later, kids are back again and the dad is blowing it back up.  This person said "that is what innovation in the government is like".  Every once in awhile you need someone or something to "pump up your bunny".   The work I do is so exciting and fulfilling, all the BS that I have to deal with, all the money that government employees are leaving on the table, the bureaucracy ends up being worth it.

Necco-LB: That is a great analogy. Reminds me of how you talked about the mission driven culture at KR on the podcast. Can you talk about why you believe the culture of your organization is so important? And any advice you might have for organizations who are bifurcated?

Adam: Organizational alignment is incredibly important.  One place we have struggled is that we put such an emphasis on teams, that teams built strong individual identities.  They were empowered to solve their problem, but over time became less concerned about other teams' problems.  This was never more evident than working with the ops/platform teams.  The app teams knew what their users needed and all they cared about was meeting their needs.   Meanwhile, we had a whole organization with organizational outcomes that were the priority.  Let to a lack of empathy across teams and the communicate at the seams of teams was challenging. We are still digging ourselves out of that, but one thing we focus on is that mission-driven culture.  All it takes is a day like yesterday, with airstrikes in Syria, to level-set everyone on the seriousness of our work.  The mission aligns the teams towards a common goal and common outcomes.

luisfernandezbr: Adam. Thanks for the great tips. What were the big challenges that you had when increasing the dev team?  Things like knowledge sharing, share learning and maintain quality and excellence. Could you share some tips about this, if it is the case?

Adam: We sucked at all those things.  That mission-driven culture led us down the unenviable path about feeling so much pressure to deliver and support our users, that tech debt mounted and documentation suffered.  We struggled investing in automation in favor of getting short term wins.   The last year we have really rebalanced and ensured that we are providing space for our teams to organize their time better.  None of it was intentional, but regardless of what we said, we (leadership) were giving off the vibe that teams couldn't possibly slow down to invest in tech debt or spend time focusing on automating toil away.  We have had to be super clear that it is EXPECTED that teams work at a sustainable pace, invest in their code bases, invest in their professional growth and personal health and be okay saying "no". We have a lot of military members on our team, so saying "no" to your superiors is always a culture change we have to work on internally.

Necco-LB: Working on technical dept and automation vs. new features is something everyone can relate with for sure. I think we'll let Adam get back to his far more important job at Kessel Run. One last question, if someone here wants to get involved with Kessel Run, where can they go? Should they reach out to you?

Adam: This was fun- thanks for having me.  You can follow me on Twitter at @adamsfurtado or you can reach out directly at afurtado@kr.af.mil.  To follow along with KR, you can follow @kesselrunAF on most social media platform. I'd also like to plug that we are currently hiring for a bunch of roles from product leadership to engineers.  It's an incredible place to work and you can make a real impact.  Please take a look and reach out to me with any questions! Thanks again!

Antonette: Job opportunities at Kessel Run here: https://grnh.se/3201d1713us

_____________________

Starved for top-level software engineering content? Need some good tips on how to manage your team? This AMA is based on an episode of Dev Interrupted - the go-to podcast for engineering leaders.

Dev Interrupted features expert guests from around the world to explore strategy and day-to-day topics ranging from dev team metrics to accelerating delivery. With new guests every week from Google to small startups, the Dev Interrupted Podcast is a fresh look at the world of software engineering and engineering management.

Listen and subscribe on your streaming service of choice today.

Dev interrupted Discover our Most Popular Podcasts - with a variety of headshots from former speakers