It’s important to remember that investment isn’t a completely altruistic act. While investors clearly want to encourage innovation, a primary motivation is to see a return on that investment. At the end of the day, they’re gambling that your idea will make them money.

This can make investing in true innovation tricky. True innovations are those rare game-changing technologies that revolutionize an industry. They’re notoriously difficult to spot. How often have you heard that people thought Apple would fail when they released the first iPhone or didn’t believe in Facebook when it first went public? True innovation rarely looks revolutionary to begin with. So how do investors spot which ideas are worth the effort?

We spoke with Jason Warner, managing director at Redpoint Ventures, to understand the reasoning behind investments and why investors are so picky.

1. Typical SaaS companies are easy to invest in, but true innovation doesn’t follow the same model

When developers start searching for investment, it can often be discouraging. While investors might not understand the intricacies of every technology company they invest in, they can at least spot the trends. They know and understand how a software-as-a-service (SaaS) company grows.

If a company is growing, it has a very familiar pattern. And so investors can be quite confident that they’ll see a return. They’re much more willing to take a risk and ‘YOLO’ an investment.

“SaaS companies are really well understood in terms of how they grow,” explained Jason. “There is no real investor challenge to understand that if a company is growing 2x and its enterprise sales look good then … [investors] can just ‘yolo’ invest into them. Because they understand what these companies look like … It’s all just Excel spreadsheets.” -On the Dev Interrupted Podcast at 40:29

2. Investors often wait until the first round of funding, but developers need seed funding

If you’re developing a revolutionary piece of technology, then it’s likely that you need investment to get you off the ground. However, it’s difficult for investors to sort the good from the bad. How do they know you’ll be successful without a few years of revenue behind you? It’s a catch-22 situation. You need the investment to get those first few years, but the investors need to see a few years before they’re willing to invest.

Look at how Netflix completely surprised the world. Nobody predicted that it would change how we watch video (least of all Blockbuster, which fatefully ignored the potential). This is a trend that harks back decades. Online shopping, personal computers, the television, even electric light bulbs were all disregarded when they were first conceived.

These industry-changing innovations need investment much earlier than typical SaaS companies. And spotting what works is more of an art than a science.

“[Investors] miss the fundamentals. They can see the ones that are the trends,” Jason said. “It should [then] become obvious in the next round or the round after that from other investors … oh yeah, that is a great company.” -On the Dev Interrupted Podcast at 41:18

3. Developers need to seek out companies like Redpoint for seed investment

If you have a truly new idea, you’ll need to find an alternative to the usual investors. A company like Redpoint, which focuses on giving seed funding, is much more likely to take the time and actually investigate whether your technology will be a success.

It will take longer, of course. And it might not be the full amount you need to get your business started. But it’ll be what you need to begin building a proof of concept, get those first few years under your belt and start pitching to other investors.

“[If you’re] talking to a Redpoint investor, you should be flattered,” Jason explained. “What we’re thinking is that you are a majorly important company in the future. You have the potential to land … If Redpoint invests in you, we want it to basically mean that we think of you as a new primitive on the Internet or in whatever sector that you are in. And other people are going to build upon you.” -On the Dev Interrupted Podcast at 41:35

Listen to the full conversation

If you’d like to learn more about what Jason thinks and how to secure yourself an investment, catch the podcast on our website.

Starved for top-level software engineering content? Need some good tips on how to manage your team? This article is inspired by Dev Interrupted - the go-to podcast for engineering leaders.

Dev Interrupted features expert guests from around the world to explore strategy and day-to-day topics ranging from dev team metrics to accelerating delivery. With new guests every week from Google to small startups, the Dev Interrupted Podcast is a fresh look at the world of software engineering and engineering management.

Listen and subscribe on your streaming service of choice today.

Discover Our Most Popular Podcasts
Join the Dev Interrupted discord

Flow can mean many things, but when it comes to workflow, it usually refers to that feeling, discussed by Mihaly Csikszentmihalyi, of entering a state of intense focus and losing yourself in an activity.

Video games are a great example. They take advantage of this feeling to keep you immersed, which is why it’s so easy for gamers to “lose time” and just get wrapped up. The same feeling usually drives your most productive and best work.

When you manage developers, their workflow should be treasured and valued. That’s why, to improve developer focus, it’s vital to avoid weighing them down with minor interruptions or non-urgent pings. 

“Flow is characterized as this experience where the task that you're doing is perfectly matched to the skills that you have.” -Katie Wilde on the Dev Interrupted Podcast at 7:51

1. Acknowledge that it takes 23 minutes for devs just to get into flow

Did you know that it takes 23 minutes to get into a flow state? For some people it takes even longer. That means that for every question, disruption, email, and interruption that you or your coworkers are subjected to, it could be half an hour of productivity down the drain. We talked to Katie Wilde, VP of Engineering at Ambassador Labs, about how she manages workflow.

“Say you got a Slack ping, and you're like, ‘oh, I'll just ask a question.’ How long does it take you to find the thread again? What's that total interrupt time? It's 23 minutes…that's been measured.” -on the Dev Interrupted Podcast at 11:11
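To make the cost concrete, here is a rough back-of-the-envelope sketch. It assumes every interruption costs a full 23-minute refocus (the figure quoted above) and an eight-hour workday; the function names and interruption counts are made up for illustration.

```python
# Rough estimate of focus time lost to interruptions.
# REFOCUS_MINUTES is the 23-minute figure quoted above; everything else
# (function names, workday length, interruption counts) is an assumption.
REFOCUS_MINUTES = 23


def lost_focus_minutes(interruptions: int, refocus_minutes: int = REFOCUS_MINUTES) -> int:
    """Minutes of focus lost if every interruption costs a full refocus."""
    if interruptions < 0:
        raise ValueError("interruptions must be non-negative")
    return interruptions * refocus_minutes


def lost_share_of_day(interruptions: int, workday_minutes: int = 8 * 60) -> float:
    """Fraction of the workday lost to refocusing, capped at 1.0."""
    return min(1.0, lost_focus_minutes(interruptions) / workday_minutes)


if __name__ == "__main__":
    for count in (2, 5, 10):
        print(f"{count} interruptions: {lost_focus_minutes(count)} minutes, "
              f"{lost_share_of_day(count):.0%} of the day")
```

Under these assumptions, ten "quick questions" a day already eat close to half of a developer's working hours.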

2. Defrag dev calendars

Some interruptions are unavoidable but many of them aren’t. Planning your calendar in a way that works around the needs and workflows of your team is necessary to maximize everyone's productivity. 

For instance, scheduling meetings on days when weekly meetings already occur can help preserve focus time by not disrupting other working days. 

Devs need to communicate with their managers on what times they have available away from normal workflow and then it’s up to engineering leaders to plan around those schedules. As a dev leader, you have to look at your devs’ calendars, not your own, and react accordingly. 

“If you're a manager, when you're scheduling, don't look at your calendar, and then find a time and then see where you can slot the engineer in…look at the engineer's calendar and see, where can you tack the meeting on that it is after another meeting, or it is maybe at the start of the day, the end of the day… and ask them!” -Katie Wilde on the Dev Interrupted Podcast at 12:31
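Katie's advice can be sketched as a small scheduling helper. This is only an illustration under some assumptions: times are minutes since midnight, the workday runs 9:00 to 17:00, and the helper names (`best_slot`, `longest_focus_block`) are hypothetical. It only considers the slots she mentions (start of day, end of day, or right after an existing meeting) and picks the one that leaves the engineer the longest uninterrupted block.

```python
from typing import List, Optional, Tuple

Interval = Tuple[int, int]  # (start, end) in minutes since midnight


def overlaps(a: Interval, b: Interval) -> bool:
    """True if the two half-open intervals share any time."""
    return a[0] < b[1] and b[0] < a[1]


def longest_focus_block(meetings: List[Interval], day: Interval) -> int:
    """Length in minutes of the longest meeting-free gap within the workday."""
    longest, cursor = 0, day[0]
    for start, end in sorted(meetings):
        longest = max(longest, start - cursor)
        cursor = max(cursor, end)
    return max(longest, day[1] - cursor)


def best_slot(meetings: List[Interval], duration: int,
              day: Interval = (9 * 60, 17 * 60)) -> Optional[Interval]:
    """Pick a slot from the engineer's calendar, not the manager's."""
    # Candidate start times: start of day, end of day, right after each meeting.
    starts = [day[0], day[1] - duration] + [end for _, end in meetings]
    candidates = []
    for start in starts:
        slot = (start, start + duration)
        fits_day = slot[0] >= day[0] and slot[1] <= day[1]
        if fits_day and not any(overlaps(slot, m) for m in meetings):
            candidates.append(slot)
    if not candidates:
        return None
    # Prefer the slot that preserves the longest focus block; ties go to the earliest.
    return max(candidates,
               key=lambda s: (longest_focus_block(meetings + [s], day), -s[0]))


if __name__ == "__main__":
    engineer_calendar = [(10 * 60, 10 * 60 + 30), (14 * 60, 15 * 60)]
    print(best_slot(engineer_calendar, duration=30))
```

A helper like this is no substitute for the last part of the quote: once you have a candidate time, ask the engineer.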

3. Suck it up - schedule your work around focus time

When managing large numbers of devs, it can seem like a chore to work around many different schedules or to get meetings done only on specific days. We asked Katie what her trick was for juggling so many different calendars and meetings, and she had one thing to say: “Suck it up.”

Devs are the backbone of software production and it’s important to prioritize their productivity whenever possible. To help them stay on task and be able to really focus on their work, they need to have meetings planned around their day - not yours.

Providing consistency for your devs - meeting them when they are ready, available, and focused - helps them maintain a flow state and maximize productivity. But more than that, it’s the right thing to do. Devs want to build cool stuff, not have their days ruined by their own calendars.   

Katie says it best:

“That might mean that, as the manager, you have a little bit weirder hours. I hate to say this, but kind of suck it up… There's no way to get around that.” -on the Dev Interrupted Podcast at 13:23

Watch the full interview

If you would like to hear more about how managers can work around a developer's schedule, along with other great insights from Katie Wilde, check out the full podcast on your favorite podcasting application: Apple Podcasts, Spotify, Stitcher, or YouTube.

In a typical manufacturing company, a supply chain is the chain of companies that you rely on to make your product. For example, a mobile phone manufacturer buys processor chips from a supplier. That supplier needs to buy a part from another manufacturer. And that manufacturer relies on yet another company for the raw metal.

But what is the software supply chain? And how do you keep it secure? We spoke with Kim Lewandowski, co-founder and head of product at Chainguard, to explain the details.

Your software supply chain is more complex than you think

The software supply chain can be complicated, mainly because it’s difficult to know how far it reaches. Take a simple example: if you use Salesforce to keep track of your customers, you store your customers’ data on Salesforce’s servers. Not a problem, surely? But Salesforce could have a breach. And what about the servers themselves? Those servers might run on Windows. If that has a security bug, hackers have another way in. How about the software that Salesforce uses to host its website? If that is hacked, you have yet another breach.

“When I think of the software supply chain, it’s all the code and all the mechanics and the processes that went into delivering that core piece of software at the end,” Kim explained. “It’s all the bits and pieces that go into making these things.” -On the Dev Interrupted Podcast at 11:28

Keeping the software supply chain secure involves checking who has keys

The important part of keeping your supply chain secure is tracking down everything you’re using and checking that each piece is secure and reliable. Every new third party can be a potential problem. If you don’t do your due diligence, you won’t know what risks you’re taking.

As Kim explained, a favorite analogy of hers is thinking about doing construction work on your own home.

“You have a contractor. Well, they need keys. They have subcontractors. You give the keys out to all their subcontractors. Who are they? Where are they from? What materials are they bringing into your house?” -On the Dev Interrupted Podcast at 12:09

The more third party tools you use, the more out of control it can become

It all comes down to accountability, because your supply chain can spread rapidly. One third-party tool that you use to create your software might rely on five separate third parties. And you don’t know what code they’ve got hidden under the hood. Your keys are suddenly all over the place.

The only way to keep it under control is to remind yourself to check and to do regular audits of the services you use. Kim believes it’s helpful to think of every new tool as a package coming to your home.

“How is your package getting to your house?” Kim said. “What truck is it riding on and who is driving those trucks?” -On the Dev Interrupted Podcast at 12:44
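To see how quickly the keys spread, it can help to walk the dependency graph. The sketch below is purely illustrative: the dependency map and every tool name in it are invented, and in practice this data would come from your package manager or a software bill of materials. It lists everything a product relies on, directly or indirectly, and flags whatever has not been audited yet.

```python
from collections import deque
from typing import Dict, List, Set

# Hypothetical dependency map: each tool lists the third parties it relies on directly.
DEPENDENCIES: Dict[str, List[str]] = {
    "our-app": ["crm-service", "build-tool"],
    "crm-service": ["cloud-host", "auth-lib"],
    "build-tool": ["auth-lib", "compression-lib"],
    "cloud-host": [],
    "auth-lib": ["crypto-lib"],
    "compression-lib": [],
    "crypto-lib": [],
}


def transitive_dependencies(root: str, deps: Dict[str, List[str]]) -> Set[str]:
    """Return everything `root` relies on, directly or indirectly (breadth-first walk)."""
    seen: Set[str] = set()
    queue = deque(deps.get(root, []))
    while queue:
        name = queue.popleft()
        if name in seen:
            continue
        seen.add(name)
        queue.extend(deps.get(name, []))
    seen.discard(root)  # guard against cycles that lead back to the root
    return seen


def unaudited(root: str, deps: Dict[str, List[str]], audited: Set[str]) -> List[str]:
    """Dependencies that nobody has reviewed yet, in alphabetical order."""
    return sorted(transitive_dependencies(root, deps) - audited)


if __name__ == "__main__":
    # Only the two direct dependencies have been audited in this made-up example.
    print(unaudited("our-app", DEPENDENCIES, audited={"crm-service", "build-tool"}))
```

In this toy example, auditing only the two tools you chose yourself still leaves four third parties holding keys.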

Get the full conversation

If you’d like to learn more about the software supply chain, and how to make sure that yours is secure, you can listen to the full conversation with Kim over on our podcast.

Hiring neurodiverse developers can be challenging, particularly for smaller companies that are less experienced at hiring. This isn’t because you need an entirely new process or because neurodiverse people are inherently trickier to interview. It’s that small flaws in your hiring process get exacerbated. Obstacles that cause neurotypical people to stumble become outright blockers to a neurodiverse person.

So we asked Matt Nigh, data engineering manager at UW Medicine, to give his tips on how to make sure your hiring process suits everybody.

“I think there are companies that other organizations could mimic,” Matt explained. “I would look at Google as one of probably the best that I’ve experienced.” -On the Dev Interrupted Podcast at 25:50

1. Interview processes should be conversational

If you use a lot of formal language, jargon and needlessly complicated words, you’ll make it much harder for your interviewee to understand what you want them to do. It also makes the interview artificial and cold, which can lead to unnecessary stress and anxiety in your interviewee. This is true for everybody, but for a neurodiverse developer, the effect can be much more pronounced.

“The most inclusive interview process I ever experienced was at Google,” Matt said. “And the reason I felt they had such an inclusive process is that it was wildly conversational. They were incredibly good at explaining what they were asking and what they were looking for. And to me, it was an incredibly friendly process.” -On the Dev Interrupted Podcast at 24:10

2. Neurodiverse developers prefer straightforward and clear instructions 

When giving instructions, particularly in practical tests, it’s important to make sure that you’re being clear and straightforward. Leaving ambiguity can cause problems, especially for neurodiverse developers. That ambiguity can distract from the actual task at hand. The clearer your instructions, the better you’ll test a developer’s actual skills.

“I would say the reason I failed the system design interview was (and this is an example of what autism will do during an interview) it was the first system design interview I ever had. And I spent half the time trying to understand the language that the individual was using, rather than solving the problem, trying to make sure we’re just on the same page with what we were saying,” Matt said. -On the Dev Interrupted Podcast at 24:40

3. Neurodiverse developers need diverse recruiters, and stick around for longer once hired

Everyone has their own biases. While we should all strive to overcome those, it’s not always possible. The best way to avoid those problems is to make sure your interview team is diverse. Some coping mechanisms and strategies can seem strange to a neurotypical recruiter at first.

For example, someone with ADHD might ask you to repeat points or be typing as you speak. While it could initially look like they’re answering emails or not paying attention to you, it’s more likely that they’re taking notes to make sure they follow your instructions properly. The more diverse your recruiters, the fewer false assumptions you’ll make.

“Most recruiters are used to looking at neurotypical applicants, and they essentially have mental flags that come up with certain things, certain questions or anything like that,” Matt said. “Companies should ask: Do I have inclusive recruiters? So say, for example, at Google, they had incredibly inclusive recruiters. I was recruited by a deaf individual, for example. So this person very clearly understands me and anything that was going on.” -On the Dev Interrupted Podcast at 25:13

4. Neurodiverse developers could be more productive, and are worth changing your processes for

A program at Hewlett Packard Enterprise hired over 30 neurodiverse people in software testing roles at Australia’s Department of Human Services. The initial results from the program seem to suggest that those testing teams are 30% more productive than others, according to the Harvard Business Review article “Neurodiversity as a Competitive Advantage.”

It would seem that, while a neurodiverse person might struggle in some areas, like the social anxiety brought on by an interview, they could excel in others, such as pattern recognition.

Watch the full interview

If you’d like to hear more from Matt on neurodiversity in software development, you can watch the full podcast on our channel.

Over the last ten years, technology has become more sophisticated. Faster. Smaller. More powerful. But it isn’t just our technology that’s evolving at a rapid pace. Our culture, attitudes and politics are all changing, too.

So what could the next ten years look like? How might businesses change to keep up with technology? We spoke with Jason Warner, managing director at Redpoint Ventures, to get his thoughts on the matter.

“Ten years is an interestingly long, but also short time horizon,” Jason explained. “It’s likely we’ll see a complete company cycle, maybe two macroeconomic cycles.” -On the Dev Interrupted Podcast at 10:29

1. Organizations will invest more in compliance and security

There have been a lot of large changes in recent years. People are working from home. Political tensions are high. And almost every device collects data about us. In all these cases, security is important. Securing our businesses, our national secrets, and our private lives.

It all leads to an inevitable conclusion. Jason believes that chief compliance officers will become commonplace, even in small companies. Protecting data is going to become a primary concern, for governments, businesses and people. Because, as the world gets more digital, we’re going to see more and more cyber attacks.

“Trends that I see happening are an increased awareness and investment in things like compliance and security. I think that if companies don’t have a chief compliance officer now, they likely will in the future,” Jason said. “I think it’s interesting when you see the geopolitical environment of how we might have to invest in more sophisticated tooling for national security. But more than that, it’s like understanding that we’re no longer a single micro-geo unit called the United States.” - On the Dev Interrupted Podcast at 11:03

2. Companies will focus on loyalty and subscriptions over one-off sales

The standard business model is outdated. In the past, technology companies sold software: they gave customers the software, and that was the end of the transaction. But now, it’s more about building communities and regular interaction with your customers. It’s about subscriptions, regular payments or even donation models, as seen on popular platforms like Twitch. Software isn’t a product any more. It’s a service.

But almost every company these days is a technology company. Just look at what’s happened to the taxi industry. The model has completely changed, simply because the technology has evolved. The old model won’t completely disappear, but we’ll see more and more industries move into a subscription model as new technology takes over.

“Selling is about adoption first and selling second. Someone’s got to reach for you first,” Jason explained. “Then, they’re going to find a value problem, then they’re going to want to give you money if they’re finding utility out of you.” -On the Dev Interrupted Podcast at 11:21

3. Hardware is, and always will be, just as important as software

With every new innovation, we place more demands on the hardware we’re using. The more advanced our software becomes, the more powerful our hardware must be. But right now, most companies rely on international trade to build key components. With tensions rising, it’s likely that we’ll see companies begin to bring these resources closer to home, securing their supply chain in the process.

“There’s interestingly a lot more emphasis on investing in hardware again,” Jason said. “And America in particular owning its hardware manufacturing, which I think is obviously good.” -On the Dev Interrupted Podcast at 11:41

Watch the full interview

If you’re interested in what else Jason had to say about the next ten years, and what challenges society faces, you can watch the full podcast on our site.

At Netflix, we don’t just think about productivity - we engineer it. There’s an entire team within Netflix dedicated to productivity. I lead the Develop Domain along with my Delivery and Observability Domain peers, and together, we make up Productivity Engineering.

I recently sat down with the Dev Interrupted podcast to discuss all things productivity, how I run my team, and how other managers should view employee success. Here’s how we think about it at Netflix:

Can productivity be engineered?

In short, yes! Productivity is not a generic term for team performance or a perfunctory buzzword used during team meetings. The productivity team is an actual organization. The work we do is foundational to Netflix’s development teams. Productivity Engineering lives within the broader, central Platform organization.

The role of the Productivity Engineering team is simple: we exist to make the lives of Netflix developers easier. By abstracting away the various “Netflix-isms” around development, delivery, and observability, we give devs more time to focus on their domain of expertise.

“We are sort of like the nerds’ nerds, if you will, enabling them to use our platforms and tools so that the work that they're doing is focused on studio and streaming, without thinking about everything that's under the hood.” - On the Dev Interrupted Podcast at 2:31

With the recent addition of Gaming to the list of Netflix’s pursuits, the resulting focus becomes even more important.

Practically speaking, it’s the role of Productivity Engineering to help with things like coding, testing, debugging, dependency management, deployment, alerting, monitoring, performance, incident response, to name a bunch. Netflix utilizes the concept of a “paved road,” the frameworks, platforms, apps, and tools we build and support to keep our devs rolling. The idea is to keep workflows streamlined and enable developers to operate as efficiently and effectively as possible. If the road ahead is cleared of obstacles, you’re going to get to where you need to go faster and with support along the way. 

It’s also about helping developers enjoy the ride. To abuse another metaphor, a sound engineering experience should be like dining at a fine restaurant. If done right, you rarely remember the waitstaff, have a hard time finding something you like, or worry about how they prepared the food; you simply enjoy the experience. If Productivity Engineering is doing their job, they act as the restaurant and waitstaff with developers as the customer, providing nothing short of a beautiful end-to-end experience. 

Measuring Outcomes vs. Output

Measuring all of that productivity can be hard, and there’s no one unicorn measurement to rule them all. Hence, developer productivity teams should focus on impact and outcomes. Above all, Netflix focuses on customer satisfaction. Our philosophy is that while how something is delivered is important, the impact of what’s delivered is ultimately of greater importance. 

"If you're running around a track super-fast, but you're on the wrong track, does it matter? So really, what are you delivering? How you're delivering is important. But if that thing that you're delivering is ultimately doing what you want it to do, that's the most important thing." - On the Dev Interrupted Podcast at 5:05

In this model, the outcome always wins over output or activity. For instance, standard productivity deployment metrics (DORA) as applied to our customers become an important proxy for measuring our success. Key Performance Indicators (KPIs) for productivity are viewed as a reflection of a team’s performance as it relates to customer satisfaction.

I’m a big fan of the SPACE framework, developed by Nicole Forsgren, for precisely this reason. How are our customers doing in terms of Satisfaction, Performance, Activity, Communication, and Efficiency? The answer to those questions reflects how we’re doing as a Productivity organization.

"This is our strategy, these are our hypotheses around, how we're going to improve our customers' productivity. Are those things paying off? And if you can't measure them in some way, who knows? Right? So yeah, we're getting a little more hardcore about this." - On the Dev Interrupted Podcast at 24:17

Key metrics provide productivity teams with a holistic view of performance by establishing benchmarks. Everything needs to be viewed within the proper context, but it’s difficult to improve as an organization if nothing is measured or tracked.
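For readers who have not worked with the DORA deployment metrics mentioned above, here is a minimal sketch of how three of them (deployment frequency, lead time for changes, and change failure rate) could be computed from a list of deployment records. This is not how Netflix measures anything: the `Deployment` record, the function names, and the sample data are all invented for illustration, and the fourth DORA metric, time to restore service, is left out because it needs incident data.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import median
from typing import List


@dataclass
class Deployment:
    committed_at: datetime  # when the change was committed
    deployed_at: datetime   # when it reached production
    caused_failure: bool    # whether it led to an incident or rollback


def deployment_frequency(deploys: List[Deployment], days: int) -> float:
    """Average number of deployments per day over a window of `days` days."""
    return len(deploys) / days


def median_lead_time_hours(deploys: List[Deployment]) -> float:
    """Median time from commit to production, in hours."""
    return median((d.deployed_at - d.committed_at).total_seconds() / 3600
                  for d in deploys)


def change_failure_rate(deploys: List[Deployment]) -> float:
    """Share of deployments that caused a failure in production."""
    return sum(d.caused_failure for d in deploys) / len(deploys)


# Made-up sample: four deployments by one team in one week.
SAMPLE = [
    Deployment(datetime(2024, 1, 1, 9), datetime(2024, 1, 1, 15), False),
    Deployment(datetime(2024, 1, 2, 10), datetime(2024, 1, 3, 10), True),
    Deployment(datetime(2024, 1, 4, 8), datetime(2024, 1, 4, 10), False),
    Deployment(datetime(2024, 1, 5, 12), datetime(2024, 1, 5, 22), False),
]

if __name__ == "__main__":
    print(f"Deployments per day: {deployment_frequency(SAMPLE, days=7):.2f}")
    print(f"Median lead time (hours): {median_lead_time_hours(SAMPLE)}")
    print(f"Change failure rate: {change_failure_rate(SAMPLE):.0%}")
```

Numbers like these are activity and output measures. In the spirit of the section above, they are best read as a proxy alongside satisfaction data, not as a scoreboard.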

Comparing Productivity 

Comparing developers’ productivity across teams is a thorny subject at best and downright dangerous for team morale at worst. As the old saying goes, “Comparison is the thief of joy,” or as I typically say, “comparisons lead to unhappiness,” or with my kids, “eyes on your own paper!”

The productivity teams at Netflix take a contextualized view of dev teams rather than relying solely on raw data. Every project is different, the customer base is different, the use case is different, personas are different, and where a team is within the software development life cycle is different.

It’s a basic understanding that comparing apples to oranges is not good math. A team that is just starting out and building something new is going to look very different from a team with a mature product. By recognizing this, it becomes almost impossible to rank teams against each other because very rarely, if ever, will teams be doing the same thing, in the same space, the same way, with the same people.

Even a measurement of an outcome pertaining to customer satisfaction (CSAT) is not straightforward. At Netflix and across the industry, we’ve found that satisfaction for internal teams skews lower than satisfaction for customer-facing teams.

The reason? Teams within Netflix are their own harshest critics. When attempting to gauge the performance of an internal team vs a customer-facing team, it’s understood that the internal team is almost always going to score lower on satisfaction, even if both teams are equally effective. 

Context is everything. Measuring productivity means being mindful of context. 

Pushing Productivity 

Any company that wants to be successful must understand how to measure its success. Productivity doesn’t count for much if an organization is not moving towards desired outcomes. 

By viewing productivity as more than just a concept or a raw set of data, the hard-working teams at Netflix have turned productivity into an actual apparatus. It is a living, breathing team of human beings whose devotion to empathetic efficiency improves customer satisfaction and dev team quality of life. I am incredibly proud to lead these teams, and I sincerely hope the work we do inspires other organizations to improve their developers’ experience.

And if you want to be as productive as Netflix, remember that metrics are only as good as their context! 


If you enjoyed this article and you would like to learn more about the work that I do at Netflix, I invite you to come join me at INTERACT on April 7th.

This will be the second time that I have sat down for a panel discussion hosted by Dev Interrupted. I love being a member of the Dev Interrupted community because they are such an amazing resource. If you are a team lead, engineering manager, VP or CTO looking to improve your team, come to INTERACT and check out the community - I promise you will learn something.

Pretend you are watching your favorite show on Netflix: Sit back, relax & watch as I share the stage with other amazing engineering leaders from places like Slack, Stack Overflow, American Express, Outsystems, Drata & many more.

Chaos Engineering might sound like a buzzword, but take it from someone who used to joke that his job title was Chief Chaos Engineer (more on that later): it is much more than buzz or a passing fad. It’s a practice.

The world can be a scary place, and more and more companies are turning to Chaos Engineering to proactively poke and prod their systems. In doing so, they are improving their reliability and guarding against unexpected failures in production and unplanned downtime.

During my career I dealt with my fair share of outages, including one that caught me mid-song during a bout of karaoke and far too many that woke me up at 02:00. As the co-founder and CTO at Gremlin, I do my best to make sure no other engineers have to suffer sleepless nights worrying about their product. 

But the question remains, what is Chaos Engineering and where did it come from?

A Short History

The spiritual predecessor to Chaos Engineering is often called by a much more widely recognized name: disaster recovery. The focus when this practice was introduced was much the same as it is today: proactively suss out production problems by injecting failure.

Netflix’s Chaos Monkey is probably the best-publicized Chaos Engineering tool, as it arguably kickstarted the adoption of Chaos Engineering outside of large companies, but this has led to the erroneous belief that Netflix invented the practice. In fact, the practice was already widely in use amongst the titans of technology.

Over a decade ago, during my time as a Lead Software Engineer at Amazon, we implemented several crude practices designed to inject failure into our systems. The most rudimentary of these was employed by a man called Jesse Robbins, who earned the nickname “Master of Disaster” by running through data centers pulling out cables.

Let’s just say the practice has evolved a lot since those early days, and your data center cables are much safer now.

What is Chaos Engineering?

“What Chaos Engineering really is, is the art, if you want to call it that, of introducing controlled chaos.” - 2:16 on the Dev Interrupted podcast

At its core, Chaos Engineering is a disciplined approach to identifying potential failures before they have an opportunity to become customer-facing outages.

It is a practice that lets you safely test your assumptions about how your systems will behave under duress by actually exercising resilience mechanisms in a controlled fashion. You literally "break things on purpose" to validate and build resiliency. The end goal of Chaos Engineering is not to inject arbitrary failure into a system, but rather to strategically inject turbulence to enhance the stability and resiliency of your systems.
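To make the idea concrete, here is a minimal Python sketch of a controlled experiment. It is a toy, not how Gremlin or any real tool works: the `Replica` and `Service` classes and the `run_experiment` function are hypothetical names invented for illustration, and the "failure" is just a flag flipped in memory. What it shows is the shape of the practice: state a hypothesis, check steady state, inject one failure, observe, and always roll back.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Hypothetical stand-in for one instance of a service."""
    name: str
    healthy: bool = True


@dataclass
class Service:
    """Toy service that answers from the first healthy replica (failover)."""
    replicas: list = field(default_factory=list)

    def handle_request(self) -> str:
        for replica in self.replicas:
            if replica.healthy:
                return f"ok from {replica.name}"
        raise RuntimeError("no healthy replicas")


def run_experiment(service: Service, victim: Replica, requests: int = 100) -> bool:
    """Inject one controlled failure and report whether the hypothesis held.

    Hypothesis: with `victim` down, every request still succeeds.
    """
    # 1. Confirm steady state before breaking anything (raises if already broken).
    service.handle_request()
    # 2. Inject the failure, 3. observe the system, 4. always roll back.
    victim.healthy = False
    try:
        for _ in range(requests):
            service.handle_request()
        return True
    except RuntimeError:
        return False
    finally:
        victim.healthy = True


primary, backup = Replica("primary"), Replica("backup")
print(run_experiment(Service([primary, backup]), victim=primary))  # True: failover absorbed the failure
```

The `finally` block is the important design choice here: whatever the experiment finds, the injected failure is undone, which is what keeps the chaos controlled.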

How Chaotic is Chaos Engineering?

I always tell people that Chaos Engineering is a bit of a misnomer because it’s actually as far from chaotic as you can get. When performed correctly, everything is under the control of the operator. That mentality is the reason our core product principles at Gremlin are: safety, simplicity and security. True chaos can be daunting and can cause harm. But controlled chaos fosters confidence in the resilience of systems and allows operators to sleep a little easier knowing they’ve tested their assumptions. After all, the laws of entropy guarantee the world will consistently keep throwing randomness at you and your systems. You shouldn’t have to help with that.

How do I Start?

One of the most common questions I receive is: “I want to get started with Chaos Engineering, where do I begin?” Unfortunately, there is no one-size-fits-all answer. You could start by validating your observability tooling, ensuring auto-scaling works, testing failover conditions, or one of a myriad of other use cases. The one thing that does apply across all of these use cases is this: start slow, but do not be slow to start.

What I mean by this is to start testing across just a few nodes rather than impacting your entire fleet. We refer to the number of systems impacted as the “blast radius,” and we highly recommend starting with a small blast radius and increasing it over time.

By starting small you allow yourself to gain confidence in both the experiments you are running and your systems. Of course, your risk tolerance is also a factor in how large a blast radius your organization will use.

For instance, a large banking institution with millions of customers has a much lower risk tolerance than a tech startup with a couple hundred customers. The bank would want to run experiments in a programmatic way and would need to be very explicit about communicating to the rest of the organization what tests are going to be run and when, to avoid any unplanned 2am or 3am disasters.

Eventually you want to get to the point where all of this is automated, a process we refer to as “continuous chaos.” Starting small with automation could be something as simple as taking out a single node; then taking out five nodes; then ten; and so on. Eventually you automate the process at a level you are comfortable with.  
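As a rough sketch of what that ramp-up could look like in code, the Python below grows the blast radius step by step and stops as soon as a health check fails. Everything here is an assumption made for illustration: `ramp_blast_radius` is a hypothetical helper, and the `inject`, `restore` and `healthy` callbacks stand in for whatever tooling you actually use to take nodes out, bring them back and check your system.

```python
from typing import Callable, Sequence


def ramp_blast_radius(
    nodes: Sequence[str],
    steps: Sequence[int],
    inject: Callable[[Sequence[str]], None],
    restore: Callable[[Sequence[str]], None],
    healthy: Callable[[], bool],
) -> int:
    """Run one experiment per step size; return the largest size that passed.

    Stops at the first step where the health check fails, so the blast
    radius never grows past what the system has already tolerated.
    """
    largest_passed = 0
    for size in steps:
        targets = list(nodes[:size])
        inject(targets)
        try:
            ok = healthy()
        finally:
            restore(targets)  # roll back even if the health check raises
        if not ok:
            break
        largest_passed = size
    return largest_passed


# Simulated fleet: pretend the system stays healthy with up to 5 nodes down.
fleet = [f"node-{i}" for i in range(20)]
down = set()
largest = ramp_blast_radius(
    fleet,
    steps=[1, 5, 10],
    inject=lambda targets: down.update(targets),
    restore=lambda targets: down.difference_update(targets),
    healthy=lambda: len(down) <= 5,
)
print(largest)  # 5: the ten-node step failed the health check, so the ramp stopped
```

The step sizes mirror the one, five, ten progression above. In practice you would choose them, and the health check, to match your own risk tolerance.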

“Ultimately you want to be able to handle any of this random chaos being thrown at you, because that's what the world is, it's entropy, it's degradation” - 7:35 on the Dev Interrupted podcast

No Tolerance for Downtime

When I founded Gremlin, it was just my co-founder and me developing the first iteration of the product. The business looked very different then and I jokingly referred to myself as the “Chief Chaos Engineer” responsible for implementing code that was mostly used by enterprise companies. Many of these companies came to us because they had reliability requirements thrust upon them by the US government or they had top-down reliability standards and they wanted a tool to help them shore up their systems.

As the company began to evolve, so did the customer base. These days it’s not just Fortune 500 companies that care about reliability, it’s everybody. Planned downtime is a relic of days gone by. It is no longer acceptable to espouse planned maintenance windows as part of development lifecycles, and customers don’t have the patience for products they rely upon to spend any time unavailable. Companies recognize this dynamic, and it’s a hard one to miss.

Seemingly our appetite for technology has gone up exponentially while our ability to stomach downtime has drastically decreased. Customers expect that your product is always working, always running. If your product is down because of outages, then there are ten other similar products waiting in the wings to take your customers’ money.

Making Lives Better

Visibility is high these days and companies don’t need the publicity that comes with making any unforced errors, let alone to be subject to errors not of their making. No one wants to be blown up on Twitter because their product isn’t working or because one of their downstream dependencies or their cloud provider had an unexpected outage. 

By preparing for the worst, we can be at our best as an industry and can be prepared when disaster eventually comes knocking. That way, when an unexpected outage occurs or there is a production failure, customers will never even know it happened.

I often joke that we are the engineers’ engineers because many of us know that feeling of being jolted from a dream at 03:00 by our pagers, groggily wiping our eyes and whipping out the laptop to go dig through a sea of monitoring dashboards and logs. It’s not fun and it’s exactly why I founded Gremlin. Because there is a better way to approach operations than merely sitting back on our haunches and waiting for the next outage. Chaos Engineering not only helps to protect against the randomness of the world, but also teaches people how to build more reliable software. And if enough people build more reliable software, we build a more reliable internet.

_____________________

Starved for top-level software engineering content? Need some good tips on how to manage your team? This article is inspired by Dev Interrupted - the go-to podcast for engineering leaders.

Dev Interrupted features expert guests from around the world to explore strategy and day-to-day topics ranging from dev team metrics to accelerating delivery. With new guests every week from Google to small startups, the Dev Interrupted Podcast is a fresh look at the world of software engineering and engineering management.

Listen and subscribe on your streaming service of choice today.

Continuous Delivery isn’t about how fast you can deliver, it’s about the outcome your delivery achieves. Bryan Finster, author of the 5-minute DevOps series and founder of the DevOps Dojo, joined our Dev Interrupted Discord community to answer your questions about outcome-based development, continuous delivery, and why failing small is better than failing fast. 

Bryan is currently a Distinguished Engineer at Defense Unicorns but has also worked for Walmart as a systems analyst and eventually became a staff software engineer for Walmart Labs. He previously appeared on the Dev Interrupted Podcast to talk further about these subjects, as well as the most common pitfalls dev teams find when trying to optimize their delivery process.

This Community AMA took place on January 8, 2021 on the Dev Interrupted Discord.

Necco-LB: 📢📢 Community AMA📢📢   @everyone 

Topic: Outcome-based Development with @BryanF (Bryan Finster)

Bryan, thanks for joining us today!

Bryan Finster: Thanks for having me!

col: Bryan... great quote. "A developer is a business expert who solves problems with code." Thank you. Tremendous concept.

Bryan Finster: Thanks. That's who we are. We aren't Java spewing legos. If we don't understand the business, the code won't.

Rocco Seyboth: YES!! @col Love it. @oriker says "a business decision is made with every line of code"

Bryan: Exactly. How does this change improve the bottom line? Even more, how does it improve the lives of our customers?

Necco-LB: We really enjoyed having you on the podcast to talk about Outcome-based development and what continuous delivery should be trying to achieve. I was hoping you could explain to us what Outcome-based development means?

Bryan: It's just focusing on the outcomes. It's pointless to focus on how we do things if the outcomes are poor. It's also about Hypothesis Driven Development. The act of defining the expected value before we attempt to deliver it and then measuring for that value. Instrumenting the application to see how close we get so we can adjust. I frequently see people just being feature factories, pounding out changes that no one needs. That just costs money and increases support. We should be deliberate about what we do and say "no" when the value isn't obvious.

Cocco: When it comes to delivering value to the customer sooner, what things do you commonly see teams worrying about that they perhaps shouldn't (or not worry about, when they should?)

Bryan: "I can't release this! It's not feature complete!" No, get the incomplete change out there and make sure it doesn't break anything.

Necco-LB: You mentioned during the podcast that Pride is the best metric ever. Can you explain that a little bit?

Bryan: If I own the business problem, own the solution, own how to make it better, own the outcomes and see people getting value from my work, then I have pride in what I do. I want it to be good. I want it to be secure and stable and I want to continuously improve it.

Necco-LB: When you talk about outcome-based development you often talk about the things that need to happen before hands touch the keyboard. What are some of those things?

Bryan: We need to understand the value we are trying to deliver and we need to define how we expect to deliver that value at the detail level. It's not enough to write a vague user story. We need testable outcomes that we agree should deliver that value. Behavior Driven Development is the most effective tool I've found for that. We also need to make sure we aren't trying to deliver ALL of the value at once. What if we are wrong? We usually are, statistically. So, what is the smallest, highest value thing we can deliver to find out? Sometimes the right answer is to stop at that point. Invest in the outcomes, not the plan or the work.

_____________________

Read the unedited AMA and join in the discussion in the Dev Interrupted Discord here! With over 2000 members, the Dev Interrupted Discord Community is the best place for Engineering Leaders to engage in daily conversation. Join the community >>

Dev Interrupted Discord, the new faces of engineering leadership

_____________________

Cocco: What patterns/trends do you see in teams who can deliver the outcomes they want? (Are there common factors in teams you've seen that move from struggling -> successful?)

Bryan: Yes. Actual continuous delivery and product ownership. They can deliver small changes daily and they have ownership of what those changes are. They have the safety to challenge things without fear and they are not pushed so hard that there is no time to think of better ideas. Software development is a mental activity, not typing.

Necco-LB: You work with a lot of different teams at the DevOps Dojo. What are some of the most common pitfalls preventing a team from optimizing their delivery process?

Bryan: They are given the wrong problems to solve. They are asked to solve stupid problems like "How many changes did you make today?" and "How many stories did you complete this sprint?" They don't know how to work as teams because they are incentivized to work in silos. So, requirements are poorly defined, testing suffers, speed suffers. They need to be solving the business problem. What is measured will change. Be careful what and how you measure.

Necco-LB: What are some first steps a team can take if they want to become more outcome focused?

Bryan: Focus on the business problem and get close to the user. Empathize with them and what value they need. This really applies to anything. If you don't respect your customer, you won't need to worry about them for very long.

Necco-LB: What is the role/responsibility of the developer in this outcome-based development model?

Bryan: On a good development team you have engineers and product ownership. Engineers ship working solutions. They know they are working because they tested them, delivered them, and observed that their tests were accurate.

Rocco Seyboth: In 5 Minute DevOps you talk about observing what high performing teams do then modeling other teams to the same process and behavior... how do you reconcile that with the belief that every team is different and should have the flexibility to do things their own way?

Bryan: Actually, I advocate against cookie cutter templating of teams in that post. We should standardize on improving outcomes.

Necco-LB: Friends, that's just about the top of the hour. Bryan has a real job that needs to get done, but feel free to keep the questions coming asynchronously throughout the day - he'll be popping in and out to answer them. Bryan - thank you so much for joining our community today and answering our questions!

Bryan: Just some contact links to leave and I want to thank everyone for the conversation. I love talking about these topics.
https://www.linkedin.com/in/bryan-finster/

https://bdfinst.medium.com/

_____________________

Starved for top-level software engineering content? Need some good tips on how to manage your team? This AMA is based on an episode of Dev Interrupted - the go-to podcast for engineering leaders.


To fight the wars of the future, the US Air Force tasked a small group of software engineers with a simple job: revolutionize the way the military thinks about software development.

The group tasked with this not-so-tiny problem came to call themselves “Kessel Run” after the famed smuggling route used by Han Solo in Star Wars.

Since starting in 2017, the team at Kessel Run has expanded to include over 1,300 people across multiple locations, helping build, test, deliver, operate and maintain cloud-based infrastructure and warfighting software. These applications are used by airmen worldwide and represent the future of warfare.

That’s because the wars of the future will be fought with software and system architecture as much as any other weapon.

 

What is Kessel Run? 

Han Solo smuggles DevOps in the Department of Defense.

“[Kessel Run] was kicked off about four years ago as a way to prove that the Department of Defense didn't have to be terrible at building and delivering software, regardless of being within the world's largest bureaucracy.” - Adam Furtado, on the Dev Interrupted podcast at 1:35

As an Air Force organization, Kessel Run delivers a wide variety of mission capabilities to warfighters around the world, utilizing industry best practices around DevOps and Agile. At the time of its inception, it represented such a radical departure from the normal way of thinking within the Department of Defense (DoD) that people joked it would have to be “smuggled” into the DoD.

That’s how Kessel Run came to earn its name - a scruffy team outfitted with a mission to upend a stodgy and cumbersome bureaucracy. 

A shift in thinking needed to start with culture. The team at Kessel Run decided to bring a startup-like mentality to the behemoth that is the federal government, with a goal of introducing modern software methodologies at scale. Pockets within the DoD were practicing things like continuous delivery, but prior to Kessel Run, attempts to adopt modern software principles had largely failed. Warfighters weren’t getting the capabilities or tools they needed.

Problem Solvers

One of the biggest institutional problems that Kessel Run was tasked with trying to improve was the Air Force’s Air Operations Centers. Spread around the world across twenty-two locations, these organizations manage all the details involved in fighting an air war. Everything from strategy, to planning, to tasking aircraft to perform certain actions, to providing real-time intelligence data and feedback, is handled at Air Operations Centers.

The challenge was modernizing these centers while maintaining operational readiness and current hardware, much of which was 20 to 30 years old. All of the hardware across these locations came with its own integrated software, built from various third-party sources over decades.

To tackle this challenge the team at Kessel Run applied the principles of Gall’s Law, which states that all complex systems that work evolved from simpler systems that worked. 

By starting small and focusing on rapidly achievable solutions, they began to see the network effects of their actions. Small, precise fixes can have tremendous impacts on an organization and are less prone to failure than attempting systemic change overnight.

 “So we knew that using Gall’s Law in history, that we needed to start small, in order to make this work. We couldn't just have a big bang approach to replace this entire system. Right? You did that by chipping away at some core parts of the system from a user functionality perspective.” - Adam Furtado, on the Dev Interrupted podcast at 13:31

Practical Success

One of the first small changes achieved by Kessel Run was with the Air Force’s air refueling program.

A remarkable acrobatic feat performed at more than 20,000 feet above ground, at speeds close to 400 miles per hour, replenishing the fuel of an aircraft is dangerous but necessary work. Every day, fighter jets and bombers rendezvous with tanker aircraft to perform air-to-air refueling before continuing on with their mission.

Optimizing the details of such a delicate dance would be difficult, but the folks at Kessel Run believed they could do it. First, they needed software engineers. One of the problems of developing software in the federal government is a lack of engineers. Or rather, a lack of native engineers that can be found in-house. Historically speaking, the government outsources everything to contractors.

Scrounging the Air Force for active duty software engineers scattered across separate programs, Kessel Run was able to stitch together its own homegrown software engineering team.

With their mission in hand, they set to work building an initial application nicknamed “Jigsaw” to improve the air refueling process. By optimizing every aspect of the process from the timing, to the altitude, Jigsaw became an enormous success. Within a year of implementation the Air Force was saving $12.8 million a month on fuel. 

Refueling jets is expensive.

Tiny, targeted successes like these continued. But Kessel Run was up against more than just inefficient programs. 

A New Way of Thinking

Changing company culture is notoriously difficult. Changing culture inside the world’s largest bureaucracy is as hard as it gets. 

The most difficult problem that Kessel Run had to tackle wasn’t the lack of software developers, or the difficulty of integrating third-party software applications, or figuring out how to optimize and build combat applications. It was how to communicate with their peers in the DoD.

Part of the difficulty was due to the security implications of such work. The production environments are all on classified systems, making things like cloud implementation and tooling availability difficult.  

However, navigating the business side of the DoD was always the most challenging part. In the past 30 years the government has spent over a billion dollars trying to update its systems to provide the best capabilities possible to warfighters in order to prepare for a war that may never happen.

Until Kessel Run, the government didn’t have much to show for its efforts. A perception existed that new software methodologies and practices were just the next iteration of technologies that overpromised and underdelivered. It took a lot of trust to explain that doing something in a more agile way or using DevOps would actually reduce risk and increase success for the organization.

“The problem we have is we go and talk about how deployment frequency is going to buy down risk for us. That sounds counterintuitive to everybody in the world, particularly in a military environment, where they're like, ‘What do you mean? Change is scary. I don't change stuff.’ So we're having these kind of counterintuitive conversations around why moving to this way of working is less risky and increases our chances of success.” - Adam Furtado, on the Dev Interrupted podcast at 6:39

Solving that problem came down to nothing more than old-fashioned relationship building. It took years of evangelism and continued success, but eventually Kessel Run started to win the approval of the right people in the right places.

Proof is in the Pudding

From starting as an organization with only 5 software engineers, to expanding into a program that currently has over 1,300 people, Kessel Run has proven itself to be an ingenious concept: bring startup culture to an old organization in need of modern ways of thinking.

Government has never been a place that attracted top technology talent, but with Kessel Run that has changed. They provide access to the newest technologies, competing with some of the best companies in the industry.

They do have one ace up their sleeve when it comes to hiring: fighter jets. And those are pretty cool. 

If you want to learn more about the history and story of Kessel Run, consider listening to the Dev Interrupted podcast featuring Adam Furtado, Kessel Run’s Chief of Platform. 

Dev Interrupted is a weekly podcast featuring a wide array of software engineering leaders and experts, exploring topics from dev team metrics to accelerating delivery. 

_____________________
