
Engineering Screw Ups: When to Automate

As Co-Founder and CTO, you have to make tough decisions with speed, but sometimes those decisions bite you in the ass. Here's what happened, and the lessons Luca Rossi took from the time he encouraged a massive cluster of technical debt. Luca gave this talk September 30th for INTERACT 2021.

 

Transcription:

Conor Bronsdon:
Hey, everyone. I'm Conor Bronsdon, one of your Dev Interrupted community leaders, and today I'm joined by Luca Rossi. Luca, can you introduce yourself to those of us in the audience who may not be familiar with you?

Luca Rossi:
Hi, Conor. Thank you so much for having me.

Conor:
Absolutely.

Luca:
I write an engineering leadership newsletter called refactoring.club. It goes out weekly to more than 10,000 subscribers.

Conor:
I'm one of them!

Luca:
I know, thank you! I'm Head of Engineering at translated.com, and I was Co-founder and CTO of wanderio.com, which was a VC-backed startup in the travel space in Italy.

Conor:
Fantastic. Well, we're really excited to have you here with us today, and I'm stoked to have you share with the INTERACT audience this engineering leadership screw-up story that I heard you have. Can you tell us a bit about that?

Luca:
Yeah. I mean, it was hard to choose one because we have had so many, so I had to choose the one that means the most to me, and that also has maybe the best lessons learned that I could think of. So this story is about doing something that didn't really scale. It was not a bad thing by itself, but it turned out to be kind of a nightmare in some respects. My startup worked in the travel space and allowed people to compare and book transportation: flights, trains, buses. And for some time we worked with the metasearch approach. So if you searched how to go from A to B, you would get how to go with flights, with trains. And then you went to the transport company's website to book and pay for the tickets, like Skyscanner does, right? But at some point we saw the opportunity to close the funnel within our website, within our app, so that people would make the payment and book the tickets on our own platforms. But we weren't really sure that would be the right call from the business point of view, from the user experience point of view. And it would also be a very intensive engineering activity, because we would have to connect to the booking systems of all of our partners, and there were maybe tens of them. We estimated that it would take several months, like four to five months, to do this. So we decided to just try out whether this would work in a lean way, a "do things that don't scale" way. We just integrated a payment provider, like Stripe, so people could make the payment on the website. But then when a booking came in, we would get a notification and we would go manually to the transport company's website and book the tickets.

Conor:
So is the engineering team manually doing it, or who is mainly doing this?

Luca:
At the beginning, it was, I mean, the three of us founders. Yeah, it's all on founders. When in doubt, it's on founders. And so we released it, and it was actually a great success. We were really happy because people were happy about the feature, and we had released it with a very short time to market, let's say. And we started receiving these bookings, and there weren't even that many of them, because our volumes were small. But the problems started very soon, because even if there were only a few bookings, they could come at any time, during the day or during the night, over weekends. And so we had to work in shifts to manage them. And literally, wherever we went, we had to go with backpacks because... And we all have stories about the strangest situations and places where we had to interrupt what we were doing...

Conor:
What's your number one place where it was like, "I have to go to this now"?

Luca:
I have an incredible memory of me taking out my laptop at, like, 3:00 a.m. in a pub where everyone was, like, half-sleeping, you know, dim lights. And there, at some point, I opened my MacBook and it shone light all over the bar. And, you know, the startup sound of the MacBook, it's like full volume. And everybody is thinking, "What's this guy doing?" And I was booking a flight on British Airways because the customer would depart, like, in the morning.

Conor:
How long did this go on for?

Luca:
Here comes the horror story, because we thought that it would take, eventually, like, four to five months to automate all this stuff.

Conor:
Yeah, tough but you could deal with it.

Luca:
Yes. But instead it took, I think, a year and a half to two years to automate, like, 95% of the bookings. The remaining 5% never went away, basically because of edge cases and things like that. And we ended up managing, like, 50,000 bookings manually this way.

Conor:
That's a lot of time.

Luca:
It's a lot of time. And so at the beginning it was just the founders, then we had to give these shifts to other people on the team. And then we eventually hired dedicated people, like people who took care of customer success and booking desk operations, as we called them.

Conor:
So it helped you get in the market quickly. And yes, there was some success there. But it sounds like there were a lot of trade-offs because of that.

Luca:
Yes, definitely. And I think we made a few mistakes with this, because I think in the end it was the right call for the business. But our... I mean, I think we made two mistakes. The first mistake is that we didn't really plan for what would come next. We did this thing of saying, okay, let's release this in a lean way, but we didn't properly iterate on it from the day after. And so in the end, it wasn't lean. It was just poorly thought out from the day after. And when you work on that after some time, everything becomes harder because you lose the context. You maybe start working on another thing. And it becomes hard to properly prioritize this, because at the beginning, when you have, like, ten bookings per day, it's hard to make the call of saying, let's take five to six months of the engineering team's time to automate something that saves time on ten bookings per day. You know? But when you are in a high-growth environment, there are issues that grow with the growth. 70% of them are quite easy to see, if I have to be honest. But the only way to address them properly is to address them while they're small. Otherwise, it easily becomes too late. And it became too late, I mean, periodically, because we went into a cycle of automating some partners, and then things seemed to be better, and then they eventually got out of control again. And it caused a lot of friction, because the parts of the team that had to deal with the bookings and with support periodically thought that engineering didn't care about them, you know, that they preferred to work on revenue-generating things instead. Indirectly, that was, of course, a revenue-generating part of the company too, because it improves the efficiency and effectiveness of people. So it's been hard, but we eventually got there and we definitely learned from this process.

Conor:
Yeah, it's really interesting to hear you talk about it because it sounds like, yes, it was maybe initially the right call, but it was the decisions afterwards that compounded on each other that made it turn into such a boondoggle.

Luca:
Yes. Yes, definitely. And I think if we had to go back in time, the right way would have been to plan for this early release, but also to plan for maybe different scenarios and how they would play out, and to start working on automation right away. So as soon as we saw that this feature was the right call, we should have made a proper plan, or maybe done part of it in advance, for integrating search APIs from our partners and so on. That would also have given the team, you know, the tranquility, the conviction that we were working on this, even if it would take a long time, and that we cared about this. Even if the roadmap was long, and maybe we wouldn't have been able to do this in four to five months anyway, maybe it would have taken one year, we would have been on it. This would have been the priority in the company.

Conor:
That sounds like it affected team dynamics and team culture and togetherness.

Luca:
Yes, yes. This affected the planning part of the company, the culture, the engagement of people because, I mean, in the end our business was about making bookings. So this was like the core of the business. So we had to get this right, you know.

Conor:
Absolutely. Wow. That is such an interesting and, I think, in-depth story because it gives us so many angles to consider it from: whether it was the right decision initially, how it kind of compounded and changed. Thank you so much for taking the time to tell that, I know our audience is loving this. Luca, thank you for taking the time to join us for INTERACT.

Luca:
Thank you, thank you for having me. It was my pleasure.
