Greg Tyler

Weeknotes: 9th December 2019

Published on 15th December 2019

This week I took a look at what we achieved this year.

I’ve had a busy week. On Monday evening we took the plunge and switched a huge swathe of our users across to the new permissions model we’ve been working on. Wednesday was marked by an outage, Thursday was our Christmas celebration, and on Friday we finally traced some issues we’ve known about for months (years?).

It all felt like a lot was going on, particularly given the time of year. It was a scary, tense, stressful week. But I’ve been keen to look at the bright side: how far we’ve come.

In the last few weeks, Complete the deputy report has felt particularly unstable. We’ve had a couple of minor outages, and we’ve come across boatloads of errors. However, I’d argue that this is good news, because those errors were already around earlier in the year and we just weren’t aware of them.

My overriding theme for the last quarter of the year has been improving our monitoring and dealing with the problems that come out of it. These are issues our users have seen every day, but that we weren’t aware of. We now get notified of errors when they happen, letting us be proactive rather than waiting for a bug report.

Similarly, our outage on Wednesday. Without wanting to go into much detail, it was one of those “these things happen” issues that can’t really be predicted. The good news though is that we resolved everything within 15 minutes, and were already fixing it before users noticed.

This is all a bit of a Catch-22 though (or something similar): if we hadn’t made the improvements and weren’t aware of these problems, then we wouldn’t be as concerned with them. Instead, we know about everything that’s wrong and collectively have an uncomfortable view that things have gotten worse.

So, as we approach the end of the year, it’s good to look at what we’ve achieved. Monitoring is better; infrastructure is more reliable; we understand our system better; testing has better coverage; we’ve fixed tonnes of bugs, many of them 4+ years old; CI is effective.

Next week I’ll take an early shot at some 2020 goals, but I truly wouldn’t be able to without the great work of 2019. We’re beyond “fixing the basics” here and into the serious territory of building a great system. I think our main problem now is having too many ideas.

Summary