David Knott 31/10/2024

Are you overfitting your delivery decisions - and to the wrong dataset?

Overfitting is a term from the world of machine learning. It describes the problem that arises when you train a model on a dataset, and that model makes accurate predictions - but only as long as they are limited to that dataset. Unfortunately, the model starts to go haywire when you apply it to the wider world.

Imagine a fictional example: you wish to use data about students to predict their academic performance. You train your model on a class where there are several students called ‘Geoff’, and they all happen to be doing very well in their studies - there’s no reason for this: it’s pure coincidence. But the model doesn’t know that. When you make predictions about this class only, your model seems to work: it finds the high-performing Geoffs. When you use it to make wider predictions, however, in a world where there are rather fewer Geoffs, and where being a Geoff is not a reliable indicator of academic performance, the model stops working.
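The Geoff example can be sketched in a few lines of Python. This is only an illustration: the names, the scores and the helper functions `train_name_model` and `mean_absolute_error` are all invented for the purpose, and the "model" is deliberately naive (it simply memorises the average score for each first name).

```python
from statistics import mean

# Hypothetical training class: (first name, exam score).
# By pure coincidence, all the Geoffs are doing very well.
training_class = [
    ("Geoff", 92), ("Geoff", 88), ("Geoff", 95),
    ("Anna", 61), ("Ben", 58), ("Chloe", 70), ("Dev", 64),
]

# Hypothetical wider world: being a Geoff tells you nothing here.
wider_world = [
    ("Geoff", 55), ("Geoff", 72), ("Anna", 66),
    ("Ben", 81), ("Chloe", 49), ("Dev", 77), ("Ella", 68),
]


def train_name_model(rows):
    """Memorise the average score per first name (an assumed, naive model)."""
    scores_by_name = {}
    for name, score in rows:
        scores_by_name.setdefault(name, []).append(score)
    overall = mean(score for _, score in rows)
    averages = {name: mean(scores) for name, scores in scores_by_name.items()}
    # Names never seen in training fall back to the overall class average.
    return lambda name: averages.get(name, overall)


def mean_absolute_error(model, rows):
    """Average gap between predicted and actual scores."""
    return mean(abs(model(name) - score) for name, score in rows)


model = train_name_model(training_class)
train_error = mean_absolute_error(model, training_class)
world_error = mean_absolute_error(model, wider_world)

print(f"Error on the training class: {train_error:.1f}")
print(f"Error on the wider world:    {world_error:.1f}")
```

On the class it was trained on, the model looks excellent, because it has learned the coincidence. On the wider data its error is far larger, which is the signature of overfitting.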

David Knott 26/01/2023

How do we get what we want from AI systems - and human systems?

I am sure that you have heard of the paperclip problem. Just in case you haven’t, it is the idea that, if you ask an AI system to make paperclips, then it may go on making paperclips, until the whole world is nothing but paperclips. There’s even a fun game based on this concept.

The paperclip problem illustrates how hard it is to set goals for AI systems which represent what we truly want. Unlike us, AI systems do not come ready-equipped with goals and desires: we have to provide them, in the form of what is often known as a reward function.

And crafting this function can be more difficult than it first appears. When I wrote some recent articles on generative AI, it was suggested that I read the book Human Compatible by Stuart Russell. It’s a great book, and triggered lots of other reading: it took me down a rabbit hole of articles and papers about optimisation, particularly a phenomenon known as specification gaming.
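To make the idea of a reward function concrete, here is a minimal sketch of the paperclip problem in Python. Everything in it is an assumption made for illustration: the toy world, the `paperclip_reward` function and the `greedy_agent` are hypothetical, and real systems are far more complicated. The point is only that the agent does exactly what the reward asks, and nothing else.

```python
def paperclip_reward(state):
    """The reward as specified: count paperclips, and ignore everything else."""
    return state["paperclips"]


def step(state, action):
    """Toy world: making a paperclip uses up one unit of resources."""
    new_state = dict(state)
    if action == "make_paperclip" and new_state["resources"] > 0:
        new_state["resources"] -= 1
        new_state["paperclips"] += 1
    return new_state


def greedy_agent(state, reward, actions, steps):
    """At each step, take whichever action leads to the highest reward."""
    for _ in range(steps):
        state = max((step(state, action) for action in actions), key=reward)
    return state


start = {"resources": 5, "paperclips": 0}
final = greedy_agent(
    start, paperclip_reward, ["make_paperclip", "do_nothing"], steps=10
)
print(final)
```

Nothing in the reward says that resources are worth keeping, so the agent spends all of them. What we wanted (some paperclips, and a world left over) was never written down.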

David Knott 14/10/2021

Architecture year one: measuring success

I have a flippant answer when asked how to measure the success of an architecture team: just check how often your phone is ringing. (I realise that this dates my advice - please substitute the messaging app of your choice for ringing phones.) If you and your team are in demand, if you are the first call whenever your company faces a big challenge or opportunity, then you are doing something right. If your phone (or messaging app) is silent, if you have to push to insert yourself into the room with tough problems, then you have more work to do.

However, recently I was talking to a friend and colleague who has just taken on a Chief Architect role in a new company, and who is attempting to design measures of success for their team. In that conversation, I realised that we needed to come up with something a bit more sophisticated than the number of messages on your phone.

What are you optimising for?