Interesting engineering problem #1: survival

4 Apr

*Photo credit: Patrick Perkins on Unsplash*

That’s the prototype: now we just need to turn it into a production application.

The people at the front of the room clap. They have just been taken through a whirlwind of demos, slides and Post-it notes. They have just been shown what the presenter keeps referring to as the art of the possible, and they never imagined that so much was possible. They want the features that they have been shown, and they want them now.

The people in the middle of the room frown. They are wondering where the resources and budget will come from. They are wondering where the work will fit in their portfolio of projects or on their backlogs. They are anticipating the difficult conversation when they explain that the art of the possible probably means a delivery some time next year.

The people at the back of the room look thoughtful. They are scribbling on pieces of paper, and exchanging notes and ideas with each other. They do not regard the code that they have seen demoed as a product; they do not even regard it as a prototype. At best, it is a child’s drawing of what the product might be when it is finished. The real work has not even started: ‘just’ turning it into a production application betrays a fundamental misunderstanding of what this work is.

These people - the architects and engineers - don’t mind (although they often talk as if they mind a lot). Turning half-formed ideas into working products is what they are here to do. Furthermore, they are here to address those under-appreciated capabilities which I have previously referred to as meta-requirements, but which we could also describe as interesting engineering problems. Over the next few weeks, I’ll explore some of these problems.

The first interesting problem is the most basic of all: survival. The enthusiastic person running the demo, who imagines that turning the prototype into a product just involves a few people bashing away in a code editor, almost certainly does not appreciate the environment into which they will push their newly formed app.

Those people who have been solving interesting engineering problems for most of their careers have learnt to treat every production environment as implacably hostile, as a world in which you cannot trust anything or anybody.

Some of the reasons for this are obvious. Most applications live on the Internet these days, and even those which are not directly exposed to it are almost always connected to it somehow. The Internet is full of bad actors who deliberately attempt to steal, compromise or disrupt. Networks we believe to be private also contain insider threats - often from people who don’t even realise that they have been compromised by a scam or by malware.

However, even if we can anticipate and adapt to these deliberate threats, most of the risks we face do not result from conscious action at all; they are the result of dumb luck. Anyone who has worked with computer hardware and systems software knows one thing: they will fail. At scale, the frequency of failure means that something is failing all the time. Handling failure is not the exception: it is a fundamental characteristic of production systems.
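To put a rough number on "something is failing all the time", here is a minimal sketch of the arithmetic. It assumes that components fail independently and with the same probability in a given period, which real systems rarely do, and the function name and the figures (a 0.1% chance of failure per component per day) are hypothetical, chosen purely for illustration.

```python
def prob_any_failure(n_components: int, p_fail: float) -> float:
    """Chance that at least one of n components fails in a given period.

    Assumes independent failures with the same probability p_fail
    (a simplification made for this illustration).
    """
    if n_components < 0:
        raise ValueError("n_components must be non-negative")
    if not 0.0 <= p_fail <= 1.0:
        raise ValueError("p_fail must be between 0 and 1")
    # P(at least one failure) = 1 - P(no component fails)
    return 1.0 - (1.0 - p_fail) ** n_components


if __name__ == "__main__":
    # Hypothetical figure: each component has a 0.1% chance of failing per day.
    for n in (1, 100, 10_000):
        print(f"{n:>6} components: {prob_any_failure(n, 0.001):.2%}")
```

With these made-up numbers, a single component almost never fails on a given day, a hundred components see a failure on roughly one day in ten, and ten thousand components see a failure on practically every day. The exact values matter less than the shape of the curve: as the system grows, failure stops being an event and becomes a background condition.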

And, if attacks from bad actors and the continuous failure of the environment were not enough, the environment is continuously changing. We have to figure out not just how our new application will interact with a complex world of devices, technologies and other apps - we have to figure out how to ensure it will survive in a world of new devices, new technologies and new standards we know nothing about today.

Sometimes the world into which any application will be thrust seems like an apocalyptic dystopia from a fantasy or science fiction novel. The world is teeming, but nobody can be trusted: they are at best indifferent, and at worst hostile. People will lie, cheat and steal, and subvert your good intentions. Even those people closest to you cannot be trusted: they may have their own agenda, or they may be imposters wearing masks. The very landscape is unreliable, and suffers continuous collapses, quakes and other catastrophes. And, even if you can master this landscape and this society, the environment changes every day and the rules change every day.

If you’re one of the people at the front of the room, impressed by the demo, this may help explain why the people in the middle of the room (the product and project managers) are frowning and the people at the back of the room (the architects and engineers) are looking thoughtful and scribbling. They are trying to figure out how to turn the sketchy picture of a set of features and functions into a tough, adaptive application that will survive in the hostile, shifting environment of production. Perhaps this point in the demo is an opportunity to go and hear what they have to say, and learn about all of the interesting engineering problems that they have to solve. The good news is that they are here to solve them, and that the problems are genuinely interesting.

Government has some of the most interesting engineering problems in the world. When I wrote this blog post, I was recruiting for the first ever Chief Engineer for the UK government. That role has been filled, but I’d still recommend considering time in government if you are looking for interesting problems to solve.

risk, architecture, programming

David Knott
