- agents
- AI
- ambiguity
- architecture
- augmented reality
- books
- bureaucracy
- career
- change
- Christmas
- cloud
- collaboration
- communication
- compliance
- corporate life
- data
- decisions
- delivery
- devops
- disagreement
- end user tools
- ethics
- failure
- fear
- fundamentals
- government
- Halloween
- history
- humans
- hype
- identity
- inclusion
- infrastructure
- innovation
- language
- leadership
- learning
- legacy
- management
- measurement
- mental health
- money
- networking
- New Year
- operations
- philosophy
- physics
- platforms
- prediction
- privacy
- process
- procurement
- products
- programming
- quantum
- reliability
- resilience
- risk
- science fiction
- security
- shadow IT
- space
- strategy
- talent
- teaching
- teams
- technical debt
- technology advocacy
- testing
- thinking
- transformation
- TV
- virtues
- vision
- writing
Not just someone else’s computer: to understand cloud, go back to the start
Do you believe that the cloud is just somebody else’s computer? If so, then I have to disagree with you.
I could say that this is because a computer is just a machine, whereas a cloud is a fully architected, software-defined, API-managed platform that is at least as much software as it is hardware. However, I think that we can find a more interesting answer by going back to the origins of cloud.
Most on-premises computing architectures have grown organically over decades, and include principles, patterns, components and capabilities from other eras. The code running on that mainframe at the heart of your estate may have been written before it was normal for systems to run across large numbers of machines in parallel. The processes to procure, configure and manage servers may have been created before we realized that dev and ops belong together as a shared set of accountabilities. The security measures that protect your assets may have been implemented before it was standard to connect most of your systems to a global public network.
Do you need an umbrella or a lifeboat?
How do you prepare for things that just keep on going wrong? And how do you prepare for the day when everything goes wrong?
Last week I attempted to distinguish between reliability and resilience, claiming that reliability is the ability to keep services running despite routine failures, while resilience is the ability to restore essential services despite unexpected catastrophes. But that basic definition is not quite enough to disentangle these related but distinct topics. In this article, I’ll explore three more important differences between reliability and resilience.
Reliability protects normality; resilience strives for survival
Do you know the difference between reliability and resilience?
If you want to know the difference between reliability and resilience, look to the Moon. Specifically, look at the two best-known Moon missions, Apollo 11 and Apollo 13.
Although Apollo 11 was famously successful, this was almost not the case. In the final minutes of the descent, the guidance computer crashed repeatedly, throwing error after error at the two astronauts, who waited tensely for instructions from Mission Control and wondered whether to abort the mission.
Later, it was found that the computer was receiving unexpectedly large amounts of data from one of the instruments, overloading its memory and processing capacity. Yet, despite the nerve-racking series of errors, the computer behaved exactly as designed. When overwhelmed, it displayed an error message and restarted itself, giving priority to the most important programmes.
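The behaviour described above (raise an alarm when overloaded, restart, and keep only the most important work) can be illustrated with a small Python sketch. This is not a model of the real Apollo guidance software: the `Task` type, the task names, the costs and the capacity figure are all hypothetical, chosen only to show the general idea of shedding low-priority work under overload.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    name: str
    priority: int  # lower number = more important (assumed convention)
    cost: int      # hypothetical units of capacity the task consumes


def restart_with_priorities(tasks, capacity):
    """Rebuild the task list in priority order.

    Tasks are admitted from most to least important; any task that
    does not fit in the remaining capacity is shed.
    """
    kept, shed, used = [], [], 0
    for task in sorted(tasks, key=lambda t: t.priority):
        if used + task.cost <= capacity:
            kept.append(task)
            used += task.cost
        else:
            shed.append(task)
    return kept, shed


def run_cycle(tasks, capacity):
    """Return (status, kept, shed) for one scheduling cycle."""
    total = sum(t.cost for t in tasks)
    if total <= capacity:
        # Normal operation: everything fits, nothing is shed.
        return "ok", list(tasks), []
    # Overload: signal the problem, then restart with priorities.
    kept, shed = restart_with_priorities(tasks, capacity)
    return "overload alarm", kept, shed


if __name__ == "__main__":
    # Hypothetical workload: an unexpected data flood pushes demand past capacity.
    workload = [
        Task("guidance", priority=1, cost=50),
        Task("display", priority=2, cost=30),
        Task("instrument_data", priority=3, cost=40),
    ]
    status, kept, shed = run_cycle(workload, capacity=100)
    print(status, [t.name for t in kept], [t.name for t in shed])
```

In this sketch the alarm is visible to the operator, but the essential tasks keep running, which is the sense in which a system can look alarming and still behave exactly as designed.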