7 am in the morning. Some notifications in the mailbox and on Slack.
Big demo today and part of the system had gone down.
And had gone back up too. With no human intervention.
On closer examination, there were some wrinkles pending (we’ve learnt of them today). But on the whole, the system had responded great to an unplanned problem.
Yes, I know for sure that it’s nothing extraordinary. That’s what the cloud, availability zones, auto-scaling groups, clusters, or containers are for.
And yes, there are teams much more advanced on that. I cannot even compare to Amazon, Google or Netflix.
But software that anticipates and reacts to problems is now commonplace.
So now take a step back and think for a moment.
We’re building software that fixes itself.
First published here