As a software architect for the Microsoft Azure cloud platform, Clemens Vasters knows the suffering involved in building very large-scale back-end systems. In this off-the-cuff talk he shares his pain, describing exactly how difficult handling back-office computation at scale really is.
Clemens’s enthusiasm for aviation was evident, not only in his choice of t-shirt but also in the example he chose to lend some historical insight to his talk at ThingMonk: IBM SAGE (Semi-Automatic Ground Environment), which he presented as the first IoT system. This state-of-the-art air defence system, built during the Cold War, illustrates the awesome computing power and physical storage needed to operate large-scale systems.
Talking about connected cars, Clemens points out that at present the difficulties are so great that either limited connectivity is offered or client populations are kept purposely low by reserving connectivity for expensive optional extras. He sees maintaining the device-to-cloud connection from roaming addresses as one of the first significant problems of scale in IoT.
The cost of connectivity is another issue. It is not enough to be able to handle a million M2M connections built up over time: in the event of a denial-of-service attack you need to be able to bring all those connections back up simultaneously.
In the end Clemens showed that operational reliability is relative in mass-scale computing, revealing that their Event Hubs are currently running at 99.99976% reliability. That sounds pretty good until you consider that they are handling 3.8 trillion transactions per week, a rate quoted as 6.8 million per second. At that volume, even a failure rate of 0.00024% means millions of failed transactions every week. Clemens claims that this level of traffic gives these distributed systems life, that they are like animals. Ultimately, even with 99.99999… accuracy, going on up to ten figures, they will still be throwing hundreds of thousands of errors. The mind absolutely boggles, and the ubiquitous ‘Would you like to send an error report?’ takes on a whole new meaning.
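To get a feel for these numbers, here is a rough back-of-the-envelope sketch in Python. It uses only the figures quoted in the talk (3.8 trillion transactions per week, 99.99976% reliability); the function names are made up for illustration, and treating "reliability" as the share of transactions that succeed is an assumption. Straight division gives an average of about 6.3 million transactions per second, a little below the 6.8 million quoted, which may reflect a peak rate or rounding. At 99.99976% the sketch yields about 9.1 million failures per week, and at 99.99999% it still yields 380,000.

```python
from decimal import Decimal

SECONDS_PER_WEEK = 7 * 24 * 60 * 60  # 604,800 seconds

# Figure quoted in the talk: 3.8 trillion transactions per week.
WEEKLY_TRANSACTIONS = 3_800_000_000_000


def average_rate_per_second(transactions_per_week: int) -> Decimal:
    """Average transactions per second, assuming traffic is spread evenly."""
    return Decimal(transactions_per_week) / SECONDS_PER_WEEK


def expected_failures(transactions: int, reliability_percent: str) -> Decimal:
    """Expected failed transactions, assuming 'reliability' means success rate.

    The percentage is passed as a string so Decimal keeps it exact.
    """
    failure_fraction = (Decimal(100) - Decimal(reliability_percent)) / Decimal(100)
    return Decimal(transactions) * failure_fraction


if __name__ == "__main__":
    rate = average_rate_per_second(WEEKLY_TRANSACTIONS)
    print(f"average rate: {int(rate):,} transactions/second")

    for reliability in ("99.99976", "99.99999"):
        failures = expected_failures(WEEKLY_TRANSACTIONS, reliability)
        print(f"{reliability}% reliable: {int(failures):,} failures/week")
```

Decimal is used instead of floats only so that subtracting a percentage very close to 100 does not pick up rounding noise.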