Big data! Fast data! Real-time analytics! These are buzzwords commonly associated with platform offerings around IoT.
Although the law of large numbers always applies, just because you can deploy more sensors doesn't automatically mean that you should. After all, they cost money and bandwidth, and can be a pain to maintain.
A regular speaker at Thingmonk, Boris Adryan has a long-standing interest in big data analytics and machine learning. He presented a logarithmic history of his progression: from the academic who led a method development group at the University of Cambridge, believing it was important to build the best IoT ontology that money could buy; then the consultant and hired hand for IoT companies, who complained that if you spent money on machine learning it had better not be rude or annoying; to corporate Boris, who did his own personal Brexit this year, joining Zühlke to advise on optimal sensor deployment patterns and analytics strategies to maximise profits.
Boris argued that although statisticians and data scientists LOVE larger sample sizes, sampling costs time and resources, so we need a compromise. It seems that many participants are worried about the upfront investment for an industrial IoT solution. So Boris brought a few recommendations from the trenches, and insights from a project with OpenSensors, on ways to cut down on hardware and software costs and thereby sweeten IoT for the customer.
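To make the diminishing returns concrete: the standard error of a mean estimate shrinks only with the square root of the sample size, so each extra sensor buys less precision than the last one. A minimal sketch, with entirely hypothetical noise and cost figures, illustrates the trade-off:

```python
import math

# Hypothetical figures for illustration only.
sigma = 2.0           # assumed standard deviation of a single sensor reading
cost_per_sensor = 50  # assumed cost per deployed sensor (arbitrary units)

# Quadrupling the number of sensors only halves the standard error,
# while the deployment cost quadruples.
for n in [10, 40, 160, 640]:
    standard_error = sigma / math.sqrt(n)
    print(f"{n:4d} sensors, cost {n * cost_per_sensor:6d}: "
          f"standard error {standard_error:.3f}")
```

At some point the next batch of sensors costs more than the extra precision is worth to the business, and that is where the compromise sits.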
He referred to the example of the Westminster Parking Trial, which demonstrated how analytics on preliminary survey data could have reduced the number of deployed sensors significantly. A similar logic goes for fast and real-time analytics. While these are advertised as killer features, many people new to IoT and analytics are not even aware that they might get away with batch processing. Using the example of flying a drone, he discussed the appropriate use cases for edge processing on the drone itself, stream or micro-batch analytics as data arrives at the platform, and work on batched data stored in a database.
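As a sketch of how modest the batch option can be, suppose the parking readings land in a plain SQLite table (the table name and schema here are hypothetical, not the actual trial's setup): a scheduled query answers the occupancy question with no streaming infrastructure at all.

```python
import sqlite3
from datetime import datetime, timedelta

def hourly_occupancy(db_path: str, since_days: int = 7):
    """Average bay occupancy by hour of day over the last `since_days` days."""
    cutoff = (datetime.utcnow() - timedelta(days=since_days)).isoformat()
    conn = sqlite3.connect(db_path)
    rows = conn.execute(
        """
        SELECT strftime('%H', timestamp) AS hour,
               AVG(occupied)             AS avg_occupancy  -- occupied is 0/1
        FROM parking_readings                              -- hypothetical table
        WHERE timestamp >= ?
        GROUP BY hour
        ORDER BY hour
        """,
        (cutoff,),
    ).fetchall()
    conn.close()
    return rows

# Run once a night from cron; no message broker or stream processor needed.
for hour, occ in hourly_occupancy("sensors.db"):
    print(f"{hour}:00  average occupancy {occ:.0%}")
```

Only when the answer must be fresher than the batch interval does a streaming pipeline start to earn its operational cost.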
He concluded with an unabashed criticism of the unreflected use of deep learning when other methods would be more appropriate, reminding us that faster analytics on bigger and better hardware are not automatically the most useful solution, and that a good understanding of the type of insight required by the business model is essential.
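In that spirit, here is a hedged sketch of the baseline-first habit the talk points towards (synthetic data and scikit-learn, purely illustrative): fit the cheap model before the deep one, and let the gap between them justify any extra complexity.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for a sensor classification task.
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Cheap, explainable baseline first...
baseline = LogisticRegression(max_iter=1000).fit(X_train, y_train)
# ...then the more expensive model.
deep = MLPClassifier(hidden_layer_sizes=(64, 64), max_iter=500,
                     random_state=0).fit(X_train, y_train)

print(f"logistic regression: {baseline.score(X_test, y_test):.3f}")
print(f"small neural net:    {deep.score(X_test, y_test):.3f}")
# If the two scores are close, the simpler, cheaper, more explainable
# model is usually the better engineering choice.
```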