Reasoning first from principles: our journey towards building the most advanced edge intelligence platform for Industrial IoT
By Abhi Sharma, Head of Analytics and Machine Learning, FogHorn Systems
FogHorn has won numerous awards and recognition from leading technology analysts and firms like Gartner and Frost & Sullivan for its pioneering vision and product superiority. Why is that? Well, it’s because we are pushing the technology envelope much further than any other company in the Industrial IoT space. At the very core of our vision is enabling Edge Analytics -> Edge Machine Learning -> Edge AI, with the ultimate goal of creating a truly ubiquitous distributed intelligence platform that lives within edge devices, machines and gateways—closer to the source of data rather than in a centralized cloud. All this while still seamlessly exchanging newly learned model weights, metadata and insights with the cloud as needed. Finally, the key to our product’s success is that FogHorn achieves these technical advances while keeping the product very operational technologist (OT) centric: it is actually usable and easily programmable by domain experts across many different industries and domains.
As the IoT paradigm gains stronger roots in the market (it’s not just a buzzword anymore), devices are getting smarter and we are seeing an exponential rise in available sensor data of various kinds. That essentially means data processing needs are shifting significantly too. The edge where the data is being produced needs to get much more sophisticated, yet still remain lightweight and nimble. Therefore, the solution is not as trivial as bringing the centralized cloud computing technologies designed for powerful distributed data centers down to the edge, or over-simplifying edge computing needs by offering merely data collection and forwarding to the cloud. The essence of getting this next-generation intelligence paradigm right lies in thinking about the product from first principles and then engineering that into it.
If you look at IIoT industries like manufacturing, oil and gas, mining, transportation, healthcare, power, water, renewables, smart cities/airports, etc., all of them contribute to billions of dollars of business. Now look at all these businesses from a technology-focused lens and you will quickly notice that they have not adopted cloud computing and the modern analytics and machine learning platforms like some other IT-focused companies across the world. Well, why is that? Obviously it is no secret that being able to leverage all the data that gets produced in these industries can lead to massive business outcomes, efficiency and savings. Given all this, why can’t these industries simply take advantage of the technologies used at companies like Google, Facebook and Amazon, where the whole competitive landscape relies on data-driven insights? The answer lies in thinking about the problem from fundamental grassroots-level truths and business constraints, and thus the missing products and tools. At FogHorn, we have been relentlessly reasoning from these basic first principles to create the most advanced edge intelligence platform and precisely solve this problem for the IIoT sector. This is the first segment of a four-part blog series where I review these core fundamental grassroots truths and business constraints for IIoT and showcase how we engineered an advanced edge intelligence platform that fits right in.
Bandwidth, storage costs, latency and compute requirements don’t allow for a centralized cloud processing model for IoT.
A single elevator instrumented with just a few sensors generates about 2.6 GB of raw data per day. An average-sized modern manufacturing plant can generate up to 1 TB of raw data in just half a week. If you think of camera feeds as streams of data, the numbers are even more mind-boggling. I can go on with endless examples, but if you add these numbers up over a month, and then a year, the costs of bandwidth and storage cannot possibly justify a centralized cloud model. With the ever-increasing rate of data streams, it is cost prohibitive to move all of it to the cloud. Not to mention, a large percentage of raw data is not very useful anyway.
On top of the bandwidth and storage problem, many use cases demand low latencies for complex, non-trivial analytics. More often than not, this makes the round trip to the cloud prohibitive, both from a cost and a response-time perspective. If my million-dollar machinery crashes, immediately affecting yields, it is too late for me to learn about it later in the day. And unless a surgery room and its infrastructure are perfectly maintained, monitored and orchestrated in real time, human lives will be negatively affected.
The answer is obvious: if the business insights are all going to be data-driven across various sources of real-time data, then there is an unprecedented need to have a system for intelligence right at the source where the data is being generated.
Engineering the platform
Before data can be processed, the obvious first question is: how do we ingest the multitude of data from the various heterogeneous sources that are producing it at very high rates? To process the data, we need to ingest it efficiently (catering to the memory footprint constraints of the edge) and ingest it fast (catering to latency constraints) in order to offer a singular, accessible view of all the continuous sensor streams.
Understanding the above, we designed most of our data ingestion for standard IoT protocols (like OPC UA, MODBUS, MQTT, etc.) and video/audio, with efficient wire encoding/decoding strategies and data sharing/exchange technology built from scratch in C++. Our super-thin, uber-fast data broker perfectly balances speed and reliability in the context of reactive stream processing software, with well-defined APIs and encoding/decoding mechanisms. This offers a seamless way to tap into live streaming data and combine it in interesting ways (also referred to as sensor fusion) to run applications and non-trivial analytics right at the edge.
The explicit intent in getting into these layers of system implementation was to carefully hand-craft the platform for low-latency data multiplexing even at high throughput rates. This involves careful high-performance programming with explicit awareness of how the code interacts with memory and CPU caches, using special lock-free programming techniques and data structures for optimal performance. The typical well-known heavyweight brokers from the cloud-computing environment cannot possibly fit into the footprint and latency constraints of IoT environments.
Without a strong foundational data plane it’s almost impossible to satisfy the bandwidth, storage and latency requirements at the edge for various target hardware platforms. Therefore we engineered this into the platform first. This allows us to ingest data from various protocols and of various kinds and to seamlessly offer it up for analytics and machine learning.
We continuously conduct performance benchmarks with thousands of sensors, publishing as fast as every few milliseconds, which comfortably covers most use cases at the edge.
To summarize, in this blog post we talked about the fundamental value of edge computing as a paradigm, the necessity to ingest and process high-speed heterogeneous data right at the edge, and why current cloud-based technologies will falter if they are retrofitted as edge solutions.
Next Segment—Part 2
In the upcoming part two of this series, learn how (once the data is ingested) FogHorn enables analytics right at the edge by offering full-blown complex event processing (CEP) and advanced analytics capabilities that are actually usable by anyone with an IoT problem statement.