TDWI Articles

Technology Trends for Streaming Analytics in 2021

To tackle 2021's biggest problems, look for streaming analytics to advance in these three key areas in the new year.

In 2020 we repeatedly observed the need to improve the complex, geographically distributed logistics systems that provide the services on which we all depend. This need has been a key driver of technology advances in streaming analytics that will make these systems more effective, efficient, and secure. Logistics systems that deliver goods, medicines, and energy, as well as transportation systems, need to avoid unexpected delays while simultaneously boosting their efficiency to lower costs. For example, delays in manufacturing ventilators and personal protective equipment and then delivering them to hospitals during the COVID-19 crisis have affected healthcare.

In 2021, it will be imperative to make sure that vaccines are distributed in a reliable, timely manner and that all parts of the distribution network (such as refrigeration) work seamlessly. Our disaster-response systems, power distribution systems, and even voting systems are repeatedly stressed by security intrusions and by increasingly numerous and violent weather events due to climate change, among many other challenges.

Here are some key areas in which we expect streaming analytics technology to advance in 2021 to help address these needs.

Trend #1: Interconnected, intelligent devices

In 2021, we expect to see a widespread expansion in the interconnection of devices within all these systems so their telemetry can be tracked and analyzed in real time. This will enable managers to do a much better job of identifying and resolving issues as they occur, as well as spotting emerging issues that require fast, strategic responses. Although IoT is still in its infancy, the need for devices to intelligently communicate vital telemetry to management systems is growing more urgent. Advances in communication technologies, such as 5G cellular networks and the Starlink array of satellites, will help catalyze this important capability.

Examples of the demand for intelligent, interconnected devices abound. For instance, the electrical power industry needs to add sensors on all power poles and other key nodes within its network. By doing so, it can spot overheating components and failures before they impact service or start forest fires. When fires do start, managers will have much better real-time information about the scope and direction of the fire, enabling quicker, more effective evacuation and fire-fighting strategies.

We are now seeing the introduction of device intelligence in smart warehouses (for example, for pallets). Interconnecting warehouse logistics systems with telematics systems for trucking fleets will help managers improve overall on-time performance and efficiency. As critical medical devices (such as ventilators) gain the intelligence to communicate their location and condition, crisis managers will be able to immediately locate the nearest devices that match urgent needs and strategically track their availability in real time.

Trend #2: Digital twins

A key enabling software technology that helps streaming analytics applications manage the torrent of incoming telemetry from interconnected devices is the "digital twin." Although digital twins have been around for many years and have been applied effectively in the product development process, applying them to the analysis of streaming telemetry is still a novel concept.

We expect that to change in 2021 because digital twins offer two compelling advantages: deeper introspection and simplified application design. When every data source in a vast, interconnected network of devices has its own digital twin, streaming analytics applications can track key contextual information about each data source to help interpret incoming telemetry. For example, one truck engine might report an elevated oil temperature that is normal because the engine is aging and due for service, whereas a newer engine suddenly reporting the same oil temperature might indicate an unexpected problem. Having context enables deeper introspection, and digital twins make that possible while providing a streamlined organizational framework for the streaming analytics application.
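The truck engine example can be made concrete with a short sketch. The following Python code is only an illustration under stated assumptions: the class name `EngineTwin`, the field names, the 120 °C threshold, and the 15,000-mile service interval are all hypothetical and do not come from any particular streaming analytics product. It shows one twin per data source, each holding the context needed to interpret the same reading differently.

```python
from dataclasses import dataclass


@dataclass
class EngineTwin:
    """Hypothetical digital twin holding context for one truck engine."""
    engine_id: str
    miles_since_service: float
    service_interval_miles: float = 15_000.0  # assumed value, for illustration
    elevated_temp_c: float = 120.0            # assumed threshold, for illustration

    def interpret(self, oil_temp_c: float) -> str:
        """Classify one oil-temperature reading using this engine's context."""
        if oil_temp_c < self.elevated_temp_c:
            return "normal"
        if self.miles_since_service >= self.service_interval_miles:
            # Elevated, but explained by what the twin knows about the engine.
            return "expected: due for service"
        return "alert: unexpected"


# One twin per data source, keyed by the device identifier.
twins = {
    "truck-17": EngineTwin("truck-17", miles_since_service=16_200.0),
    "truck-42": EngineTwin("truck-42", miles_since_service=900.0),
}


def handle_message(engine_id: str, oil_temp_c: float) -> str:
    """Route a telemetry message to the twin for its data source."""
    return twins[engine_id].interpret(oil_temp_c)
```

With these assumed values, the same 125 °C reading is classified as expected for `truck-17`, which is overdue for service, and as an alert for `truck-42`, which was serviced recently. The application code stays small because each twin carries its own state.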

Trend #3: Machine learning everywhere

As we enable the countless devices in our infrastructure to beam telemetry to our analytics systems, we expect to see the integration of machine learning into telemetry pipelines. This will dramatically boost our ability to separate signal from noise when detecting new issues and predicting future ones. Does a specific door's pattern of opening and closing within an office building or factory indicate a security threat or a normal daily occurrence? Do voltage and current changes in a node within a power transmission network fit within expected excursions for that node, or do they indicate an abnormal condition that might lead to overheating?

The key to implementing effective streaming analytics for each data source within a large population is to incorporate domain-specific machine learning algorithms that can introspect on incoming telemetry with contextual information about the data source to provide effective alerting. This can be accomplished by encapsulating machine learning algorithms within digital twins. Their combined use improves the quality of alerting over time as the digital twin receives telemetry and the algorithm learns more about the behavior of the corresponding data source. These "machine learning twins" will form crucially important components of streaming analytics platforms that track large numbers of interconnected devices.
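As a rough sketch of what encapsulating a learning algorithm within a digital twin might look like, the Python code below keeps a running mean and variance for one data source (Welford's online method) and flags readings that deviate strongly from what the twin has seen so far. This simple z-score test is a stand-in for a real domain-specific machine learning algorithm, and the class name `LearningTwin`, the threshold of 3.0, and the warm-up of 10 readings are assumptions chosen for illustration.

```python
import math


class LearningTwin:
    """Hypothetical twin that learns the normal range of one data source."""

    def __init__(self, source_id: str, z_threshold: float = 3.0, warmup: int = 10):
        self.source_id = source_id
        self.z_threshold = z_threshold  # assumed alerting threshold
        self.warmup = warmup            # readings to observe before alerting
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0                   # running sum of squared deviations

    def process(self, value: float) -> bool:
        """Return True if the reading looks anomalous for this source."""
        if self.count >= self.warmup:
            std = math.sqrt(self.m2 / (self.count - 1))
            if std > 0 and abs(value - self.mean) / std > self.z_threshold:
                # Design choice in this sketch: do not learn from readings
                # that were flagged, so outliers do not widen the baseline.
                return True
        # Welford's online update of the mean and squared deviations.
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (value - self.mean)
        return False
```

Each twin starts out silent during its warm-up and becomes more selective as it accumulates readings from its own data source, which mirrors the idea that alert quality improves over time. A production system would likely need to handle gradual drift in the baseline, which this sketch ignores.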

Machine learning twins will also provide a vital first level of filtering for the enormous volumes of incoming telemetry so that downstream analytics can aggregate their results to create an overall picture of a complex, dynamically evolving physical system. This will enable managers to evaluate curated data and make better strategic decisions that lead to more effective action. For example, managers of a geographically distributed power grid can use the results generated by machine learning twins to more accurately assess whether multiple security threats at different locations represent a coordinated attack.
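One simple way to picture this downstream aggregation is a sliding time window over the alerts that individual twins emit. The Python sketch below is a minimal illustration, not a real attack-detection method: the class name `AlertAggregator`, the five-minute window, and the three-location threshold are hypothetical, and the code assumes alerts arrive in non-decreasing timestamp order.

```python
from collections import deque


class AlertAggregator:
    """Hypothetical downstream step that combines alerts from many twins."""

    def __init__(self, window_seconds: float = 300.0, min_locations: int = 3):
        self.window_seconds = window_seconds  # assumed window length
        self.min_locations = min_locations    # assumed threshold
        self.alerts = deque()                 # (timestamp, location) pairs

    def add_alert(self, timestamp: float, location: str) -> bool:
        """Record an alert and report whether enough distinct locations
        have raised alerts within the window to warrant a closer look."""
        self.alerts.append((timestamp, location))
        # Drop alerts that have aged out of the window.
        cutoff = timestamp - self.window_seconds
        while self.alerts and self.alerts[0][0] < cutoff:
            self.alerts.popleft()
        locations = {loc for _, loc in self.alerts}
        return len(locations) >= self.min_locations
```

Because the twins have already filtered raw telemetry down to alerts, the aggregator only has to reason about a small, curated stream. A return value of `True` here means only that the pattern deserves a manager's attention, not that an attack has been confirmed.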

Having domain-specific machine learning algorithms digest incoming telemetry as it arrives from numerous interconnected devices will be an important new technology for streaming analytics as it grapples with the complexity of our logistics, security, power, disaster recovery, and other vital real-world systems.

 

About the Author

Dr. William L. Bain is the founder and CEO of ScaleOut Software. In his 40-year career, Bain has focused on parallel computing and has contributed to advancements at Bell Labs Research, Intel, and Microsoft. He has founded four software companies and holds several patents in computer architecture and distributed computing.

