Small Data Going Big in 2021 and Other Tipping Points to Watch
It's been quite a year, with many tipping points triggered by COVID-19. Here are three trends that tipped over and will grow rapidly in 2021.
- By Rado Kotorov
- December 16, 2020
Tipping points turn current tendencies into major trends that fundamentally change what or how we do things. Tipping points are easy to identify in retrospect and hard to recognize in advance. Take, for example, the sharing economy. People share all the time, but who would have thought that sharing would become the new business model?
Trend #1: Enterprises Recognize the Beauty of Small Data
In 2021, we'll see enterprises transition from learning from big data to learning from small data. This trend recently gained attention thanks to two articles -- "A radical new technique lets AI learn with practically no data" (MIT Technology Review) and "Small Data Can Play a Big Role in AI" (Harvard Business Review). After all the excitement about the potential of big data for AI, why is the focus now shifting to learning from small data?
Small data has been around much longer than big data, and companies are realizing that learning from big data is expensive and time-consuming. Machine learning and deep learning require many examples, many hours of data labeling, and significant effort to train and tune a model.
Using humans to label large volumes of data is not only labor intensive but also prone to errors, which hurts the accuracy of the results. Humans do not learn this way; we learn from just one or two examples. A person needs to see a wine glass and a coffee cup only once to tell them apart, whereas machines need thousands of examples of wine glasses and coffee cups to make the distinction. The evolutionary advantage of human beings is a brain that quickly learns to differentiate objects and decide instantly in "fight or flight" situations. To scale the adoption of AI, we need to design algorithms that learn the same way.
Furthermore, many companies do not have the volumes of historical data needed to train machines. Lean manufacturers operate at six sigma, which means only about three defects per million opportunities. Such low defect rates do not produce enough examples for machine learning. It would be more practical and cost-effective if machines could learn from a sample of one.
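As a minimal sketch of how learning from a sample of one might look, consider a nearest-prototype classifier: store a single example per class and label a new object by whichever prototype it is closest to. The feature vectors and class names below are invented for illustration; real one-shot systems would compare learned embeddings of images, not hand-picked measurements.

```python
import math

def distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def one_shot_classify(prototypes, sample):
    """Label a sample by the single closest stored prototype."""
    return min(prototypes, key=lambda label: distance(prototypes[label], sample))

# One example each: crude, hypothetical [height_cm, rim_diameter_cm, has_stem]
# features standing in for what would normally be learned embeddings.
prototypes = {
    "wine glass": [20.0, 6.0, 1.0],
    "coffee cup": [9.0, 8.0, 0.0],
}

print(one_shot_classify(prototypes, [18.0, 5.5, 1.0]))  # wine glass
print(one_shot_classify(prototypes, [9.5, 7.5, 0.0]))   # coffee cup
```

The point of the sketch is the data requirement, not the algorithm: one labeled example per class is enough, whereas a conventional deep model would need thousands.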
Trend #2: Remote Monitoring Becomes Popular
The technology for remote monitoring has existed for some time, but the tipping point occurred in 2020 because of COVID-19. As societies try to minimize exposure to the virus, more and more jobs are being performed remotely. We have already seen a huge move toward robotic process automation, driven by cost savings.
However, remote monitoring is different: a human remains the ultimate decision maker. It can be applied in every industry, from energy generation to patient diagnosis and monitoring, and it depends on data and analytics. For example, to deliver remote cardiac diagnosis and monitoring at scale, we need to empower medical professionals trained in understanding heartbeats -- but not trained in data science -- to analyze and configure personalized predictive models on each patient's wearable device.
This trend is closely related to the first trend of learning from small data. Every patient's heartbeat patterns are unique, which means every patient is a sample of one and the predictive monitoring model must be unique to that patient. A recent research paper from Trendalyze ("A Personalized Monitoring Model for Electrocardiogram (ECG) Signals: Diagnostic Accuracy Study") explains how a physician can leverage a very small sample of heartbeats, captured while configuring the wearable device, to implement a highly accurate, personalized artificial logical network that predicts deterioration in the patient's condition and alerts physicians.
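The cited study should be consulted for the actual method; purely as a hedged sketch of the general idea of per-patient calibration, one could average a handful of "normal" beats recorded at configuration time into a personal template and alert whenever a new beat drifts too far from it. All signal values and the tolerance factor below are assumptions made for illustration, not parameters from the paper.

```python
import math

def euclid(a, b):
    """Euclidean distance between two equal-length signals."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def calibrate(beats):
    """Average a small sample of beats into a personal template and
    derive an alert threshold from their spread. The 2x tolerance
    factor is an arbitrary assumption for this sketch."""
    n = len(beats)
    template = [sum(vals) / n for vals in zip(*beats)]
    spread = max(euclid(b, template) for b in beats)
    return template, 2.0 * spread

def is_anomalous(beat, template, threshold):
    """Flag a beat that deviates from the personal template."""
    return euclid(beat, template) > threshold

# Three toy calibration beats (invented amplitude samples) for one patient.
normal = [[0.0, 1.0, 0.2], [0.1, 0.9, 0.2], [0.0, 1.1, 0.3]]
template, threshold = calibrate(normal)
print(is_anomalous([0.0, 1.0, 0.25], template, threshold))  # False (similar beat)
print(is_anomalous([0.9, 0.1, 0.9], template, threshold))   # True (deviant beat)
```

Because both the template and the threshold come from the patient's own beats, the monitor is personalized by construction -- exactly the sense in which each patient is "a sample of one."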
Trend #3: Granular Time-Series Data Becomes More Important
With so many sensors installed, it is no wonder that time-series data is growing so swiftly. Some analysts estimate that companies collect nine times more granular sensor data than business data. With the arrival of 5G, granular data delivered in real time over high-speed networks creates opportunities in smart manufacturing, digital health, high-frequency transactional systems, and much more.
Granular data is so important because it has diagnostic qualities that allow us to detect issues instantly, predict future events, automate decisions, and trigger immediate actions. For example, if you monitor a patient's pulse in one-second increments, you won't see a heart attack; at 180-millisecond increments, the shape of the heart attack becomes obvious to the human eye. Devices that monitor industrial machines and processes, vehicle engines, and electric batteries produce the same kind of granular signals today, but the data has been underutilized because analyzing such vast volumes is hard. Finding interesting patterns and assigning meaning to them is like finding gold nuggets: the monetization opportunities are endless because meaningful patterns trigger instant actions that can save lives, prevent machine failures, and capitalize on market opportunities.
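One way to picture pattern detection in granular time series is a brute-force sliding-window search: compare a known pattern against every window of the stream and report the offsets where the distance falls within a tolerance. Commercial platforms use far more efficient indexed and motif-discovery techniques; this sketch, with made-up data, only illustrates the idea.

```python
import math

def find_pattern(series, pattern, tol):
    """Return every offset where the series matches the pattern
    within Euclidean distance `tol` (naive O(n*m) scan)."""
    m = len(pattern)
    hits = []
    for i in range(len(series) - m + 1):
        window = series[i:i + m]
        dist = math.sqrt(sum((w - p) ** 2 for w, p in zip(window, pattern)))
        if dist <= tol:
            hits.append(i)
    return hits

# A spike-like pattern embedded twice in a toy stream of readings.
pattern = [0.0, 1.0, 0.0]
series = [0.0, 0.0, 1.0, 0.0, 0.5, 0.4, 0.0, 1.0, 0.0, 0.2]
print(find_pattern(series, pattern, tol=0.1))  # [1, 6]
```

Each hit is a candidate "gold nugget": once the pattern's meaning is known (a failing bearing, an arrhythmia, a trading signal), detecting it in a live stream can trigger an immediate action.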
These meaningful patterns are like Google's ad words: once detected, they are instantly monetizable. Today companies are looking for new types of analytics platforms and algorithms to extract such gold nuggets, build libraries of known patterns, and monitor for them in real time. Hence, new platforms with new methods of analysis are emerging in this area. Companies such as Trendalyze (where I serve as CEO), TrendMiner, Seeq, and Anodot are defining this space and becoming essential new tools in the hands of domain experts and business analysts.
Dr. Rado Kotorov is the CEO of Trendalyze.com. In his career, Dr. Kotorov has created numerous analytics technologies and solutions and has published extensively on using data and analytics for competitive advantage. You can reach the author via email.