Core Concepts
The AdaptIoT system proposes a novel software architecture to enable adaptive machine learning applications in cyber manufacturing environments through interactive causality-enabled self-labeling.
Abstract
The paper proposes the AdaptIoT system, a cyber manufacturing IoT platform that supports adaptive machine learning applications through interactive causality-enabled self-labeling.
The key highlights are:
AdaptIoT defines an end-to-end IoT data streaming pipeline that supports high-throughput (≥100k msg/s), low-latency (≤1 s) sensor data streaming. It also provides a standard interface for integrating various machine learning applications.
The most important feature of AdaptIoT is its inherent support for self-labeling, which manages computational models (e.g., machine learning models) to automatically execute a flexible self-labeling workflow. This allows the models to adapt and personalize to counter data distribution shifts without human intervention.
AdaptIoT incorporates a causality knowledge base to store and manage the virtual interactions among computational models for self-labeling. It employs a scalable microservice architecture that can easily integrate future capabilities such as data shift monitoring.
The authors deploy AdaptIoT in a small-scale makerspace and develop a self-labeling adaptive machine learning application, demonstrating the applicability and adaptive capabilities of the system in real-world environments.
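The self-labeling idea in the highlights above can be illustrated with a minimal sketch. It is not the authors' implementation: the class and function names (Event, CausalityKnowledgeBase, self_label), the fixed maximum lag per causal link, and the node names are all hypothetical. It also assumes that a state detected on an effect node is used as the label for earlier samples on its cause node, which the summary does not spell out.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Tuple


@dataclass(frozen=True)
class Event:
    """A timestamped output of a computational model (hypothetical schema)."""
    node: str         # name of the model/service that produced the event
    timestamp: float  # seconds
    state: str        # detected state, used here as the label


@dataclass
class CausalityKnowledgeBase:
    """Stores directed cause -> effect links, each with a maximum lag (assumed)."""
    links: Dict[Tuple[str, str], float] = field(default_factory=dict)

    def add_link(self, cause: str, effect: str, max_lag_s: float) -> None:
        self.links[(cause, effect)] = max_lag_s

    def causes_of(self, effect: str) -> List[Tuple[str, float]]:
        """Return (cause node, max lag) pairs for the given effect node."""
        return [(c, lag) for (c, e), lag in self.links.items() if e == effect]


def self_label(
    kb: CausalityKnowledgeBase,
    effect_event: Event,
    cause_samples: Dict[str, List[Tuple[float, Any]]],
) -> List[Tuple[str, Any, str]]:
    """Label cause-side samples with the state of a later effect event.

    cause_samples maps a node name to a list of (timestamp, features).
    A sample is labeled if it precedes the effect by at most the link's lag.
    """
    labeled = []
    for cause, max_lag in kb.causes_of(effect_event.node):
        for ts, features in cause_samples.get(cause, []):
            if 0.0 <= effect_event.timestamp - ts <= max_lag:
                labeled.append((cause, features, effect_event.state))
    return labeled


# Usage with made-up node names and values.
kb = CausalityKnowledgeBase()
kb.add_link("power_meter", "printer_state", max_lag_s=5.0)
samples = {"power_meter": [(10.0, [0.1]), (18.0, [0.9]), (30.0, [0.2])]}
effect = Event(node="printer_state", timestamp=20.0, state="printing")
labeled = self_label(kb, effect, samples)
```

In this example only the sample taken at 18.0 s falls within the 5 s window before the effect event, so it is the only one that receives the label. The labeled samples would then feed a retraining step, which is left out here.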
Stats
The system can support up to 13.2k time-series edge services with an average data ingestion rate of 1259 messages per second.
A single Kafka producer can achieve a maximum throughput of 182k messages per second.
Quotes
"The merit of the self-labeling method is in its ability to fully leverage the unique properties of ML applications in CPS contexts, including scenarios with rich domain knowledge, dynamic environments with time-series data and possible data shifts, and diverse environments with limited pre-allocated datasets to fulfill the needs of personalized solutions at the edge."
"To support and execute the interactive causality based self-labeling (SLB) method, especially for SMMs, the system infrastructure must support the following requirements: 1) real time timestamped data transfer of sensor, audio, and video data from from heterogeneous services and devices; 2) a causality knowledge base that manages the interaction between models to facilitate self-labeled ML between causally related nodes. 3) a core self-labeling service that connects the ML services, routes data streams, executes the self-labeling workflow, and retrains and redeploys ML models autonomously at the edge; 4) a scalable architecture to easily accommodate new edge, ML, and SLB services."