Distributed Process Discovery: EdgeAlpha Brings Process Mining to the Data Sources


Core Concepts
EdgeAlpha is a distributed algorithm for process discovery that operates directly on sensor nodes and edge devices, eliminating the need for centralized event logs and enabling scalable and privacy-preserving process mining.
Abstract
The paper introduces EdgeAlpha, a distributed algorithm for process discovery based on the Alpha Miner. EdgeAlpha operates directly on sensor nodes and edge devices, processing events as they are generated and maintaining partial footprint matrices (FMs) locally. This approach eliminates the need for centralized event logs and enables scalable and privacy-preserving process mining.

Key highlights:

EdgeAlpha tracks each event and its predecessor and successor events directly on the sensor node where the event is recorded, building a partial FM.

When a process model is requested, the partial FMs are merged at a central location to compute the final process model.

EdgeAlpha reduces communication overhead by prioritizing queries to the most frequently occurring predecessors (MFP Requesting) and by batching predecessor queries.

Experiments show that EdgeAlpha can reduce the communication for determining predecessor events by up to 96% compared to querying all network nodes, and can further reduce the average number of queried nodes per event to less than 2.5% of all nodes by batching queries.

EdgeAlpha ensures data privacy by only storing aggregated event data on the nodes and exchanging partial FMs with the central entity.
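The interplay between partial footprint matrices on the nodes and the merge at the central location can be illustrated with a small sketch. It is not the authors' implementation: the `Node` class, the `merge_footprints` function, and the two example traces are hypothetical, and replaying traces in a loop stands in for the predecessor queries that real nodes would exchange over the network. The relation symbols follow the standard Alpha Miner footprint (`->` causality, `||` parallel, `#` no direct relation).

```python
from collections import Counter


class Node:
    """Hypothetical sensor node responsible for a single activity."""

    def __init__(self, activity):
        self.activity = activity
        self.predecessors = Counter()  # predecessor activity -> count

    def record(self, predecessor):
        # Store only an aggregate per predecessor, not the raw events.
        if predecessor is not None:
            self.predecessors[predecessor] += 1

    def partial_fm(self):
        # Partial footprint: directly-follows pairs ending at this node.
        return {(p, self.activity) for p in self.predecessors}


def merge_footprints(partials):
    """Merge partial footprints into one Alpha-style footprint matrix."""
    follows = set().union(*partials)
    activities = sorted({a for pair in follows for a in pair})
    fm = {}
    for a in activities:
        for b in activities:
            ab, ba = (a, b) in follows, (b, a) in follows
            if ab and ba:
                fm[a, b] = "||"
            elif ab:
                fm[a, b] = "->"
            elif ba:
                fm[a, b] = "<-"
            else:
                fm[a, b] = "#"
    return fm


# Assumed example traces; in a deployment each node would learn its
# predecessor by querying other nodes instead of reading a trace.
traces = [["a", "b", "c", "d"], ["a", "c", "b", "d"]]
nodes = {}
for trace in traces:
    prev = None
    for act in trace:
        nodes.setdefault(act, Node(act)).record(prev)
        prev = act

fm = merge_footprints([n.partial_fm() for n in nodes.values()])
print(fm["a", "b"], fm["b", "c"], fm["a", "d"])
```

In this sketch only the sets of directly-follows pairs leave the nodes, which mirrors the idea that the central entity receives aggregates rather than individual events.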
Stats
The Hospital Log dataset contains 150,291 events, 1,143 cases, and 624 distinct activities.
The Sepsis Cases dataset contains 15,214 events, 1,050 cases, and 16 distinct activities.
The Smart Factories dataset contains 8,607 events, 271 cases, and 21 distinct activities.
The BPI Challenge 2017 dataset contains 1,202,267 events, 31,509 cases, and 26 distinct activities.
The Road Traffic Fine Management Process dataset contains 561,470 events, 150,370 cases, and 11 distinct activities.
Quotes
"EdgeAlpha enables (a) scalable mining, as a node, for each event, only interacts with its predecessors and, when queried, only exchanges aggregates, i.e., partial footprint matrices, with the central location and (b) privacy preserving process mining, as nodes only store their own as well as predecessor and successor events."

"Our dataset analysis reveals that even for datasets with a substantial number of activities, such as the Hospital Log [9] comprising more than 600 distinct activities, the average count of predecessors per activity remains modest, averaging fewer than 7 predecessors."

Key Insights Distilled From

by Julia Rossow... at arxiv.org 05-07-2024

https://arxiv.org/pdf/2405.03426.pdf
EdgeAlpha: Bringing Process Discovery to the Data Sources

Deeper Inquiries

How can EdgeAlpha be extended to handle dynamic process changes, where the most frequent predecessors of an activity may change over time?

To handle dynamic process changes, where the most frequent predecessors of an activity may shift over time, the Most-Frequent-Predecessor (MFP) Requesting strategy could be extended with a sliding window. Instead of ranking predecessors by their counts over the entire history, each node would keep only the most recent predecessor observations in a fixed-size window and rank candidates by their frequency within it. As new events arrive, old observations drop out of the window, so the query order follows the current process behavior rather than outdated patterns. Nodes could additionally weight newer observations within the window more heavily. The window size trades responsiveness against stability: a small window adapts quickly but is noisier, while a large window is more stable but slower to reflect change.
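A minimal sketch of such a sliding-window ranking is shown below. It is an illustration under stated assumptions, not part of EdgeAlpha as published: the class name `SlidingWindowMFP`, its methods, the window size of 3, and the node names are all hypothetical. The node queries candidates in the returned order until it finds the true predecessor, so a position near the front means fewer queries.

```python
from collections import Counter, deque


class SlidingWindowMFP:
    """Hypothetical MFP ranking over the most recent observations only."""

    def __init__(self, window_size):
        # deque with maxlen silently drops the oldest entry when full.
        self.recent = deque(maxlen=window_size)

    def observe(self, predecessor):
        self.recent.append(predecessor)

    def query_order(self, all_nodes):
        # Most frequent predecessors in the window first,
        # then every node not seen in the window as a fallback.
        counts = Counter(self.recent)
        ranked = [p for p, _ in counts.most_common()]
        rest = [n for n in all_nodes if n not in counts]
        return ranked + rest

    def queries_needed(self, true_predecessor, all_nodes):
        # Number of nodes queried until the true predecessor is found.
        return self.query_order(all_nodes).index(true_predecessor) + 1


all_nodes = ["a", "b", "c", "d"]  # assumed node names
mfp = SlidingWindowMFP(window_size=3)

for p in ["a", "a", "a"]:          # initially "a" is the usual predecessor
    mfp.observe(p)
order_before = mfp.query_order(all_nodes)

for p in ["b", "b", "b"]:          # the process changes: now "b" precedes
    mfp.observe(p)
order_after = mfp.query_order(all_nodes)

print(order_before, order_after)
```

With an unbounded history, "a" and "b" would be tied after these six observations; the window lets the ranking switch to "b" as soon as the old observations have been displaced.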

How could the concept of EdgeAlpha be applied to other domains beyond process mining, where distributed data processing and privacy preservation are important considerations?

The concept of EdgeAlpha, with its focus on distributed data processing and privacy preservation, can be applied to domains beyond process mining where similar considerations are crucial.

One such domain is Internet of Things (IoT) networks, where the approach could be used to analyze data in real time directly on the devices. Processing data locally on edge devices reduces the need for centralized data storage and improves privacy, which is particularly beneficial when the devices generate sensitive data.

The distributed algorithm could also be adapted for healthcare systems. Patient data would be processed locally on medical devices or edge servers, which supports privacy and security compliance and allows real-time monitoring of patient health data while maintaining confidentiality.

A third domain is smart manufacturing, where production data could be analyzed locally on manufacturing equipment or edge servers. Processing data at the edge can improve operational efficiency and reduce latency, and the distributed design allows the analysis to scale in complex manufacturing systems while keeping raw data local.

In all three domains, the decentralized approach offers the same benefits: efficient data analysis close to the source and stronger privacy protection.