Key Concepts
This paper introduces a novel approach for identifying frequent subgraph patterns by combining two frequently occurring smaller subgraph patterns, and proposes a new metric based on Maximal Independent Sets to efficiently enumerate pattern graphs within a data graph.
Summary
The paper presents the FLEXIS framework for frequent subgraph mining, which makes the following key contributions:
- Generation Step:
  - FLEXIS generates candidate k-vertex patterns by merging two frequently occurring (k-1)-vertex patterns. The paper reports that this is more efficient than existing methods, which grow a pattern by one edge or one vertex at a time.
  - The merging process addresses challenges such as preserving meaningful connectivity, handling vertex and edge labels, and ensuring that merged patterns are unique.
- Metric Step:
  - FLEXIS introduces a new metric, mIS, based on maximal independent sets. It retains the accuracy of the gold-standard MIS (maximum independent set) metric while offering computation times comparable to the approximate MNI (minimum image-based) metric.
  - mIS lets users control the trade-off between accuracy and processing time by adjusting a slider parameter, so the metric can be tailored to the needs of different applications.
- Experimental Evaluation:
  - FLEXIS achieves an average 10.58x speedup compared to GraMi and an average 3x speedup compared to T-FSM, while maintaining comparable or better accuracy.
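The merge-based generation step can be illustrated with a minimal Python sketch. It is not the FLEXIS implementation: the pattern representation, the function names (`merge_patterns`, `is_connected`, `canonical_form`), and the simplifying assumption that the two input patterns already use the same vertex ids for their shared core are all hypothetical choices made for illustration. It only shows the three concerns named above: connectivity, labels, and uniqueness.

```python
from itertools import permutations

# Assumed representation (illustrative only): a pattern is (labels, edges),
# where labels maps vertex id -> label and edges is a set of
# frozenset({u, v}) pairs (undirected, unlabeled edges).

def is_connected(pattern):
    """Return True if every vertex is reachable from an arbitrary start vertex."""
    labels, edges = pattern
    if not labels:
        return False
    start = next(iter(labels))
    seen, stack = {start}, [start]
    while stack:
        x = stack.pop()
        for e in edges:
            if x in e:
                for y in e:
                    if y not in seen:
                        seen.add(y)
                        stack.append(y)
    return len(seen) == len(labels)

def merge_patterns(p1, p2):
    """Merge two (k-1)-vertex patterns that agree on a shared (k-2)-vertex
    core (same ids, labels, and core edges) into connected k-vertex candidates."""
    labels1, edges1 = p1
    labels2, edges2 = p2
    core = set(labels1) & set(labels2)
    only1 = set(labels1) - core
    only2 = set(labels2) - core
    # Each pattern must contribute exactly one vertex outside the core.
    if len(only1) != 1 or len(only2) != 1:
        return []
    # Labels and edges inside the core must match in both patterns.
    if any(labels1[v] != labels2[v] for v in core):
        return []
    if {e for e in edges1 if e <= core} != {e for e in edges2 if e <= core}:
        return []
    labels = {**labels1, **labels2}
    base = set(edges1) | set(edges2)
    (u,), (v,) = only1, only2
    # Two candidates: without and with an edge between the two non-core vertices.
    candidates = [
        (labels, frozenset(base)),
        (labels, frozenset(base | {frozenset({u, v})})),
    ]
    return [c for c in candidates if is_connected(c)]

def canonical_form(pattern):
    """Brute-force canonical key (smallest relabeling); practical only for
    tiny patterns, used here to detect duplicate candidates."""
    labels, edges = pattern
    vertices = sorted(labels)
    best = None
    for perm in permutations(range(len(vertices))):
        relabel = dict(zip(vertices, perm))
        lab = tuple(l for _, l in sorted((relabel[v], labels[v]) for v in vertices))
        edg = tuple(sorted(tuple(sorted(relabel[x] for x in e)) for e in edges))
        key = (lab, edg)
        if best is None or key < best:
            best = key
    return best

# Example: merge the 2-vertex patterns A-B and B-C, which share vertex 1 (label "B").
p1 = ({0: "A", 1: "B"}, {frozenset({0, 1})})
p2 = ({1: "B", 2: "C"}, {frozenset({1, 2})})
candidates = merge_patterns(p1, p2)
unique = {canonical_form(c) for c in candidates}
```

Here the merge yields a path A-B-C and a triangle. A real miner would also need to find the shared core via isomorphism rather than assume aligned ids, and would use a cheaper canonical labeling than exhaustive permutation.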
The paper first provides background on graph mining concepts and existing metrics. It then details the FLEXIS approach, including the candidate pattern generation and the mIS metric. Finally, it presents extensive experimental results demonstrating the efficiency and effectiveness of the proposed method.
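To make the difference between the support metrics concrete, the following Python sketch compares three ways of counting a pattern's occurrences from a list of embeddings: MNI, a greedily built maximal independent set, and an exact maximum independent set. It is a sketch under stated assumptions, not the paper's mIS algorithm: the function names and the embedding format (one tuple of data-graph vertices per embedding) are hypothetical, two embeddings are treated as overlapping when they share any data vertex, and the slider parameter is not modeled.

```python
from itertools import combinations

def mni_support(embeddings):
    """Minimum image-based support: the smallest number of distinct data
    vertices that any single pattern vertex position is mapped to."""
    if not embeddings:
        return 0
    k = len(embeddings[0])
    return min(len({emb[i] for emb in embeddings}) for i in range(k))

def greedy_maximal_independent_set(embeddings):
    """Pick embeddings in order, skipping any that share a data vertex with
    one already chosen. The result is maximal (nothing more can be added)
    but not necessarily maximum, and it depends on the input order."""
    used = set()
    chosen = []
    for emb in embeddings:
        if used.isdisjoint(emb):
            chosen.append(emb)
            used.update(emb)
    return chosen

def maximum_independent_set_size(embeddings):
    """Exact size of the largest set of pairwise vertex-disjoint embeddings.
    Exponential time; intended only for tiny illustrative inputs."""
    for r in range(len(embeddings), 0, -1):
        for subset in combinations(embeddings, r):
            verts = [v for emb in subset for v in emb]
            if len(verts) == len(set(verts)):
                return r
    return 0

# Hypothetical embeddings of a 2-vertex pattern along the data path 1-2-3-4.
embeddings = [(2, 3), (1, 2), (3, 4)]
mni = mni_support(embeddings)
greedy = greedy_maximal_independent_set(embeddings)
exact = maximum_independent_set_size(embeddings)
```

On this input the three counts differ, and they satisfy greedy maximal size <= maximum size <= MNI. The last inequality always holds because vertex-disjoint embeddings give every pattern vertex position a distinct image. The gap between the greedy and exact values is the kind of accuracy-versus-time trade-off the summary describes.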