The paper presents a novel genetic algorithm-based approach for automating the selection of materialized views in data warehouses. The key highlights are:
Encoding: The algorithm represents potential materialized views as bits in a binary string, enabling efficient application of standard genetic operators like crossover and mutation.
Initial Population: A pilot study is conducted to evaluate a random subset of materialized view configurations, and the top-performing ones are used to seed the initial population, providing a good starting point.
Selection Function: Lexicase selection is used to choose parents, considering performance on individual test cases rather than aggregate fitness, which helps maintain population diversity.
Crossover: A localized multi-parent blend crossover technique is employed, blending only the differing genes between parents to reduce computational overhead while preserving beneficial subsequences.
Fitness Function: A customizable multi-objective fitness function is designed, allowing flexible normalization, shaping, and prioritization of the competing objectives of minimizing response time, maintenance cost, and memory usage.
Mutation: An adaptive mutation rate is used, which dynamically adjusts the mutation probability based on the population diversity, helping to balance exploration and exploitation.
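As a rough illustration of how the components above could fit together, here is a minimal Python sketch. It is not the paper's implementation: every function name, the weighted-sum fitness, the diversity measure, the mutation-rate bounds, and the direction of the adaptation (lower diversity gives a higher mutation rate) are assumptions made for this example. The crossover shown is a simplified stand-in that resamples only the positions where parents disagree, not the paper's exact blend operator.

```python
import random

def encode(selected, num_views):
    """Bit i is 1 when candidate view i is materialized (hypothetical helper)."""
    return [1 if i in selected else 0 for i in range(num_views)]

def weighted_fitness(objectives, weights):
    """Assumed scalarization: weighted sum of already-normalized objectives
    (response time, maintenance cost, memory); lower is better."""
    return sum(w * o for w, o in zip(weights, objectives))

def seed_population(samples, cost, k):
    """Pilot-study seeding: keep the k lowest-cost sampled configurations."""
    return sorted(samples, key=cost)[:k]

def lexicase_select(population, case_errors, rng):
    """Pick one parent by filtering on test cases in random order.
    case_errors[i][c] is the error of individual i on case c."""
    candidates = list(range(len(population)))
    cases = list(range(len(case_errors[0])))
    rng.shuffle(cases)
    for c in cases:
        best = min(case_errors[i][c] for i in candidates)
        candidates = [i for i in candidates if case_errors[i][c] == best]
        if len(candidates) == 1:
            break
    return population[rng.choice(candidates)]

def crossover_differing(parents, rng):
    """Copy genes all parents agree on; resample only the differing positions."""
    child = []
    for genes in zip(*parents):
        if len(set(genes)) == 1:
            child.append(genes[0])
        else:
            child.append(rng.choice(genes))
    return child

def diversity(population):
    """Assumed measure: mean of 4*p*(1-p) per gene, where p is the share of 1s.
    Returns 0.0 for identical individuals and 1.0 for an even split on every gene."""
    n = len(population)
    length = len(population[0])
    total = 0.0
    for j in range(length):
        p = sum(ind[j] for ind in population) / n
        total += 4 * p * (1 - p)
    return total / length

def adaptive_rate(div, low=0.01, high=0.2):
    """Assumed rule: mutate more when diversity is low (bounds are illustrative)."""
    return low + (high - low) * (1 - div)

def mutate(individual, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [1 - g if rng.random() < rate else g for g in individual]

if __name__ == "__main__":
    rng = random.Random(42)
    # Made-up toy data: 4 individuals over 6 candidate views, 3 test queries.
    population = [encode(s, 6) for s in ({0, 1}, {1, 2, 5}, {3}, {0, 4, 5})]
    case_errors = [[3, 1, 2], [1, 2, 2], [2, 2, 1], [1, 1, 3]]
    parents = [lexicase_select(population, case_errors, rng) for _ in range(3)]
    child = crossover_differing(parents, rng)
    child = mutate(child, adaptive_rate(diversity(population)), rng)
    print(child)
```

In this sketch, lexicase selection never aggregates the per-query errors, so a configuration that is best on a single query can still be chosen as a parent, which is the diversity-preserving property the summary attributes to it.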
The proposed approach is evaluated on the TPC-H benchmark dataset. According to the reported results, the genetic algorithm-based framework improves on state-of-the-art materialized view selection techniques by 11% in average execution time and by 16 million in total materialized view cost, which supports its use for performant, cost-effective materialized view selection in enterprise data warehousing systems.
Key insights extracted from the paper by Mahdi Manavi at arxiv.org, 04-01-2024
https://arxiv.org/pdf/2403.19906.pdf