The paper presents a novel genetic algorithm-based approach for automating the selection of materialized views in data warehouses. The key highlights are:
Encoding: The algorithm represents potential materialized views as bits in a binary string, enabling efficient application of standard genetic operators like crossover and mutation.
Initial Population: A pilot study is conducted to evaluate a random subset of materialized view configurations, and the top-performing ones are used to seed the initial population, providing a good starting point.
Selection Function: Lexicase selection is used to choose parents, considering performance on individual test cases rather than aggregate fitness, which helps maintain population diversity.
Crossover: A localized multi-parent blend crossover technique is employed, blending only the differing genes between parents to reduce computational overhead while preserving beneficial subsequences.
Fitness Function: A customizable multi-objective fitness function is designed, allowing flexible normalization, shaping, and prioritization of the competing objectives of minimizing response time, maintenance cost, and memory usage.
Mutation: An adaptive mutation rate is used, which dynamically adjusts the mutation probability based on the population diversity, helping to balance exploration and exploitation.
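Several of the components listed above can be sketched in Python. This is a minimal illustration under stated assumptions, not the paper's implementation: the function names, the per-query error matrix `case_errors`, the diversity measure, the linear mutation schedule with its `low` and `high` bounds, and the weighted-sum fitness are all hypothetical choices made here for clarity. An individual is a list of bits in which bit i set to 1 means candidate view i is materialized.

```python
import random


def selected_views(individual, candidate_views):
    """Decode a bit string into the list of views it materializes."""
    return [view for bit, view in zip(individual, candidate_views) if bit == 1]


def lexicase_select(case_errors, rng):
    """Return the index of one parent chosen by lexicase selection.

    case_errors[i][c] is the error of individual i on test case c
    (assumed here to be something like per-query response time; lower is better).
    """
    candidates = list(range(len(case_errors)))
    cases = list(range(len(case_errors[0])))
    rng.shuffle(cases)  # a fresh random case order for every selection event
    for c in cases:
        best = min(case_errors[i][c] for i in candidates)
        # Keep only the individuals that are best on this single case.
        candidates = [i for i in candidates if case_errors[i][c] == best]
        if len(candidates) == 1:
            break
    return rng.choice(candidates)


def diversity(population):
    """Mean per-bit disagreement in [0, 1]: 0 if all individuals are identical."""
    n = len(population)
    length = len(population[0])
    total = 0.0
    for pos in range(length):
        p = sum(ind[pos] for ind in population) / n
        total += 4 * p * (1 - p)  # 0 when all agree, 1 at an even split
    return total / length


def adaptive_mutation_rate(population, low=0.01, high=0.2):
    """Assumed linear schedule: low diversity gives a high rate and vice versa."""
    return high - (high - low) * diversity(population)


def mutate(individual, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [bit ^ 1 if rng.random() < rate else bit for bit in individual]


def weighted_fitness(objectives, bounds, weights):
    """Weighted sum of min-max normalized objectives (lower is better).

    A weighted sum is only one possible scalarization; it is assumed here.
    """
    score = 0.0
    for name, value in objectives.items():
        lo, hi = bounds[name]
        norm = (value - lo) / (hi - lo) if hi > lo else 0.0
        score += weights[name] * norm
    return score
```

Lexicase selection never compares aggregate scores, so an individual that is best on only a few queries can still become a parent, which is how it helps preserve diversity. The mutation schedule raises the flip probability as the population converges, pushing the search back toward exploration.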
The approach is evaluated on the TPC-H benchmark dataset and compared against state-of-the-art materialized view selection techniques. The genetic algorithm-based framework improves average execution time by 11% and lowers total materialized view cost by 16 million relative to existing methods, indicating that it supports performant and cost-effective use of materialized views in enterprise data warehousing systems.
Key insights distilled from
by Mahdi Manavi at arxiv.org, 04-01-2024
https://arxiv.org/pdf/2403.19906.pdf