Core Concepts
LVNS-RAVE combines Generative Deep Learning and Evolutionary Algorithms to produce realistic and novel audio samples. The RAVE model serves as the sound generator, and the VGGish model serves as the novelty evaluator in the Latent Vector Novelty Search (LVNS) algorithm.
Abstract
The paper proposes the LVNS-RAVE method, which combines the strengths of Generative Deep Learning and Evolutionary Algorithms to generate realistic and diversified audio samples.
The key aspects are:
The RAVE model is used as the sound generator, which can produce high-quality audio outputs. The latent vectors of RAVE are used as the genotypes for the evolutionary process.
The Novelty Search algorithm is used to evolve the latent vectors, with the goal of generating diverse and novel audio samples. The VGGish model is used as the novelty evaluator, providing a perceptual distance metric between audio samples.
The evolutionary process applies crossover and mutation to the RAVE latent vectors in order to maximize the sparseness (novelty) of the generated samples within the container.
Experiments were conducted using three different pre-trained RAVE models (vintage, darbouka_onnx, and VCTK) and four different setups, demonstrating the flexibility and effectiveness of the LVNS-RAVE method in generating diverse and high-quality audio samples.
The results show that the LVNS-RAVE method can successfully generate diversified, novel audio samples under different mutation setups and pre-trained RAVE models. The characteristics of the generation process can be easily controlled with the mutation parameters, making it a promising creative tool for sound artists and musicians.
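The evolutionary loop summarized above (latent vectors as genotypes, crossover and mutation, sparseness as the novelty score) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's implementation. Sparseness is assumed to be the mean distance to the k nearest neighbours in embedding space, as in standard Novelty Search. The uniform crossover, Gaussian mutation, survivor selection, and all parameter names (rate, scale, k, pop_size) are hypothetical choices. toy_decode and toy_embed are stand-ins for the RAVE decoder and the VGGish embedding, so that the sketch runs with NumPy alone.

```python
import numpy as np


def sparseness(embedding, others, k=3):
    """Mean Euclidean distance from `embedding` to its k nearest rows in `others`."""
    dists = np.linalg.norm(others - embedding, axis=1)
    k = min(k, len(dists))
    return float(np.sort(dists)[:k].mean())


def crossover(a, b, rng):
    """Uniform crossover (assumed): each latent dimension comes from one parent."""
    mask = rng.random(a.shape) < 0.5
    return np.where(mask, a, b)


def mutate(z, rng, rate=0.2, scale=0.5):
    """Gaussian mutation (assumed): perturb each dimension with probability `rate`."""
    mask = rng.random(z.shape) < rate
    return z + mask * rng.normal(0.0, scale, size=z.shape)


def toy_decode(z):
    """Hypothetical stand-in for the RAVE decoder: latent vector -> short waveform."""
    t = np.linspace(0.0, 1.0, 256, endpoint=False)
    harmonics = np.arange(1, len(z) + 1)
    basis = np.sin(2.0 * np.pi * np.outer(harmonics, t))
    return z @ basis


def toy_embed(audio):
    """Hypothetical stand-in for VGGish: low-frequency magnitude spectrum."""
    return np.abs(np.fft.rfft(audio))[:16]


def novelty_search(decode, embed, latent_dim=8, pop_size=16,
                   generations=10, k=3, seed=0):
    """Evolve latent vectors, keeping the candidates with the highest sparseness."""
    rng = np.random.default_rng(seed)
    population = rng.normal(size=(pop_size, latent_dim))
    for _ in range(generations):
        # Breed one child per slot from two distinct random parents.
        children = []
        for _ in range(pop_size):
            i, j = rng.choice(pop_size, size=2, replace=False)
            child = crossover(population[i], population[j], rng)
            children.append(mutate(child, rng))
        candidates = np.vstack([population, np.array(children)])

        # Score every candidate by its sparseness among all other candidates.
        embeddings = np.array([embed(decode(z)) for z in candidates])
        scores = np.array([
            sparseness(embeddings[n], np.delete(embeddings, n, axis=0), k)
            for n in range(len(candidates))
        ])

        # Survivor selection (assumed): keep the pop_size most novel candidates.
        keep = np.argsort(scores)[-pop_size:]
        population = candidates[keep]
    return population


if __name__ == "__main__":
    final_population = novelty_search(toy_decode, toy_embed)
    print(final_population.shape)
```

In a real setup, decode would wrap a pre-trained RAVE model and embed would wrap VGGish. The rate and scale arguments of mutate play the role of the mutation parameters that the summary describes as controlling the character of the generation process.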