Core Concepts
Large language models, when fine-tuned on extensive datasets, can achieve competitive performance in high-dimensional regression tasks for predicting electromagnetic spectra of metamaterials.
Abstract
This study investigates the potential of large language models (LLMs), such as ChatGPT, in predicting the electromagnetic spectra of metamaterials. The key findings are:
LLMs, when fine-tuned on large datasets (e.g., 40,000 samples), can outperform conventional machine learning approaches, including deep neural networks, achieving a lower Mean Absolute Relative Error (MARE) across all dataset sizes explored.
The performance of the fine-tuned LLM (FT-LLM) improves significantly as the dataset size increases, narrowing the gap with neural networks in terms of Mean Squared Error (MSE). This suggests that LLMs can effectively leverage extensive training data to capture complex patterns in the geometry-spectrum relationship.
The impact of temperature settings on the LLM's performance depends on the dataset size. In data-constrained scenarios, moderate randomness can improve output quality, while in data-rich environments, lower temperature settings are more conducive to minimizing the MSE.
The prompt design, whether using a concise vector representation or a detailed textual description, does not significantly influence the predictive accuracy of the fine-tuned LLM.
While the FT-LLM demonstrates promising results in forward prediction, its performance in inverse design tasks remains limited. The model often generates physically implausible or invalid outputs when asked to design metamaterial geometries to achieve a desired spectrum.
Overall, this study highlights the potential of LLMs as powerful tools for scientific exploration, particularly in the domain of metamaterials research. The findings suggest that fine-tuning LLMs on large datasets can enable them to grasp the nuances of the physics underlying metamaterial systems, making them valuable for tasks such as forward prediction. However, further research is needed to address the challenges in leveraging LLMs for inverse design problems.
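The two error metrics cited in the findings above, MARE and MSE, can be computed as follows. This is a minimal NumPy sketch over a predicted spectrum; the paper's exact normalization may differ, and the example spectra are illustrative values, not data from the study.

```python
import numpy as np

def mare(y_true, y_pred, eps=1e-12):
    """Mean Absolute Relative Error: mean of |pred - true| / |true|."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    # eps guards against division by zero where the true spectrum is 0
    return float(np.mean(np.abs(y_pred - y_true) / (np.abs(y_true) + eps)))

def mse(y_true, y_pred):
    """Mean Squared Error over all spectral points."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_pred - y_true) ** 2))

# Illustrative ground-truth vs. predicted spectrum (arbitrary values)
true_spec = np.array([0.2, 0.5, 0.9, 0.4])
pred_spec = np.array([0.25, 0.45, 0.85, 0.5])
print(mare(true_spec, pred_spec))  # relative error, scale-invariant
print(mse(true_spec, pred_spec))   # squared error, scale-dependent
```

Because MARE normalizes by the true value while MSE does not, the two metrics can rank models differently, which is consistent with the FT-LLM leading on MARE while the MSE gap with neural networks narrows only as data grows.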
Stats
The all-dielectric metasurface is defined by a 14-dimensional vector: two shared parameters (height and periodicity) plus a semi-major axis, semi-minor axis, and rotation angle for each of the four elliptical resonators (2 + 4 × 3 = 14).
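To make the 14-dimensional representation, and the two prompt styles the study compares, concrete, here is a hypothetical sketch encoding one geometry both ways. All parameter values, units, and wording are illustrative assumptions, not the paper's exact templates.

```python
# Hypothetical parameter values (lengths and angles in assumed units):
# shared height/periodicity plus per-ellipse (semi-major, semi-minor, rotation)
# gives the 14-dimensional vector: 2 + 4 * 3 = 14.
height, periodicity = 500.0, 2500.0
ellipses = [  # (semi-major, semi-minor, rotation) for each of 4 resonators
    (400.0, 200.0, 15.0),
    (350.0, 150.0, 60.0),
    (420.0, 180.0, 0.0),
    (380.0, 220.0, 45.0),
]

# Style 1: concise vector representation
vec = [height, periodicity] + [v for e in ellipses for v in e]
concise_prompt = str(vec)

# Style 2: detailed textual description (a hypothetical template)
verbose_prompt = (
    f"The metasurface has height {height} and periodicity {periodicity}. "
    + " ".join(
        f"Resonator {i + 1} is an ellipse with semi-major axis {a}, "
        f"semi-minor axis {b}, rotated by {r} degrees."
        for i, (a, b, r) in enumerate(ellipses)
    )
)
print(concise_prompt)
print(verbose_prompt)
```

Per the finding above, the fine-tuned LLM's predictive accuracy was not significantly affected by which of these two encodings was used.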
Quotes
"Large language models (LLMs) like generative pre-trained transformers (GPTs) have recently emerged as a foundational model primarily designed to handle natural language processing tasks."
"By harnessing vast amounts of text data, these models learn to predict the next word in a sentence, thus acquiring an ability to construct coherent and contextually relevant text."