The article discusses the potential of deploying Large Language Models (LLMs) at the 6G edge to overcome the limitations of cloud-based deployment. It highlights killer applications such as healthcare and robot control that require LLMs at the mobile edge because of latency, bandwidth, and privacy constraints. The content covers technical challenges in LLM deployment, including communication cost, limited computing capability, storage requirements, and memory pressure. It proposes a 6G Mobile Edge Computing (MEC) architecture tailored for LLMs and explores techniques for efficient edge training and inference. The discussion extends to device-server co-training strategies, such as split learning and multi-hop split learning, that distribute the computing workload across devices and edge servers. It also addresses efficient large-model inference techniques, such as quantized edge inference and parameter sharing, to reduce latency. Open research problems on green edge intelligence and privacy-preserving edge intelligence for LLMs are also highlighted.
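The split-learning idea mentioned above partitions a model at a "cut layer": the device runs the early layers and uploads only the cut-layer activations, so the raw input never leaves the device and the heavy layers run on the edge server. A minimal sketch in plain Python (the two-layer toy model and all weights here are hypothetical, for illustration only):

```python
def device_forward(x, w_device):
    # Device-side layers: compute activations up to the cut layer.
    return [sum(xi * wi for xi, wi in zip(x, row)) for row in w_device]

def server_forward(h, w_server):
    # Server-side layers: finish the forward pass from the cut-layer
    # activations uploaded by the device.
    return sum(hi * wi for hi, wi in zip(h, w_server))

# Hypothetical weights for illustration.
w_device = [[0.5, -0.2], [0.1, 0.3]]   # first layer, kept on the device
w_server = [1.0, -1.0]                 # second layer, on the edge server

x = [2.0, 1.0]                         # private input, stays on device
h = device_forward(x, w_device)        # only h crosses the network
y = server_forward(h, w_server)        # server completes the inference
```

In training, the server would additionally send the gradient at the cut layer back to the device; multi-hop split learning chains this exchange across several intermediate nodes instead of a single device-server pair.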
Key ideas extracted from the source content at arxiv.org, by Zheng Lin, Gu..., 03-04-2024
https://arxiv.org/pdf/2309.16739.pdf