Introducing Smart, a framework that minimizes the inference costs of large language models (LLMs) while providing accuracy guarantees.
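To make the cost/accuracy trade-off concrete, here is a minimal sketch of one way such a framework could work, assuming the core idea is to estimate each candidate model's agreement with a strong reference model on a small sample and then route traffic to the cheapest model that meets a user-specified accuracy target. This is an illustration, not the paper's implementation; the `estimate_agreement` and `choose_model` helpers, the cost representation, and the sampling scheme are all hypothetical.

```python
# Hypothetical sketch: pick the cheapest model whose estimated agreement with
# a reference model meets a user-specified accuracy target.

def estimate_agreement(model, reference, sample_inputs):
    """Fraction of sampled inputs on which `model` matches the reference model."""
    matches = sum(model(x) == reference(x) for x in sample_inputs)
    return matches / len(sample_inputs)

def choose_model(candidates, reference, sample_inputs, accuracy_target):
    """Return the cheapest candidate meeting the target, else the reference.

    `candidates` is a list of (model_fn, cost_per_call) pairs; the reference
    model's outputs are treated as ground truth for the accuracy estimate.
    """
    for model, _cost in sorted(candidates, key=lambda mc: mc[1]):
        if estimate_agreement(model, reference, sample_inputs) >= accuracy_target:
            return model  # candidates are sorted by cost, so the first hit is cheapest
    return reference
```

Under this sketch, every request after the profiling phase would be served by the selected cheaper model, falling back to the reference model only when no candidate clears the accuracy target.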
Alternating between prompt optimization and fine-tuning (BetterTogether approach) significantly improves the performance of modular language model pipelines across various NLP tasks.
When optimizing modular natural language processing (NLP) systems built on large language models (LLMs), the BetterTogether strategy of combining LLM weight fine-tuning with prompt optimization substantially outperforms using either method alone.
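The alternating structure is the key point, so a schematic loop may help. The sketch below is an assumption-laden illustration of the idea rather than the authors' code: the prompt optimizer, the weight fine-tuner, and the pipeline/metric objects are passed in as hypothetical callables standing in for whatever concrete optimizer a modular pipeline framework would provide.

```python
# Hypothetical sketch of an alternating prompt-optimization / fine-tuning loop.
# `optimize_prompts` and `finetune_weights` are caller-supplied callables that
# each take (pipeline, trainset, metric) and return an improved pipeline.

def better_together(pipeline, trainset, metric,
                    optimize_prompts, finetune_weights, rounds=1):
    """Alternate prompt optimization and weight fine-tuning on a modular pipeline."""
    for _ in range(rounds):
        # Step 1: search over instructions / few-shot demos with weights frozen.
        pipeline = optimize_prompts(pipeline, trainset, metric)
        # Step 2: fine-tune the underlying LM weights on data produced by the
        # current, prompt-optimized pipeline.
        pipeline = finetune_weights(pipeline, trainset, metric)
    # A final prompt-optimization pass can adapt the prompts to the tuned weights.
    return optimize_prompts(pipeline, trainset, metric)
```

The intuition is that each step changes what the other step is optimizing against: better prompts yield better training traces for fine-tuning, and tuned weights change which prompts work best.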
Addax is a novel optimization algorithm for fine-tuning large language models (LLMs): it addresses the memory limitations of traditional optimizers such as Adam while converging faster and performing better than memory-efficient alternatives such as MeZO.
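For context, MeZO-style methods estimate gradients from forward passes only, avoiding the memory cost of backpropagation and optimizer state. The sketch below illustrates the general idea of mixing such a zeroth-order estimate with an ordinary first-order gradient; it is not Addax's actual algorithm, and the batch split, perturbation scale `eps`, and mixing weight `alpha` are hypothetical choices for illustration.

```python
# Hypothetical sketch: combine a forward-only (MeZO-style) zeroth-order
# gradient estimate with a standard first-order gradient in one update.
import torch

def zeroth_order_grad(model, loss_fn, batch, eps=1e-3):
    """Two-point estimate: perturb all trainable params by +/- eps * z."""
    params = [p for p in model.parameters() if p.requires_grad]
    z = [torch.randn_like(p) for p in params]
    with torch.no_grad():
        for p, zi in zip(params, z):
            p.add_(eps * zi)
        loss_plus = loss_fn(model, batch)
        for p, zi in zip(params, z):
            p.sub_(2 * eps * zi)
        loss_minus = loss_fn(model, batch)
        for p, zi in zip(params, z):
            p.add_(eps * zi)  # restore original parameters
    scale = (loss_plus - loss_minus) / (2 * eps)
    return [scale * zi for zi in z]

def mixed_step(model, loss_fn, fo_batch, zo_batch, lr=1e-5, alpha=0.5):
    """One update mixing a first-order gradient (fo_batch) with a
    zeroth-order estimate (zo_batch)."""
    loss = loss_fn(model, fo_batch)
    loss.backward()
    params = [p for p in model.parameters() if p.requires_grad]
    zo = zeroth_order_grad(model, loss_fn, zo_batch)
    with torch.no_grad():
        for p, g_zo in zip(params, zo):
            direction = (1 - alpha) * p.grad + alpha * g_zo
            p.sub_(lr * direction)
            p.grad = None
```

The memory appeal is that the zeroth-order branch needs no activation storage or optimizer state, so the expensive backward pass can be reserved for the batches that fit comfortably in memory.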
Jointly fine-tuning a high-level planner with a low-level language model, using a novel soft-selection method over action embeddings, improves language modeling performance, most notably perplexity.
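The soft-selection idea is easiest to see in code. The following is a minimal sketch under my own assumptions, not the paper's architecture: a planner head scores a small set of abstract actions, and instead of a hard, non-differentiable argmax, a softmax-weighted mixture of learned action embeddings is prepended to the language model's input, so gradients flow through both planner and LM during joint fine-tuning. Dimensions and module names are illustrative.

```python
# Hypothetical sketch: differentiable ("soft") selection of an action embedding
# used to condition a language model.
import torch
import torch.nn as nn

class SoftActionConditioner(nn.Module):
    def __init__(self, hidden_dim=768, num_actions=16, temperature=1.0):
        super().__init__()
        self.planner = nn.Linear(hidden_dim, num_actions)        # high-level planner head
        self.action_embeddings = nn.Embedding(num_actions, hidden_dim)
        self.temperature = temperature

    def forward(self, context_state, token_embeddings):
        # context_state: (batch, hidden_dim) summary of the context so far
        # token_embeddings: (batch, seq_len, hidden_dim) LM input embeddings
        logits = self.planner(context_state)                      # (batch, num_actions)
        weights = torch.softmax(logits / self.temperature, dim=-1)
        # Soft selection: a differentiable mixture instead of a hard argmax.
        action_vec = weights @ self.action_embeddings.weight      # (batch, hidden_dim)
        # Prepend the soft action embedding as an extra conditioning token.
        return torch.cat([action_vec.unsqueeze(1), token_embeddings], dim=1)
```

Because the mixture is differentiable, the planner receives gradients from the language modeling loss directly, which is what makes the joint fine-tuning possible.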