JetMoE-8B, a new 8B-parameter Mixture-of-Experts Large Language Model (LLM), was trained for less than $0.1 million yet outperforms Llama2-7B, with its chat variant surpassing the larger Llama2-13B-Chat.
Applying user-defined constraints to the format and semantics of LLM outputs can streamline prompt-based development, integrate LLMs into existing workflows, satisfy product requirements, and enhance user trust and experience.
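A minimal sketch of one common enforcement mechanism, constrained decoding: mask the next-token logits so only tokens satisfying the constraint remain eligible. The vocabulary, the `fake_logits` model stub, and the choice-set constraint below are all illustrative stand-ins, not any particular library's API.

```python
import math
import random

# Toy vocabulary and "model": in a real system the logits would come from
# an LLM's decoding step; here they are deterministic pseudo-random scores.
VOCAB = ["yes", "no", "maybe", "banana", "{", "}"]

def fake_logits(prefix):
    # Hypothetical stand-in for a model forward pass over the prefix.
    rng = random.Random(len(prefix))
    return [rng.uniform(-1.0, 1.0) for _ in VOCAB]

def constrained_next_token(prefix, allowed):
    """Mask logits so only tokens in `allowed` can be chosen."""
    logits = fake_logits(prefix)
    masked = [score if tok in allowed else -math.inf
              for score, tok in zip(logits, VOCAB)]
    return VOCAB[max(range(len(VOCAB)), key=masked.__getitem__)]

# Constrain the model's answer to a fixed choice set.
print(constrained_next_token([], allowed={"yes", "no", "maybe"}))
```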
Leveraging the self-consistency of multiple language model samples to assess the reliability and factuality of generated text.
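A sketch of the core recipe under simple assumptions: draw several samples for the same prompt, then score a claim by how consistently the other samples support it. Here `support` is a crude word-overlap proxy; real systems typically use an NLI model or an LLM judge for that step.

```python
def support(claim: str, sample: str) -> float:
    """Crude proxy for entailment: fraction of claim words found in the
    sample. A production system would use an NLI model or an LLM judge."""
    claim_words = set(claim.lower().split())
    sample_words = set(sample.lower().split())
    return len(claim_words & sample_words) / max(len(claim_words), 1)

def consistency_score(claim: str, samples: list[str]) -> float:
    """Average support for the claim across independently drawn samples;
    low scores flag content the model does not reproduce consistently."""
    return sum(support(claim, s) for s in samples) / len(samples)

# Hypothetical samples drawn from the same prompt at temperature > 0.
samples = [
    "The Eiffel Tower was completed in 1889 in Paris.",
    "Construction of the Eiffel Tower finished in 1889.",
    "The Eiffel Tower opened in Paris in 1889.",
]
print(consistency_score("The Eiffel Tower was completed in 1889", samples))
print(consistency_score("The Eiffel Tower was completed in 1912", samples))
```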
Introducing a novel method called "incremental utility" to estimate how much additional knowledge a demonstration brings to a large language model for few-shot in-context learning tasks, and showing its effectiveness compared to previous utility estimation approaches.
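A hedged sketch of the general shape of such an estimator, not the paper's exact formulation: measure a demonstration's utility as the change in the answer's score once the demonstration is prepended to the query. The `answer_logprob` scorer below is a toy stand-in for summing an LLM's token log-probabilities.

```python
def answer_logprob(prompt: str, answer: str) -> float:
    # Toy stand-in for an LLM scoring call: this "model" just rewards
    # answers whose words appear in the prompt, so a helpful demonstration
    # measurably raises the score. A real estimator would sum the LLM's
    # token log-probabilities for `answer` conditioned on `prompt`.
    words = set(prompt.lower().split())
    hits = sum(1 for w in answer.lower().split() if w in words)
    return hits - len(answer.split())  # pseudo log-probability

def incremental_utility(demo: str, query: str, answer: str) -> float:
    """Utility of a demo = answer score with the demo minus score without."""
    return (answer_logprob(demo + "\n" + query, answer)
            - answer_logprob(query, answer))

demo = "Q: capital of France? A: Paris"
print(incremental_utility(demo, "Q: capital of France? A:", "Paris"))
```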
Mixture-of-Experts (MoE) language models can scale model size without a proportional increase in training cost, but they pose challenges for inference efficiency. This work studies optimal training budget allocation for MoE models by treating both model performance and inference cost as key metrics.
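To make the trade-off concrete, a sketch with standard FLOP approximations and invented workload numbers, not the paper's fitted scaling law: fold lifetime inference compute into the budget alongside training compute, which is where MoE's small active-parameter count pays off.

```python
# Illustrative cost accounting with made-up workload constants. Common
# approximations: training costs ~6*N*D FLOPs and inference ~2*N FLOPs per
# token, where N is the *active* parameter count; MoE keeps N small while
# total parameters grow.

def total_flops(n_active: float, train_tokens: float, infer_tokens: float) -> float:
    return 6 * n_active * train_tokens + 2 * n_active * infer_tokens

# Compare a dense 7e9-parameter model with an MoE activating 2e9 of 8e9
# parameters, both trained on 1e12 tokens and serving 1e13 tokens.
dense = total_flops(7e9, 1e12, 1e13)
moe = total_flops(2e9, 1e12, 1e13)
print(f"dense: {dense:.2e} FLOPs, MoE: {moe:.2e} FLOPs "
      f"({moe / dense:.0%} of dense)")
```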
GRIFFIN, a novel training-free Mixture-of-Experts (MoE) method that selects unique feedforward experts at the sequence level, enabling efficient generation across a variety of large language models with non-ReLU activation functions while preserving the original model's performance.
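A rough sketch of the sequence-level selection idea under simplified assumptions (the toy layer sizes, GELU activation, and 25% keep ratio are illustrative, not the paper's exact statistics): pool each feedforward neuron's activation magnitude over the prompt, keep the top-scoring slice as that sequence's experts, and generate using only that slice.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, d_ff, prompt_len = 64, 256, 10

W_in = rng.normal(size=(d_model, d_ff))    # toy FF up-projection
W_out = rng.normal(size=(d_ff, d_model))   # toy FF down-projection
prompt_hidden = rng.normal(size=(prompt_len, d_model))

def gelu(x):
    # tanh approximation of GELU, standing in for a non-ReLU activation
    return 0.5 * x * (1 + np.tanh(np.sqrt(2 / np.pi) * (x + 0.044715 * x**3)))

# 1) Prompt phase: pool each FF neuron's activation magnitude over the sequence.
acts = gelu(prompt_hidden @ W_in)               # (prompt_len, d_ff)
neuron_scores = np.linalg.norm(acts, axis=0)    # one score per FF neuron

# 2) Keep the top 25% of neurons as this sequence's "experts".
k = d_ff // 4
experts = np.argsort(neuron_scores)[-k:]

# 3) Generation phase: run the FF layer with only the selected slice.
def ff_pruned(x):
    return gelu(x @ W_in[:, experts]) @ W_out[experts, :]

new_token_hidden = rng.normal(size=(1, d_model))
print(ff_pruned(new_token_hidden).shape)        # (1, d_model)
```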
LLM ATTRIBUTOR is a Python library that provides interactive visualizations to help LLM developers understand and improve how their models' text generation is attributed to training data.
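LLM ATTRIBUTOR's own API is not reproduced here; as background, a generic sketch of one attribution signal such tools commonly build on: gradient similarity between a training example and the generated output, shown on a toy linear model.

```python
import numpy as np

# Generic training-data attribution signal (not LLM ATTRIBUTOR's API):
# rank training examples by how similar their loss gradients are to the
# gradient induced by the generated output.
rng = np.random.default_rng(1)
w = rng.normal(size=8)                      # toy model weights

def loss_grad(x, y):
    """Gradient of squared loss 0.5*(w.x - y)^2 for one example."""
    return (w @ x - y) * x

X_train = rng.normal(size=(5, 8))
y_train = rng.normal(size=5)
x_gen, y_gen = rng.normal(size=8), 0.3      # stands in for a generated output

g_gen = loss_grad(x_gen, y_gen)
scores = [
    float(np.dot(loss_grad(x, y), g_gen)
          / (np.linalg.norm(loss_grad(x, y)) * np.linalg.norm(g_gen)))
    for x, y in zip(X_train, y_train)
]
print("most influential training example:", int(np.argmax(scores)))
```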
Incorporating a citation mechanism in large language models can enhance content transparency, verifiability, and accountability, addressing intellectual property and ethical concerns.