
Enhancing Efficiency and Effectiveness of LLM-Based Click-Through Rate Prediction with Long Textual User Behaviors

Core Concepts
BAHE, a novel hierarchical encoding approach, decouples the representation extraction of atomic behaviors from the learning of behavior interactions, significantly improving the efficiency and effectiveness of LLM-based CTR prediction with long user sequences.
The paper proposes Behavior Aggregated Hierarchical Encoding (BAHE) to address the efficiency bottleneck that large language models (LLMs) face when processing long textual user behavior sequences for click-through rate (CTR) prediction.

The bottleneck has two sources: identical user behaviors are redundantly re-encoded across different sequences, and behavior representation extraction is tightly coupled with behavior interaction modeling. BAHE introduces a hierarchical architecture that decouples these two components.

First, BAHE uses the pre-trained lower layers of the LLM to extract embeddings of atomic user behaviors and stores them in an offline database. This converts the encoding from token level to behavior level, substantially reducing sequence length. Next, BAHE uses the deeper, trainable layers of the LLM to model the interactions between the retrieved atomic behavior embeddings, producing comprehensive user representations. Because high-level user representation learning no longer depends on low-level behavior encoding, computational complexity drops significantly.

Extensive experiments show that BAHE reduces training time and memory usage fivefold compared with traditional LLM-based CTR models, while also improving overall CTR performance. BAHE has been deployed in a real-world industrial CTR prediction system, where it enables daily model updates on 50 million records using 8 A100 GPUs.
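
The two-stage flow described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: `encode_behavior` is a hypothetical stand-in for the frozen lower LLM layers (here a toy hash-based encoder), `build_behavior_table` plays the role of the offline database, and `user_representation` stands in for the trainable upper layers (here a plain mean). The example behaviors are invented.

```python
import hashlib

EMBED_DIM = 4  # toy dimension, chosen for illustration only


def encode_behavior(text):
    """Hypothetical stand-in for the frozen lower LLM layers: text -> vector."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return [b / 255.0 for b in digest[:EMBED_DIM]]


def build_behavior_table(sequences):
    """Offline step: encode each distinct atomic behavior exactly once."""
    table = {}
    for seq in sequences:
        for behavior in seq:
            if behavior not in table:
                table[behavior] = encode_behavior(behavior)
    return table


def user_representation(seq, table):
    """Stand-in for the trainable upper layers: mean of looked-up embeddings."""
    vectors = [table[b] for b in seq]
    return [sum(col) / len(vectors) for col in zip(*vectors)]


sequences = [
    ["paid electricity bill", "searched running shoes", "opened taxi mini-program"],
    ["searched running shoes", "paid electricity bill"],
]
table = build_behavior_table(sequences)
total_occurrences = sum(len(s) for s in sequences)
user_vectors = [user_representation(s, table) for s in sequences]
print(len(table), "encodings for", total_occurrences, "behavior occurrences")
```

The point of the sketch is the split: the expensive encoder runs once per distinct behavior, while the per-user step only performs lookups over a sequence whose length equals the number of behaviors rather than the number of tokens.
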
The dataset contains about 50 million CTR records collected over one week, with 6 text features, such as user bills, searches, and mini-program visits, along with item titles. Each user sequence has 50 user behaviors averaging 5 tokens each, and the dataset contains 10 million atomic behaviors in total.
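
A back-of-the-envelope calculation shows why behavior-level encoding matters at this scale. The input figures come from the description above. The rest is assumption, not something the summary confirms: every sequence is taken to have exactly 50 behaviors of exactly 5 tokens, each record is taken to carry one such sequence, and self-attention cost is taken to grow with the square of the sequence length.

```python
behaviors_per_sequence = 50
tokens_per_behavior = 5

# Sequence length seen by the model at each granularity.
token_level_length = behaviors_per_sequence * tokens_per_behavior
behavior_level_length = behaviors_per_sequence

# Linear reduction in sequence length.
length_reduction = token_level_length / behavior_level_length

# Assumed quadratic attention cost: ratio of squared lengths.
attention_cost_ratio = token_level_length ** 2 / behavior_level_length ** 2

# How often an atomic behavior would recur if each record had one full sequence.
records = 50_000_000
atomic_behaviors = 10_000_000
behavior_occurrences = records * behaviors_per_sequence
reuse_factor = behavior_occurrences / atomic_behaviors

print(token_level_length, behavior_level_length)
print(length_reduction, attention_cost_ratio, reuse_factor)
```

Under these assumptions the sequence shrinks from 250 tokens to 50 behavior embeddings, and each atomic behavior recurs many times across records, which is the redundancy the offline table removes. The ratios here are illustrative and are not the measured speedups reported for BAHE.
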

Key Insights Distilled From

by Binzong Geng... at 03-29-2024
Breaking the Length Barrier

Deeper Inquiries

How can the BAHE approach be extended to handle dynamic changes in user behaviors, where new atomic behaviors are constantly introduced?

To handle new atomic behaviors that are constantly introduced, BAHE can be extended with incremental updates to the behavior embedding table. Because the lower layers of the LLM are pre-trained and used only to extract embeddings, a newly observed behavior can be encoded once and appended to the table without re-encoding existing entries. If the lower layers are periodically retrained on new data, the stored embeddings become stale and the table must be recomputed, so retraining frequency trades freshness against cost. A caching strategy can additionally manage storage and retrieval of atomic behaviors so that the system remains responsive as the number of behaviors grows.
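
One way to picture such incremental updates is a small store that encodes only unseen behaviors and rebuilds itself when the encoder changes. This is a hypothetical sketch: the class `BehaviorEmbeddingStore`, its methods, and the toy encoders are invented for illustration and are not part of BAHE as described in the summary.

```python
class BehaviorEmbeddingStore:
    """Hypothetical embedding table keyed by atomic behavior text."""

    def __init__(self, encoder, encoder_version):
        self.encoder = encoder
        self.encoder_version = encoder_version
        self.table = {}
        self.encoder_calls = 0  # counts how often the costly encoder runs

    def add_new(self, behaviors):
        """Encode only behaviors missing from the table; return the count added."""
        added = 0
        for behavior in behaviors:
            if behavior not in self.table:
                self.table[behavior] = self.encoder(behavior)
                self.encoder_calls += 1
                added += 1
        return added

    def replace_encoder(self, encoder, encoder_version):
        """A changed encoder makes cached vectors stale, so re-encode them all."""
        self.encoder = encoder
        self.encoder_version = encoder_version
        stale_keys = list(self.table)
        self.table = {}
        self.add_new(stale_keys)


def toy_encoder_v1(text):
    # Toy stand-in for frozen lower layers: two hand-made features.
    return [float(len(text)), float(text.count(" "))]


def toy_encoder_v2(text):
    # Toy stand-in for a retrained encoder with a different output.
    return [len(text) / 10.0]


store = BehaviorEmbeddingStore(toy_encoder_v1, "v1")
added_day1 = store.add_new(["paid phone bill", "searched headphones"])
added_day2 = store.add_new(["searched headphones", "opened food delivery mini-program"])
calls_before_swap = store.encoder_calls
store.replace_encoder(toy_encoder_v2, "v2")
calls_after_swap = store.encoder_calls
print(added_day1, added_day2, calls_before_swap, calls_after_swap)
```

The daily update touches only the one behavior that was not already stored, whereas swapping the encoder forces every entry to be recomputed. That asymmetry is why keeping the extraction layers fixed makes incremental updates cheap.
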

What are the potential limitations of the hierarchical encoding strategy, and how could it be further improved to handle more complex user-item interactions?

The hierarchical encoding strategy in BAHE may struggle to capture complex user-item interactions that require more nuanced representations. One improvement is to add attention mechanisms that let the model focus on the relevant parts of a user behavior sequence. Applied at different levels of the hierarchy, attention lets BAHE weigh the importance of individual behaviors and their interactions, leading to more accurate, context-aware representations. Graph neural networks or relational modeling techniques could further help BAHE capture relationships among users, behaviors, and items in recommendation tasks.
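
As one possible reading of "weighing the importance of different behaviors", the sketch below pools behavior embeddings with scaled dot-product attention, using the candidate item embedding as the query. Choosing the item as the query (target-aware attention) is an assumption made for illustration, and the function names and toy vectors are hypothetical. The summary does not say which attention variant would be used.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(behavior_vectors, item_vector):
    """Weight each behavior by its scaled dot product with the item vector."""
    dim = len(item_vector)
    scores = [
        sum(b * q for b, q in zip(vec, item_vector)) / math.sqrt(dim)
        for vec in behavior_vectors
    ]
    weights = softmax(scores)
    pooled = [
        sum(w * vec[i] for w, vec in zip(weights, behavior_vectors))
        for i in range(dim)
    ]
    return weights, pooled


# Toy 2-dimensional embeddings; the first behavior aligns best with the item.
behavior_vectors = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
item_vector = [1.0, 0.0]
weights, pooled = attention_pool(behavior_vectors, item_vector)
print([round(w, 3) for w in weights])
```

Behaviors whose embeddings align with the candidate item receive larger weights, so the pooled user vector depends on the item being scored instead of being a fixed average.
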

Given the success of BAHE in CTR prediction, how could the principles of decoupling representation extraction and interaction modeling be applied to other recommendation tasks beyond CTR, such as sequential recommendation or knowledge-aware recommendation?

The principle of decoupling representation extraction from interaction modeling can be applied to other recommendation tasks, such as sequential or knowledge-aware recommendation, by adapting the hierarchical architecture to each task's requirements. For sequential recommendation, the hierarchical encoding can be extended to capture temporal dependencies between user interactions, so that predictions reflect the order of events. For knowledge-aware recommendation, the hierarchy can be modified to incorporate external knowledge graphs or ontologies, letting the model use domain-specific information. With the architecture and training process customized per task, the approach extends to a range of scenarios beyond CTR prediction.