Beyond Collaborative Filtering: A Dual-Signal Approach to Recommendation

Author: Denis Avetisyan

A new framework, FlexCode, enhances recommendation accuracy by intelligently combining user behavior and item semantics.

FlexCode constructs generative recommendations through a dual codebook encoding (collaborative and semantic) aligned via contrastive learning, where a popularity-aware Mixture-of-Experts router dynamically allocates resources to these codebooks before an autoregressive Transformer generates item sequences.

FlexCode leverages dual codebooks and adaptive capacity allocation to improve both popular and long-tail item recommendations.

Existing generative recommendation systems, while powerful, often treat all items uniformly, overlooking the differing needs of popular and long-tail content. This limitation motivates the work ‘Semantics Meet Signals: Dual Codebook Representation Learning for Generative Recommendation’, which introduces FlexCode, a framework leveraging separate codebooks for collaborative and semantic signals with dynamically allocated capacity based on item popularity. Experiments demonstrate FlexCode consistently outperforms strong baselines, improving both overall accuracy and performance on less frequent items. Could this popularity-aware approach offer a more nuanced pathway toward balancing memorization and generalization in token-based recommendation models?

Beyond the Echo Chamber: Addressing Recommendation System Limitations

Collaborative filtering systems, while remarkably effective at recommending widely popular items, often falter when faced with the ‘long tail’ – that extensive collection of products or content with comparatively few interactions. These systems primarily identify patterns based on what many users already like, boosting the visibility of already-popular choices. However, the vast majority of items in many catalogs exist outside this mainstream, receiving minimal attention. This inherent bias means potentially valuable, niche items remain hidden, creating a skewed recommendation landscape and limiting user exposure to diverse options. Consequently, users are frequently presented with a narrow range of familiar choices, while the long tail of less-known, potentially highly relevant items remains largely unexplored, hindering discovery and overall satisfaction.

The prevalence of filter bubbles and limited discovery represents a significant drawback of many recommendation systems. As algorithms prioritize content aligning with established user preferences, individuals are increasingly presented with a narrow range of information, reinforcing existing biases and hindering exposure to novel or diverse perspectives. This phenomenon doesn’t just impact user experience – limiting serendipitous encounters with potentially valuable items – but also poses challenges to item diversity within the system itself. Less popular, but potentially highly relevant, content struggles for visibility, creating a self-perpetuating cycle where only already-popular items gain further prominence, ultimately diminishing the overall richness and variety available to users and stifling the potential for genuine discovery.

Modern recommendation systems are increasingly recognizing the limitations of solely relying on patterns of user-item interaction. While identifying correlations between users and the items they’ve engaged with – the core of collaborative filtering – remains valuable, a truly effective system necessitates a deeper comprehension of the items themselves. This involves incorporating rich metadata – encompassing attributes like descriptive keywords, visual features, or even semantic embeddings derived from textual content – to build a nuanced representation of each item. By moving beyond simple “who bought this also bought that” logic, algorithms can begin to surface items that align with a user’s underlying preferences, even if those items haven’t been widely adopted, thereby fostering serendipitous discovery and mitigating the formation of restrictive filter bubbles. This richer understanding allows for recommendations based on what an item is, not just who has shown interest, potentially unlocking the vast potential of the long-tail distribution and significantly enhancing user satisfaction.

FlexCode: A Dual Representation for Nuanced Recommendations

FlexCode employs a dual codebook architecture to disentangle user-item interaction patterns from intrinsic item characteristics. This is achieved by maintaining two separate codebooks: a collaborative codebook and a semantic codebook. The collaborative codebook represents high-order interaction features derived from user behavior, effectively capturing patterns of co-occurrence and preference. Concurrently, the semantic codebook encodes modality-specific attributes of items, providing a representation grounded in item content rather than solely on interaction data. This separation allows for specialized representation of both collaborative and semantic information, enabling the model to leverage distinct aspects of user-item relationships.

Two terms in this design deserve unpacking. “High-order” means the collaborative codebook learns relationships beyond simple user-item pairings by considering combinations of interactions, which lets the model capture complex preferences. “Modality-grounded” means the semantic attributes are derived from the specific type of item data, for example genre for movies or author for books. Together, the two codebooks represent both how users interact with items and what the items are, providing a richer representation than traditional methods.
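
As a minimal sketch of how an item might end up with one code from each codebook, assume plain nearest-neighbor vector quantization. The article does not specify the quantizer, so the functions `nearest_code` and `encode_item` and the toy vectors below are hypothetical illustrations, not FlexCode's actual implementation.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_code(vector: Vector, codebook: List[Vector]) -> int:
    """Index of the codebook entry closest to `vector`."""
    return min(range(len(codebook)),
               key=lambda i: squared_distance(vector, codebook[i]))


def encode_item(collab_vec: Vector, semantic_vec: Vector,
                collab_codebook: List[Vector],
                semantic_codebook: List[Vector]) -> Tuple[int, int]:
    """Quantize the two views of one item independently (assumed scheme)."""
    return (nearest_code(collab_vec, collab_codebook),
            nearest_code(semantic_vec, semantic_codebook))


# Toy codebooks; real codebooks would be learned, not hand-written.
collab_codebook = [[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]]
semantic_codebook = [[0.0, 1.0], [1.0, 0.0]]

# One item described by a behavior-derived vector and a content-derived vector.
item_codes = encode_item([0.9, 1.2], [0.8, 0.1],
                         collab_codebook, semantic_codebook)
print(item_codes)
```

Keeping the two lookups independent is what lets the behavior-derived and content-derived views of an item be stored, and later weighted, separately.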

By differentiating between item co-occurrence and underlying attributes, the FlexCode model moves beyond simplistic recommendation strategies. Traditional collaborative filtering methods often suggest items solely based on their frequent appearance together in user interactions; FlexCode, however, integrates semantic information about the items themselves. This allows the model to identify relevance based on shared characteristics, even if those items haven’t been frequently co-accessed. Consequently, recommendations are not limited to popular pairings, resulting in a more diverse set of suggestions and an increased potential for surfacing items users might not otherwise discover, while maintaining a higher degree of relevance to user preferences.

Dynamic Capacity Allocation: Prioritizing Resources for Optimal Performance

FlexCode utilizes a Mixture-of-Experts (MoE) architecture to dynamically allocate capacity between multiple codebooks, each representing either collaborative or semantic information. This allocation isn’t static; instead, it’s driven by ‘popularity-aware allocation’, which assesses item interaction frequency. The MoE mechanism routes each item toward the more appropriate codebook based on this popularity signal. Consequently, frequently interacted-with items are primarily served by codebooks emphasizing collaborative signals, capitalizing on abundant interaction data. Conversely, less frequent, or ‘sparse’, items are directed towards codebooks focused on semantic representation, allowing for generalization from limited interaction data. This dynamic adjustment optimizes resource allocation across the entire catalog, avoiding uniform capacity distribution and improving overall performance.

FlexCode dynamically allocates capacity to codebooks based on item popularity and data availability. Items with high interaction frequency are assigned greater collaborative capacity, allowing the model to effectively utilize the abundant data to refine their representations. Conversely, items with limited interaction data – those in the long tail – are allocated increased semantic representation capacity. This prioritizes leveraging pre-trained knowledge and inherent item characteristics to create meaningful embeddings despite sparse interaction signals, ensuring a balanced and efficient use of model resources across the entire catalog.
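
The allocation described above can be illustrated with a deliberately simplified two-way gate. This is not a full Mixture-of-Experts router and is not taken from the paper. It assumes a logistic function of log-popularity decides what share of a fixed token budget goes to the collaborative codebook, and the names `collaborative_share` and `allocate_tokens` and the parameters `midpoint`, `sharpness`, and `total_tokens` are hypothetical.

```python
import math
from typing import Tuple


def collaborative_share(interaction_count: int,
                        midpoint: float = 100.0,
                        sharpness: float = 1.0) -> float:
    """Fraction of capacity given to the collaborative codebook.

    Assumed gate: a logistic function of log-popularity that equals 0.5
    when the item has `midpoint` interactions.
    """
    log_ratio = math.log1p(interaction_count) - math.log1p(midpoint)
    return 1.0 / (1.0 + math.exp(-sharpness * log_ratio))


def allocate_tokens(interaction_count: int,
                    total_tokens: int = 8) -> Tuple[int, int]:
    """Split a fixed token budget into (collaborative, semantic) tokens."""
    share = collaborative_share(interaction_count)
    collab_tokens = round(share * total_tokens)
    return collab_tokens, total_tokens - collab_tokens


# A tail item, a mid-popularity item, and a head item.
for count in (1, 100, 10_000):
    print(count, allocate_tokens(count))
```

Under these assumptions the budget shifts smoothly from semantic to collaborative tokens as interaction counts grow. A learned router would replace the fixed logistic curve with trainable parameters.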

Dynamic capacity allocation, as implemented in FlexCode, optimizes resource utilization across the entire item catalog by prioritizing computational resources towards items with higher interaction frequency while simultaneously supporting those with limited data. This approach improves overall performance by ensuring that frequently accessed items benefit from robust collaborative filtering, while less popular, or “long-tail,” items receive increased attention from the semantic representation component. This balanced allocation prevents resource starvation for sparse items, enabling improved recommendations and representation learning for the complete item set, rather than solely optimizing for popular items.

Cross-codebook alignment is achieved through the InfoNCE objective, a contrastive loss in the noise-contrastive estimation family. Minimizing it maximizes a lower bound on the mutual information between the collaborative and semantic representations learned by the separate codebooks. Specifically, the InfoNCE loss encourages the model to identify positive pairs – corresponding representations of the same item across both codebooks – while discriminating against negative pairs formed by representations of different items. The objective function computes a similarity score between representations, typically using a dot product, and then applies a softmax function to normalize these scores, effectively learning a probability distribution over the positive and negative pairs. This alignment facilitates knowledge transfer between the two representations, improving the overall quality and coherence of the learned embeddings and enhancing performance, particularly for items with limited collaborative data.
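
The following sketch spells out that computation for a small batch using dot-product similarity and a softmax over in-batch negatives. It is a one-directional variant (collaborative anchor against semantic candidates). The `temperature` value and the function name `info_nce_loss` are assumptions, and the paper may use a symmetric or otherwise different formulation.

```python
import math
from typing import List, Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot-product similarity between two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def info_nce_loss(collab: List[Sequence[float]],
                  semantic: List[Sequence[float]],
                  temperature: float = 0.1) -> float:
    """Mean InfoNCE loss; row i of both lists describes the same item."""
    losses = []
    for i, anchor in enumerate(collab):
        # Similarity of this anchor to every semantic vector in the batch.
        logits = [dot(anchor, other) / temperature for other in semantic]
        # Numerically stable log-sum-exp (the softmax denominator).
        max_logit = max(logits)
        log_sum_exp = max_logit + math.log(
            sum(math.exp(value - max_logit) for value in logits))
        # Negative log-probability assigned to the positive pair (index i).
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)


collab_vectors = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[1.0, 0.0], [0.0, 1.0]]      # each item matches its own pair
misaligned = [[0.0, 1.0], [1.0, 0.0]]   # pairs swapped

print(info_nce_loss(collab_vectors, aligned))
print(info_nce_loss(collab_vectors, misaligned))
```

The loss is near zero when each item's two representations agree and grows when they point at different items, which is the pressure that pulls the two codebooks into alignment.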

From Prediction to Anticipation: Modeling User Behavior as a Sequence

The conventional approach to recommendation systems often focuses on predicting a user’s overall preference, but FlexCode introduces a paradigm shift by framing the task as one of sequential prediction. Instead of simply determining what a user likes, the system aims to forecast what a user will interact with next, treating user behavior as a series of actions rather than isolated events. This is achieved by modeling user histories as sequences and employing techniques from natural language processing – specifically, conditional sequence generation – to predict subsequent items. By learning the patterns and dependencies within these sequences, FlexCode can anticipate a user’s future needs with greater accuracy, moving beyond simple preference matching to a more dynamic and context-aware recommendation strategy. This approach allows the system to capture the temporal evolution of user interests and deliver recommendations that align with their current trajectory.

The core of this approach lies in treating a user’s interaction history as a sequence, much like predicting the next word in a sentence. By employing an auto-regressive model, the system learns to forecast future interactions based on the patterns observed in past behavior. This technique, rooted in sequence generation, effectively captures the temporal dependencies inherent in user choices – the understanding that a purchase made today is often influenced by items viewed yesterday, or even weeks prior. Instead of solely focusing on immediate preferences, the model considers the entire chronology of interactions, allowing for more nuanced and accurate predictions of what a user might engage with next. This capability is particularly valuable in dynamic environments where user interests evolve over time, and offers a significant improvement over methods that treat each interaction in isolation.
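
To make the "next word in a sentence" analogy concrete, here is a small sketch of how a single interaction history can be unrolled into next-item training examples. The helper `next_item_examples`, the `max_context` window, and the item names are hypothetical, and the actual model predicts item code tokens with a Transformer rather than raw item names.

```python
from typing import List, Tuple


def next_item_examples(history: List[str],
                       max_context: int = 50) -> List[Tuple[List[str], str]]:
    """Turn one chronological history into (context, next item) pairs."""
    examples = []
    for t in range(1, len(history)):
        # Context is everything before position t, truncated to a window.
        context = history[max(0, t - max_context):t]
        examples.append((context, history[t]))
    return examples


history = ["tent", "sleeping_bag", "stove", "headlamp"]
examples = next_item_examples(history, max_context=2)
for context, target in examples:
    print(context, "->", target)
```

Each pair asks the model to predict the next interaction from what came before it, so order matters in a way it does not for methods that treat interactions as an unordered set.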

Rigorous evaluation of the proposed model utilizes standard recommendation metrics, namely Recall@K and Normalized Discounted Cumulative Gain@K (NDCG@K), to quantify its predictive accuracy. These metrics assess the model’s ability to not only retrieve relevant items but also to rank them in order of relevance, mirroring the nuanced demands of real-world recommendation systems. Comparative analysis consistently reveals substantial performance gains over established baseline models, such as SASRec, demonstrating the efficacy of the approach. Specifically, improvements are observed across various datasets and at different values of K, indicating a robust and generalizable enhancement in recommendation quality – a critical step toward providing users with more satisfying and personalized experiences.
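
For readers unfamiliar with these metrics, the sketch below computes Recall@K and NDCG@K for one user, assuming binary relevance (an item is either relevant or not) and a ranked list without duplicates. The function names and toy data are illustrative.

```python
import math
from typing import List, Set


def recall_at_k(ranked: List[str], relevant: Set[str], k: int) -> float:
    """Share of the relevant items that appear in the top k."""
    if not relevant:
        return 0.0
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / len(relevant)


def ndcg_at_k(ranked: List[str], relevant: Set[str], k: int) -> float:
    """Binary-relevance NDCG: hits lower in the list earn less credit."""
    dcg = sum(1.0 / math.log2(position + 2)
              for position, item in enumerate(ranked[:k])
              if item in relevant)
    # Ideal ordering places every relevant item at the top of the list.
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(position + 2) for position in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0


ranked_items = ["a", "b", "c", "d"]
relevant_items = {"b"}

print(recall_at_k(ranked_items, relevant_items, k=3))
print(ndcg_at_k(ranked_items, relevant_items, k=3))
```

Here the relevant item is retrieved, so recall is perfect, but it sits in second place, so NDCG is below 1. That gap is exactly the ranking-quality signal NDCG adds on top of recall.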

Evaluations on a large-scale industrial dataset reveal that FlexCode significantly outperforms existing recommendation systems. The model achieves a 13.2% improvement in Normalized Discounted Cumulative Gain at rank 10 (NDCG@10), a metric assessing the ranking quality of recommendations, and a 16.5% improvement in Hit Rate at rank 10 (HR@10), which measures the proportion of users for whom the correct item appears within the top 10 recommendations. These gains demonstrate FlexCode’s capacity to provide more relevant and accurate suggestions, effectively addressing the challenges of predicting user preferences within a real-world application and establishing a new benchmark for sequence-based recommendation performance.
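
The two quantities in this paragraph, HR@K and a relative improvement, can be sketched as follows. The example assumes a single held-out target item per user, and every number in it is made up for illustration rather than taken from the reported results.

```python
from typing import List


def hit_rate_at_k(ranked_lists: List[List[str]],
                  targets: List[str], k: int) -> float:
    """Fraction of users whose held-out target appears in their top k."""
    hits = sum(1 for ranked, target in zip(ranked_lists, targets)
               if target in ranked[:k])
    return hits / len(targets)


def relative_improvement(new_score: float, baseline_score: float) -> float:
    """Percentage gain of `new_score` over `baseline_score`."""
    return (new_score - baseline_score) / baseline_score * 100.0


# Hypothetical top-2 recommendations for four users and their true next items.
ranked_lists = [["a", "b"], ["c", "d"], ["e", "f"], ["g", "h"]]
targets = ["b", "x", "e", "h"]

print(hit_rate_at_k(ranked_lists, targets, k=2))
# Hypothetical scores, only to show how a relative gain is computed.
print(relative_improvement(0.55, 0.50))
```

Note that a relative improvement is measured against the baseline's score, so a gain of a few percentage points in absolute terms can correspond to a much larger relative figure when the baseline is low.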

Evaluations across diverse datasets reveal FlexCode’s robust performance and generalizability. On the KuaiRand dataset, a benchmark for evaluating recommendation algorithms, FlexCode achieved an 8.0% improvement in Normalized Discounted Cumulative Gain at 10 (NDCG@10), indicating a substantial enhancement in ranking quality. Further demonstrating its adaptability, the model also exhibited a 5.3% relative improvement in Recall@10 on the Amazon-Sports dataset, showcasing its ability to effectively identify relevant items within a large and varied catalog – a critical metric for user satisfaction and engagement.

A significant strength of FlexCode lies in its ability to enhance recommendations for infrequently interacted-with items, often referred to as the “long-tail.” On a large-scale industrial dataset, FlexCode achieved an 11.3% improvement in Normalized Discounted Cumulative Gain at 10 (NDCG@10) specifically for these tail items. This indicates a substantial advancement in delivering relevant and diverse recommendations beyond popular choices, addressing a common challenge in recommendation systems where the majority of interactions concentrate on a small number of items. By more effectively predicting user interest in less-frequent items, FlexCode offers a more comprehensive and personalized user experience, moving beyond simply suggesting what is already popular and instead surfacing potentially valuable, yet previously overlooked, content.
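
Reporting a metric "for tail items" requires first deciding which items count as tail. The article does not state the rule used, so the sketch below assumes one common convention: rank items by interaction count and treat the top fraction as head and the rest as tail. The function `split_head_tail` and the `head_fraction` parameter are hypothetical.

```python
from collections import Counter
from typing import List, Set, Tuple


def split_head_tail(interactions: List[str],
                    head_fraction: float = 0.2) -> Tuple[Set[str], Set[str]]:
    """Split items into head and tail sets by interaction count."""
    counts = Counter(interactions)
    # Items ordered from most to least interacted-with.
    ranked = [item for item, _ in counts.most_common()]
    head_size = max(1, round(len(ranked) * head_fraction))
    return set(ranked[:head_size]), set(ranked[head_size:])


# Toy interaction log: one popular item and several rarely seen ones.
interaction_log = ["a"] * 5 + ["b"] * 3 + ["c", "d", "e"]
head_items, tail_items = split_head_tail(interaction_log)
print(sorted(head_items), sorted(tail_items))
```

A tail-specific score is then obtained by computing the usual metric only over test cases whose target item falls in the tail set.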

The pursuit of effective recommendation systems, as demonstrated by FlexCode, necessitates a rigorous examination of underlying principles. The framework’s dual codebook approach, which separates collaborative and semantic information, echoes a holistic design philosophy. In a line attributed to Claude Shannon, “The most important thing in communication is to convey the meaning, not just the signal.” FlexCode embodies this sentiment by prioritizing not simply what items are popular, but why, leveraging semantic embeddings to understand item characteristics. This careful partitioning and adaptive capacity allocation, particularly for the long-tail, reflects a system designed for clarity and optimized for meaningful information retrieval, rather than simply maximizing superficial metrics. The result is a structure where behavior (here, recommendation accuracy) naturally follows from thoughtful design.

Beyond the Code: Charting Future Directions

The introduction of separate codebooks, as demonstrated by FlexCode, reveals a fundamental truth about recommendation systems: information is not monolithic. Treating collaborative and semantic signals as distinct entities, while dynamically adjusting their relative influence, suggests a move away from generalized architectures. However, the inherent limitations of codebook approaches – the potential for information loss during quantization, the sensitivity to initial codebook construction – demand further scrutiny. The adaptive capacity allocation, while effective, raises the question of optimal allocation strategies; simply reacting to popularity feels… insufficient. A truly elegant solution would anticipate item relevance, not merely respond to observed behavior.

Future work must address the interplay between these codebooks. Beyond the contrastive alignment objective, the current framework appears to treat them as largely independent. Yet, it stands to reason that semantic understanding should inform collaborative filtering, and vice versa. Exploring mechanisms for deeper cross-pollination – perhaps through attention mechanisms or shared latent spaces – could unlock further gains. Moreover, the generalization to truly novel items – those lacking both collaborative and semantic signals – remains a challenge. A system’s robustness is, after all, defined by its performance at the extremes.

Ultimately, the pursuit of generative recommendation isn’t about creating ever-more-complex algorithms. It’s about distilling the essence of human preference. A successful framework will not merely predict what a user might like, but why they might like it. Such understanding requires a holistic view, recognizing that a recommendation system is not an isolated entity, but a component of a larger information ecosystem.

Original article: https://arxiv.org/pdf/2511.20673.pdf

Contact the author: https://www.linkedin.com/in/avetisyan/
