Author: Denis Avetisyan
A new randomized controlled trial casts doubt on the effectiveness of large language model-powered chatbots for improving dietary habits and emotional well-being.

Rigorous testing demonstrates that integrating LLMs into a digital health intervention did not yield measurable improvements in real-world nutritional outcomes.
Despite increasing reliance on large language models (LLMs) for diverse applications, their real-world efficacy remains critically underexplored, particularly in domains demanding evidence-based interventions. This gap is addressed in ‘When LLMs Can’t Help: Real-World Evaluation of LLMs in Nutrition’, a randomized controlled trial investigating the impact of LLM-enhanced features within a nutrition chatbot. Surprisingly, the study found no consistent benefits in dietary outcomes or emotional well-being despite promising performance in intrinsic evaluations of the LLM features. These findings raise crucial questions about the translation of LLM capabilities into impactful, human-centered digital health interventions and highlight the need for rigorous extrinsic validation beyond isolated performance metrics.
The Illusion of Engagement: Why Diets Fail
Many digital health programs designed to improve dietary habits face a significant hurdle: limited long-term user engagement and subsequent non-adherence to recommended changes. While offering convenience and accessibility, these interventions frequently deliver static information and lack the personalized support needed to foster sustained behavioral shifts. Studies reveal that initial enthusiasm often wanes as users encounter the challenges of real-world implementation, leading to decreased app usage and a return to previous eating patterns. This phenomenon isn’t necessarily due to a lack of efficacy in the dietary advice itself, but rather a failure to maintain motivation and address individual needs throughout the change process, highlighting the importance of dynamic and responsive interventions.
The underlying problem is a disconnect between the intervention and the individual. Existing systems tend to deliver generalized feedback and cannot provide ongoing, personalized motivational support, a critical component of sustained behavioral change. Without encouragement and communication that adapt to each user’s challenges, preferences, and emotional state, active participation declines and the intervention loses its effect.
Achieving lasting dietary shifts demands more than simply delivering information; it necessitates communication that acknowledges individual needs and provides ongoing encouragement. Initial attempts at utilizing chatbots for dietary guidance often fell short because these programs lacked the capacity for empathetic responses and adaptive strategies. Early iterations typically provided static, pre-programmed feedback, failing to recognize the complex emotional and behavioral factors influencing food choices. Consequently, users often felt unsupported and disconnected, leading to diminished motivation and poor adherence. The capacity to tailor communication (responding to user frustration, celebrating small victories, and proactively addressing potential roadblocks) is crucial for fostering genuine behavioral change, and it proved a significant hurdle for early generations of automated dietary support systems.

Building a Smarter Mirror: The Chatbot Architecture
The chatbot architecture utilizes a rule-based system as its core for initial processing of user-submitted food diary data. This framework employs predefined rules and keywords to categorize food items, identify patterns in dietary habits, and extract basic nutritional information such as calorie counts and macronutrient composition. The rule-based component handles straightforward queries and provides immediate feedback, functioning as the primary interface for data intake and preliminary analysis before more complex requests are routed to the LLM-powered modules. This approach ensures consistent data handling and provides a stable foundation for personalized nutritional advice.
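A minimal sketch of what such a rule-based pass over a food diary might look like. The keyword table, the `DiaryEntry` fields, and the function names are all hypothetical; the article does not describe the study's actual rules.

```python
from dataclasses import dataclass

# Hypothetical keyword rules; the real rule set is not given in the article.
CATEGORY_KEYWORDS = {
    "fruit": ("apple", "banana", "berries"),
    "grain": ("bread", "rice", "oats"),
    "protein": ("chicken", "egg", "tofu"),
}


@dataclass
class DiaryEntry:
    """One logged food item (field names are assumptions)."""
    description: str
    calories: float
    carbs_g: float
    protein_g: float
    fat_g: float


def categorize(entry: DiaryEntry) -> str:
    """Return the first category whose keyword appears in the description."""
    text = entry.description.lower()
    for category, keywords in CATEGORY_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return category
    return "other"


def summarize(entries: list[DiaryEntry]) -> dict[str, float]:
    """Total calories and macronutrients across a set of entries."""
    return {
        "calories": sum(e.calories for e in entries),
        "carbs_g": sum(e.carbs_g for e in entries),
        "protein_g": sum(e.protein_g for e in entries),
        "fat_g": sum(e.fat_g for e in entries),
    }
```

Substring matching is deliberately crude here. Rules like these are predictable and easy to audit, which is the consistency the paragraph above attributes to the rule-based core.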
The chatbot incorporates a Rephrasing Module designed to improve user interaction through dynamic response variation. This module utilizes several large language models, specifically the Gemma 7B, Llama 3 8B, and Mistral 7B architectures. The module functions by receiving the initial response generated by the core chatbot logic and then re-articulating it using one of the selected LLMs. This process aims to mitigate repetitive chatbot outputs and present information in a more engaging and natural manner for the user, ultimately enhancing the overall user experience. Model selection within the Rephrasing Module can be adjusted to test performance and optimize for desired conversational qualities.
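The rephrasing step can be pictured as a thin wrapper around interchangeable model backends. In the sketch below, the prompt wording, the `Rephraser` type, and the fallback behavior are assumptions made for illustration; a real backend would call one of the models named above, whereas here it is just any function from prompt to text.

```python
from typing import Callable, Dict

# A backend is any callable that maps a prompt to generated text (assumption).
Rephraser = Callable[[str], str]

# Hypothetical prompt; the study's actual prompt is not given in the article.
PROMPT_TEMPLATE = (
    "Rephrase the following nutrition feedback in a friendly tone. "
    "Keep every number and recommendation unchanged.\n\n{message}"
)


def build_prompt(message: str) -> str:
    """Wrap the rule-based response in the rephrasing instruction."""
    return PROMPT_TEMPLATE.format(message=message)


def rephrase(message: str, backends: Dict[str, Rephraser], model: str) -> str:
    """Rephrase `message` with the selected backend.

    Falls back to the original rule-based message if the model is unknown
    or the backend returns only whitespace.
    """
    backend = backends.get(model)
    if backend is None:
        return message
    candidate = backend(build_prompt(message)).strip()
    return candidate or message
```

Keeping the backends in a dictionary makes it straightforward to swap models for comparison, and the fallback means a failed generation never leaves the user without feedback.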
The Nutritional Counselling Model is a distinct component of the chatbot designed to address user challenges with their diet. This model has been specifically fine-tuned using the HAI-Coaching Dataset, a corpus of conversational data focused on health and wellness coaching. This fine-tuning process allows the model to generate responses characterized by empathy and support, moving beyond purely informational replies to acknowledge and validate user struggles. The model’s outputs are intended to foster a more positive and constructive user experience when discussing potentially sensitive topics related to food intake and dietary habits.

Testing the Waters: A Randomized Controlled Trial
A randomized controlled trial was implemented to quantitatively evaluate the impact of an LLM-enhanced chatbot on user outcomes relative to standard intervention approaches. Participants were randomly assigned to one of three groups: a BASELINE group receiving standard support, a REPHRASED group receiving support with chatbot-generated rephrased feedback, and a FULL group receiving the complete LLM-enhanced chatbot intervention. Data collection involved self-reported dietary intake via the MyFitnessPal application, alongside a validated psychological assessment, the Positive and Negative Affect Schedule (PANAS), to measure emotional states. The trial design allowed for a comparative analysis of key metrics including dietary adherence, user engagement with the chatbot, and changes in both positive and negative affect, with statistical analysis employed to determine the significance of any observed differences between the groups.
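Three-arm random assignment can be sketched in a few lines. The article does not say how allocation was carried out, so the shuffle-then-round-robin scheme, the fixed seed, and the function name below are assumptions; only the arm names come from the text.

```python
import random

ARMS = ("BASELINE", "REPHRASED", "FULL")


def assign_arms(participant_ids, seed=0):
    """Randomly assign participants to the three arms in near-equal numbers.

    Participants are shuffled with a seeded generator (so the allocation
    can be reproduced) and then dealt to the arms in turn.
    """
    rng = random.Random(seed)
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    return {pid: ARMS[i % len(ARMS)] for i, pid in enumerate(shuffled)}
```

Dealing from a shuffled list keeps group sizes within one participant of each other, which simple per-participant coin flips would not guarantee.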
Study participants utilized the MyFitnessPal application to record their daily food consumption. Data logged through MyFitnessPal served as the basis for personalized feedback and support delivered by the LLM-enhanced chatbot. This feedback was designed to address individual dietary patterns and encourage adherence to pre-defined intake goals. The system’s ability to analyze self-reported food logs and generate tailored responses formed the core intervention strategy for evaluating the chatbot’s effectiveness in promoting dietary change and improving user well-being.
The randomized controlled trial assessed the LLM-enhanced chatbot’s impact on several key metrics: dietary adherence, user engagement with the application, and emotional well-being. Dietary adherence was evaluated through logged food intake, while user engagement was tracked via chatbot usage. Emotional well-being was quantified using the Positive and Negative Affect Schedule (PANAS Questionnaire). Statistical analysis revealed no significant improvements in any of these metrics across the intervention groups when compared to standard interventions, indicating that the integration of LLM-powered features did not demonstrably improve dietary outcomes, emotional state, or application engagement.
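For readers unfamiliar with PANAS: the standard 20-item form asks respondents to rate ten positive and ten negative mood adjectives from 1 to 5, and each subscale is the sum of its ten ratings, giving a range of 10 to 50. The sketch below scores one questionnaire under that standard scheme; whether the study used exactly this form and scoring is an assumption, and the function name is hypothetical.

```python
# Adjectives of the standard 20-item PANAS, grouped by subscale.
POSITIVE_ITEMS = (
    "interested", "excited", "strong", "enthusiastic", "proud",
    "alert", "inspired", "determined", "attentive", "active",
)
NEGATIVE_ITEMS = (
    "distressed", "upset", "guilty", "scared", "hostile",
    "irritable", "ashamed", "nervous", "jittery", "afraid",
)


def score_panas(responses: dict[str, int]) -> tuple[int, int]:
    """Return (positive_affect, negative_affect) for one questionnaire.

    `responses` maps each adjective to a rating from 1 to 5.
    Raises ValueError if an item is missing or a rating is out of range.
    """
    for item in POSITIVE_ITEMS + NEGATIVE_ITEMS:
        rating = responses.get(item)
        if rating is None or not 1 <= rating <= 5:
            raise ValueError(f"missing or invalid rating for {item!r}")
    positive = sum(responses[item] for item in POSITIVE_ITEMS)
    negative = sum(responses[item] for item in NEGATIVE_ITEMS)
    return positive, negative
```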
Analysis of collected data revealed no statistically significant differences in key metrics between the BASELINE, REPHRASED, and FULL intervention groups. While fluctuations were observed in participant scores on the Positive Affect scale, these did not translate into overall improvement across all groups; Negative Affect scores remained statistically unchanged. Dietary adherence, as measured by the Absolute Percentage Distance from MyFitnessPal Intake Goals, showed no significant variation between groups. A limited, statistically significant improvement in Carbohydrate Adherence was noted only within the ‘FULL’ group when compared to the ‘REPHRASED’ group; however, this finding was narrow in scope. Furthermore, the number of days participants actively used the chatbot did not differ significantly across the three groups.
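The article names the adherence metric but does not define it. One natural reading, taken here as an assumption, is the absolute difference between logged intake and the goal, expressed as a percentage of the goal, so that lower values mean closer adherence. The function names are hypothetical.

```python
from statistics import fmean


def absolute_percentage_distance(intake: float, goal: float) -> float:
    """Absolute distance of intake from goal, as a percentage of the goal.

    Assumed definition: 100 * |intake - goal| / goal. Zero means the goal
    was met exactly; over- and under-eating count equally.
    """
    if goal <= 0:
        raise ValueError("goal must be positive")
    return 100 * abs(intake - goal) / goal


def mean_distance(daily_intakes: list[float], goal: float) -> float:
    """Average absolute percentage distance over the logged days."""
    return fmean(absolute_percentage_distance(x, goal) for x in daily_intakes)
```

For example, against a 2000 kcal goal, logged days of 1800, 2000, and 2400 kcal sit 10%, 0%, and 20% from the goal, for a mean distance of 10%. The same calculation applies to any single nutrient, such as carbohydrates.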

The Illusion of Progress: Impact and Future Directions
The premise behind the intervention is that personalized communication, powered by large language models (LLMs), should improve health outcomes: empathetic, tailored messaging that adapts to user needs and emotional states is expected to strengthen engagement and, with it, dietary adherence, the extent to which individuals follow recommended eating plans. The trial does not support that premise. Dietary adherence and chatbot usage did not differ significantly between groups, and the one significant effect, better carbohydrate adherence in the FULL group than in the REPHRASED group, was narrow in scope.
The same holds for emotional states. Positive affect (feelings of joy, contentment, and optimism) fluctuated but did not improve overall, and negative affect (emotions like sadness, anxiety, and frustration) remained statistically unchanged. Empathetic wording, at least as implemented here, altered neither behavior nor psychological well-being in a measurable way, which suggests that supportive language on its own is not enough to produce sustainable lifestyle changes.
Ongoing development prioritizes expanding the reach of this technology through integration with existing digital health ecosystems. Researchers are actively investigating methods to connect the empathetic communication framework with a broader range of tools, including wearable sensors and telehealth platforms, to provide a more comprehensive view of individual health. A key component of this scaling effort involves delivery via widely used messaging applications such as Telegram, chosen for its accessibility and established user base. This approach aims to circumvent traditional barriers to healthcare access by offering continuous, personalized support directly within the user’s preferred communication channels.
The study highlights a predictable outcome: deploying a complex system, in this case an LLM-powered nutrition chatbot, doesn’t inherently translate to improved results. The researchers showed that sophisticated technology had no demonstrable impact on dietary habits or emotional wellness. It echoes a fundamental truth: intervention design, not sheer technological novelty, dictates success. A line attributed to Blaise Pascal puts it this way: “The eloquence of the tongue profits nothing without the wisdom of the heart.” The chatbot possessed the ‘eloquence’ of advanced language processing but lacked the ‘wisdom’, the understanding of human behavior and effective intervention strategies, needed to meaningfully change outcomes. The focus remains, as it always will, on addressing core behavioral challenges, not layering technology on top of unresolved problems.
So, What’s Next?
The enthusiasm for deploying large language models as panaceas for behavioral health issues appears, predictably, to have encountered reality. This study demonstrates that simply adding a sophisticated chatbot interface doesn’t magically translate into improved dietary habits or emotional states. One suspects the participants saw through the algorithmic empathy, or perhaps just remained stubbornly committed to their preferred snacks. It’s a familiar pattern: a promising technology arrives, initial trials spark hope, then production exposes the gulf between theoretical capability and actual usability.
Future work will likely focus on increasingly complex prompting strategies, attempting to coax more effective interventions from these models. However, the core problem remains: LLMs excel at generating text, not at understanding the messy, irrational, and deeply personal nature of human behavior. The field will likely cycle through iterations of ‘better models, better prompts, better metrics’ until someone remembers that actual human interaction – with all its imperfections – often proves more effective.
Ultimately, this feels less like a failure of LLMs and more like a reminder that technology rarely solves fundamentally human problems. The next breakthrough won’t be a smarter algorithm, but a more honest assessment of what these tools can – and cannot – achieve. Everything new is just the old thing with worse documentation, and a more complicated API.
Original article: https://arxiv.org/pdf/2511.20652.pdf
Contact the author: https://www.linkedin.com/in/avetisyan/
2025-11-29 12:51