Author: Denis Avetisyan
A new system offers a privacy-focused look at real-world conversations with large language models, revealing how users interact with and are affected by these powerful tools.

AI-Wrapped provides a privacy-preserving framework for longitudinal analysis of human-AI interactions, focusing on bidirectional alignment and identifying potential risks in naturalistic settings.
Understanding how large language models (LLMs) are used in everyday life is crucial for ensuring their beneficial alignment, yet accessing naturalistic interaction data remains a significant challenge due to privacy concerns and platform restrictions. This paper introduces ‘AI-Wrapped: Participatory, Privacy-Preserving Measurement of Longitudinal LLM Use In-the-Wild’, a novel workflow enabling the collection and analysis of LLM chat histories while immediately providing participants with personalized usage reports. Initial findings from a deployment with [latex]\mathcal{N}=82[/latex] users reveal diverse engagement patterns, from instrumental tasks to reflective explorations, along with potential indicators of over-reliance, raising the question of how to build measurement infrastructure that fosters both rigorous research and user trust in the age of increasingly personalized AI.
Deconstructing Interaction: Observing Human-AI Dialogue
Existing research into how people interact with artificial intelligence often takes place in highly structured laboratory environments, providing limited insight into genuine, everyday usage. These controlled settings, while useful for isolating specific variables, frequently fail to capture the spontaneity, ambiguity, and contextual richness inherent in natural conversations. Consequently, a significant gap exists between observed behavior in these artificial scenarios and how individuals actually engage with large language models in their daily lives – from seeking information and brainstorming ideas to composing emails and simply chatting. This disconnect hinders the development of truly intuitive and effective AI systems, as design choices are often based on idealized interactions rather than the messy, nuanced reality of human-AI collaboration.
A comprehensive understanding of human-AI interaction requires moving beyond the constraints of laboratory settings and examining how individuals genuinely engage with large language models in their daily lives. Analyses of naturally occurring conversations reveal the subtle ways people formulate requests, negotiate meaning, and adapt their communication strategies when interacting with AI. These real-world exchanges highlight the diverse and often unarticulated needs users bring to these interactions – from seeking emotional support and creative inspiration to performing complex problem-solving and information gathering. Capturing these nuances is vital because it moves beyond simply assessing whether an AI provides a correct answer, and instead focuses on how people attempt to achieve their goals through conversation with an AI – informing the design of more intuitive, effective, and genuinely helpful language models.
To move past the artificial scenarios that dominate existing research, a comprehensive study was undertaken, replacing controlled experiments with analysis of naturally occurring interactions. Data from 82 participants were collected, representing a substantial corpus of real-world engagement with large language models: each user averaged 591 distinct conversations and contributed an average of 3,555 messages. This large-scale analysis allows for a nuanced understanding of how individuals actually utilize these AI tools, revealing patterns and complexities that would be impossible to discern through traditional, smaller-scale methods. The sheer volume of conversational data enables researchers to move beyond surface-level observations and identify subtle shifts in user behavior, ultimately providing a more accurate and insightful depiction of the human-AI dynamic.

Preserving Privacy: A Framework for Data Extraction
AI-Wrapped establishes a workflow for collecting and analyzing user chat logs generated during interactions with Large Language Models (LLMs). This process is designed with a primary focus on user privacy, achieved through the implementation of Personally Identifiable Information (PII) removal techniques. Collected logs undergo analysis to extract patterns and insights, but prior to this, the system actively identifies and redacts sensitive data such as names, addresses, and financial details. This PII removal step is critical for ensuring compliance with data protection regulations and maintaining user trust by preventing the exposure of private information during analysis and subsequent data processing.
The AI-Wrapped system employs spaCy, an industrial-strength natural language processing library, for the detection of Personally Identifiable Information (PII) within user chat logs. spaCy’s capabilities include named entity recognition, part-of-speech tagging, and dependency parsing, enabling it to identify a wide range of PII types such as names, addresses, dates, and financial details with high accuracy. This automated PII detection supports compliance with data privacy regulations like GDPR and CCPA, and builds user trust by minimizing the risk of sensitive information being exposed during analysis. The system is configured to redact or anonymize detected PII before data is processed, ensuring that only non-identifiable information is used for generating insights and supporting human-AI alignment studies.
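The redaction step can be illustrated with a minimal sketch that substitutes typed placeholders for detected spans. It uses hand-written regular expressions as a simplified stand-in for spaCy’s named entity recognition; the patterns and labels below are illustrative assumptions, not the paper’s implementation:

```python
import re

# Simplified stand-in for NER-based PII redaction. A real pipeline would use
# spaCy's entity recognizer; these regex patterns are illustrative only.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "PHONE": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
    "CARD":  re.compile(r"\b(?:\d[ -]?){13,16}\b"),
}

def redact(text: str) -> str:
    """Replace each detected PII span with a typed placeholder like [EMAIL]."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

print(redact("Reach me at jane.doe@example.com or +1 (555) 123-4567."))
# Reach me at [EMAIL] or [PHONE].
```

Regular expressions alone cannot reliably catch names or locations, which is why a trained NER model is the sensible choice for the real system.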
AI-Wrapped utilizes the generation of ‘Facets’ to provide a structured summarization of user interactions with LLM assistants. This process leverages pre-defined summaries focusing on specific data points, such as frequently requested topics, common task types initiated by the user, or the prevalence of specific keywords within chat logs. This process moves beyond raw data analysis by aggregating information into readily interpretable units, allowing for efficient identification of usage patterns and areas of potential interest for both product improvement and user understanding. The pre-defined nature of Facets ensures consistency and comparability across user datasets, facilitating large-scale analysis and trend identification.
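A facet, in this sense, is just a structured roll-up of per-conversation annotations. The sketch below assumes each conversation already carries labels from an upstream classifier; the field names and label values are invented for illustration, as the paper does not specify its facet schema:

```python
from collections import Counter

# Toy per-conversation annotations (assumed to come from an upstream
# classifier; the labels here are invented for illustration).
conversations = [
    {"task_type": "coding", "topic": "python"},
    {"task_type": "writing", "topic": "email"},
    {"task_type": "coding", "topic": "debugging"},
]

def build_facet(conversations, key):
    """Summarize one facet (e.g. task type) as per-user frequency counts."""
    return Counter(c[key] for c in conversations)

print(build_facet(conversations, "task_type"))
```

Because every user’s data is reduced to the same fixed set of counters, facets stay comparable across participants, which is what enables the large-scale trend analysis described above.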
Bidirectional Human-AI Alignment within the AI-Wrapped workflow is achieved through the analysis of chat logs to determine reciprocal influence. Specifically, the system identifies how the AI’s responses shape user queries and behavior (for example, by noting instances where an AI steers a conversation toward specific topics) and, conversely, how user prompts and feedback modify the AI’s subsequent responses. This analysis isn’t limited to explicit feedback; implicit signals, such as prompt phrasing or topic selection, are also considered to create a comprehensive understanding of the interaction’s dynamics. The resulting data informs strategies to mitigate unwanted AI influence and enhance the AI’s responsiveness to user needs, fostering a more balanced and mutually beneficial relationship.
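One crude proxy for the AI-to-user direction of influence is lexical carry-over: how much of the AI’s vocabulary the user adopts in their next prompt. The toy measure below is an illustrative assumption, not the system’s actual analysis:

```python
def lexical_carryover(ai_turn: str, next_user_turn: str) -> float:
    """Jaccard overlap between an AI turn and the user's following prompt,
    a rough proxy for how much the user picks up terms the AI introduced."""
    a = set(ai_turn.lower().split())
    b = set(next_user_turn.lower().split())
    return len(a & b) / len(a | b) if a | b else 0.0

print(lexical_carryover("try binary search here", "how does binary search work"))
```

A rising carry-over score across a session would be one implicit signal, in the sense described above, that the AI is steering the conversation’s vocabulary and framing.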
Decoding Communication: Extracting Behavioral Signatures
AI-Wrapped employs GPT-5 to construct detailed user profiles, referred to as ‘Facets’. This process leverages the model’s natural language processing capabilities to analyze user communications and extract relevant data points. GPT-5 identifies key characteristics, preferences, and behavioral patterns exhibited within user interactions, forming the basis for each individual Facet. These Facets are not simply demographic summaries; they represent a dynamic, evolving understanding of the user as revealed through their communication within the AI-Wrapped platform, and serve as the foundation for subsequent analytical processes such as topic extraction and communication style analysis.
Facets utilize Topic Extraction to identify the primary subjects discussed within user communications, categorizing conversations based on recurring themes and keywords. This process involves natural language processing techniques to distill the core content of messages. Complementing this, Communication Style Analysis assesses how users convey information, examining characteristics such as sentiment, formality, and complexity of language. These analyses consider linguistic features including vocabulary choice, sentence structure, and the presence of specific rhetorical devices to determine a user’s characteristic communication patterns, allowing for a nuanced understanding beyond simple content identification.
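A deliberately simplified sketch of both analyses follows, assuming word frequency stands in for topic extraction and a few surface counts stand in for style features; the real system’s features are not specified at this level of detail:

```python
import re
from collections import Counter

STOPWORDS = {"the", "a", "an", "to", "of", "and", "is", "i", "you", "it", "my"}

def extract_topics(messages, top_n=3):
    """Crude topic extraction: the most frequent non-stopword tokens."""
    tokens = [w for m in messages for w in re.findall(r"[a-z']+", m.lower())
              if w not in STOPWORDS]
    return [word for word, _ in Counter(tokens).most_common(top_n)]

def style_features(message):
    """Toy communication-style features: word length plus punctuation counts."""
    words = message.split()
    return {
        "avg_word_len": sum(len(w) for w in words) / max(len(words), 1),
        "exclaims": message.count("!"),
        "questions": message.count("?"),
    }

print(extract_topics(["fix my python bug", "python list question"], top_n=1))
print(style_features("Is this right?"))
```

In practice, topic extraction would use richer NLP techniques than token counting, but the shape of the output, a small set of per-user descriptors, is the same.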
User clustering within AI-Wrapped utilizes the Qwen3-Embedding-8B model to create groupings based on observed interaction patterns. This process transforms user communications into vector embeddings, representing each user’s communication style and thematic focus in a multi-dimensional space. The model then applies algorithms to identify users with similar embedding vectors, effectively grouping them based on shared characteristics. This allows for the identification of commonalities – such as frequently discussed topics or preferred communication styles – as well as divergences, highlighting variations in how different user segments interact with the AI system. The resulting clusters enable a granular understanding of user behavior and facilitate targeted analysis of specific interaction trends.
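The clustering step can be illustrated with a toy “leader” clustering over stand-in 2-D vectors. The actual system embeds users with Qwen3-Embedding-8B into a much higher-dimensional space, and the paper does not name its clustering algorithm, so the procedure below is an assumption for illustration:

```python
import math

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def greedy_cluster(embeddings, threshold=0.8):
    """Leader clustering: join the first cluster whose leader (first member)
    is similar enough; otherwise start a new cluster."""
    clusters = []  # each cluster is a list of indices into embeddings
    for i, vec in enumerate(embeddings):
        for cluster in clusters:
            if cosine(vec, embeddings[cluster[0]]) >= threshold:
                cluster.append(i)
                break
        else:
            clusters.append([i])
    return clusters

users = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
print(greedy_cluster(users))  # [[0, 1], [2]]
```

Whatever algorithm is used, the essential move is the same: users whose embedding vectors point in similar directions land in the same group, making shared topics and styles visible at the cluster level.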
Analysis of user communication patterns within AI-Wrapped revealed several potentially problematic trends. Specifically, 73.2% of participants demonstrated patterns indicative of over-reliance on the AI system, while 62.2% exhibited behaviors associated with perfectionism. Furthermore, a significant majority – 75.6% – engaged in discussions centering on existential and emotional themes, suggesting a potential for increased vulnerability or dependence on the AI for emotional support. These findings, identified through Topic Extraction and Communication Style Analysis, highlight the need for ongoing monitoring of user interactions to mitigate potential harms.
![Prevalence of red flag clusters varies across demographic subgroups, with only those containing at least 10 individuals ([latex]n \geq 10[/latex]) displayed.](https://arxiv.org/html/2602.18415v1/x9.png)
Toward Empathetic Systems: Proactive Support and Alignment
The system demonstrated an ability to identify concerning patterns in user interactions, with a particular focus on detecting tendencies towards perfectionism. This ‘red flag’ detection isn’t about diagnosing problems, but rather pinpointing areas where a user might benefit from empathetic support. Analysis of conversational data revealed that such patterns often correlate with increased stress or anxiety, suggesting an opportunity for the AI to proactively offer resources or simply acknowledge the user’s striving nature. By recognizing these subtle cues, the AI can move beyond task completion and begin to foster a more supportive and understanding relationship with the user, potentially mitigating negative emotional impacts before they escalate.
The ability of AI-Wrapped to discern subtle patterns in user communication opens new avenues for crafting truly empathetic AI assistants. By pinpointing indicators – such as tendencies towards perfectionism – the system doesn’t merely react to stated needs, but proactively anticipates potential emotional challenges. This allows for the development of AI capable of offering tailored support, shifting the focus from task completion to genuine emotional responsiveness. The insights gleaned from pattern identification directly inform the design of more nuanced conversational algorithms, enabling AI to offer encouragement, perspective, or simply acknowledge a user’s underlying anxieties – ultimately fostering a more positive and supportive human-AI interaction.
The emerging paradigm in AI-driven support systems shifts the focus from reactive responses to proactive anticipation of user needs. Rather than simply fulfilling requests, these systems aim to identify underlying emotional states and offer tailored assistance before it is explicitly asked for. A recent study demonstrated a strong basis for this approach, with 76.8% of participants exhibiting ethical and collaborative behavior when interacting with the AI. This positive user engagement suggests a fertile ground for integrating AI not just as a tool, but as a supportive companion capable of recognizing subtle cues and providing preemptive, personalized assistance – a move towards a truly empathetic and beneficial human-AI relationship.
Analysis of user interactions reveals a compelling trend: heavy users of AI-assisted platforms frequently engage with existential themes, which occur in a remarkable 93.8% of their conversations. This data underscores a significant, often overlooked aspect of human-computer interaction: the inclination to explore profound questions of meaning, purpose, and existence even with artificial intelligence. It suggests these platforms are becoming more than just tools for task completion; they are evolving into spaces where individuals seek reflection and, potentially, emotional resonance. Consequently, the development of AI capable of navigating these complex discussions with sensitivity and understanding is crucial, demanding a move beyond purely functional responses toward a nuanced capacity for empathetic engagement with deeply personal human concerns.
The pursuit of understanding longitudinal LLM use, as detailed in this work, inherently demands a focus on what remains invariant as ‘N’ – the number of interactions, users, or time periods – approaches infinity. Andrey Kolmogorov observed, “The essence of mathematics lies in its ability to discover and express universal truths with absolute precision.” This sentiment directly echoes the goals of AI-Wrapped: to establish a system capable of yielding robust, generalizable insights into human-AI alignment. By prioritizing privacy-preserving data collection and analysis, the system aims to reveal consistent patterns in user behavior and emotional engagement – patterns that hold true regardless of scale. The emphasis on bidirectional alignment, a core concept of this research, necessitates identifying those invariant characteristics that define a successful and safe interaction over time.
What’s Next?
The presented work, while a commendable attempt to illuminate the black box of longitudinal LLM interaction, merely scratches the surface of a profoundly difficult problem. The system correctly identifies the need for naturalistic data, moving beyond contrived benchmarks. However, the inherent messiness of such data demands more than descriptive statistics; it requires formal methods. If patterns emerge from the ‘wild,’ they must be proven, not simply observed. The current focus on emotional engagement, while intuitively appealing, risks mistaking correlation for causation, a perennial failing in the softer sciences.
Future iterations should prioritize the development of provable invariants within these conversational systems. If a model consistently exhibits a particular bias or vulnerability, it should be mathematically demonstrable, not merely statistically likely. The ‘AI alignment’ problem is, at its core, a problem of formal verification. One suspects that much of the observed ‘risk’ is simply a consequence of poorly specified objectives.
Ultimately, the true measure of success will not be the volume of chat logs collected, but the ability to construct models of LLM behavior that are both predictive and provably safe. If it feels like magic, if patterns are discovered without a clear underlying principle, then the invariant remains hidden and the illusion of understanding persists. The challenge, therefore, is not simply to observe what LLMs do, but to rigorously determine what they must do.
Original article: https://arxiv.org/pdf/2602.18415.pdf
Contact the author: https://www.linkedin.com/in/avetisyan/
2026-02-24 07:15