Author: Denis Avetisyan
Researchers detail a new protocol for collecting data on human-robot interactions, specifically designed to understand and address social anxiety.

This work presents a multimodal dataset collection methodology using a Wizard-of-Oz approach to facilitate the development of affect-adaptive robots for mental healthcare.
Despite growing interest in leveraging robotics for mental healthcare, a significant barrier remains in accurately recognizing and responding to nuanced affective states like social anxiety. This paper, ‘Towards Affect-Adaptive Human-Robot Interaction: A Protocol for Multimodal Dataset Collection on Social Anxiety’, details a protocol for building a multimodal dataset designed to capture physiological, behavioral, and contextual signals associated with social anxiety during human-robot interaction. The proposed dataset, collected from at least 70 participants engaging in Wizard-of-Oz scenarios with the Furhat robot, aims to provide a rich resource for developing robust affect recognition algorithms. Will this data pave the way for truly empathetic and adaptive robots capable of providing personalized support for individuals experiencing social anxiety?
Deconstructing the Anxious Spectrum
Social anxiety represents a widespread condition, extending far beyond simple shyness and significantly impacting individuals across a remarkably diverse range of social situations. Its prevalence is notable, affecting people from all walks of life – students navigating classrooms, professionals in workplace settings, and individuals engaging in everyday interactions. Importantly, the condition doesn’t present as a monolithic entity; rather, it exists on a spectrum, with some experiencing mild discomfort and avoidance, while others face debilitating fear and substantial impairment in their daily functioning. This variability in severity underscores the need for nuanced understanding and tailored interventions, recognizing that the experience of social anxiety is uniquely personal and context-dependent, influencing not only how a person feels, but also their behavioral responses and overall quality of life.
Evaluating social anxiety demands a comprehensive strategy, as the condition isn’t solely defined by reported feelings of distress. Researchers now understand that a person’s subjective experience – their internal narrative and perceived threat – is deeply interwoven with physiological responses, such as increased heart rate, sweating, and cortisol levels. Simultaneously, observable behavioral manifestations – like avoidance of eye contact, difficulty initiating conversations, or trembling – provide crucial data. Effective assessment, therefore, integrates these three dimensions, moving beyond self-reported symptoms to capture the complex interplay between body, mind, and outward behavior. This multifaceted approach allows for a more nuanced understanding of the specific ways social anxiety impacts an individual, paving the way for targeted and effective interventions.
Current methods for evaluating social anxiety frequently depend on individuals detailing their own experiences, a practice inherently susceptible to reporting biases and recall inaccuracies. These self-assessments, while valuable, often struggle to capture the subtle, moment-to-moment fluctuations in anxiety that occur during actual social encounters. Because anxiety manifests not just in conscious thought, but also through physiological responses – such as increased heart rate or subtle facial expressions – relying solely on subjective reports can provide an incomplete picture. Researchers are increasingly exploring supplementary techniques, like behavioral observation and physiological monitoring during simulated social situations, to gain a more comprehensive understanding of how social anxiety unfolds in real-time and to circumvent the limitations of self-reported data.
Unveiling Signals: A Multimodal Approach
The dataset utilized for this research combined physiological signals with observed behaviors to provide a comprehensive view of participant responses. Specifically, data was gathered using the Empatica EmbracePlus, a wearable sensor that captures heart rate variability, electrodermal activity, and body temperature. Simultaneously, behavioral data was collected during interactions with a robotic platform, focusing on observable actions and responses. This integration of physiological and behavioral streams created a multimodal dataset allowing for the analysis of correlated signals indicative of participant states.
The multimodal data collection strategy enabled the capture of multiple anxiety indicators beyond self-reporting. Specifically, heart rate variability (HRV), measured via the Empatica EmbracePlus, provided insights into autonomic nervous system activity associated with anxiety levels. Concurrent analysis of facial expressions, captured through video recordings, identified visual cues such as brow furrowing or lip corner depression. Furthermore, verbal responses, including speech rate, pauses, and sentiment analysis of spoken content, were incorporated to assess linguistic indicators of anxious states. The integration of these physiological and behavioral data streams facilitated a comprehensive assessment of anxiety expression.
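The article does not specify which HRV features the protocol extracts from the EmbracePlus stream. As a minimal sketch, assuming access to a sequence of inter-beat intervals in milliseconds, two standard time-domain HRV features (RMSSD and SDNN) could be computed like this; the function names and example values are illustrative, not from the paper:

```python
import math

def rmssd(ibi_ms):
    """Root mean square of successive differences between inter-beat
    intervals (ms): a standard time-domain HRV feature. Lower values
    are commonly associated with higher autonomic arousal."""
    diffs = [b - a for a, b in zip(ibi_ms, ibi_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))

def sdnn(ibi_ms):
    """Sample standard deviation of inter-beat intervals (ms)."""
    mean = sum(ibi_ms) / len(ibi_ms)
    return math.sqrt(sum((x - mean) ** 2 for x in ibi_ms) / (len(ibi_ms) - 1))

# Illustrative inter-beat intervals in milliseconds
ibis = [812, 790, 805, 778, 760, 795, 810]
print(f"RMSSD: {rmssd(ibis):.1f} ms, SDNN: {sdnn(ibis):.1f} ms")
```

In practice such features would be computed over sliding windows aligned with the interaction timeline, so that changes in HRV can be related to specific social stimuli.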
The integration of physiological data, specifically heart rate variability as measured by the Empatica EmbracePlus, with behavioral data – encompassing facial expressions and verbal responses during robotic interaction – was undertaken to enhance the sensitivity and objectivity of social anxiety assessment. Traditional methods often rely on self-reporting, which is subject to bias. Combining these data streams enables the capture of anxiety’s nuanced, temporal characteristics as they unfold during a social stimulus, allowing for real-time analysis and potentially revealing subtle indicators of distress not captured by static assessments. This multimodal approach moves beyond categorical diagnoses toward a continuous measurement of anxiety’s dynamic expression, facilitating a more granular understanding of individual responses.
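The paper does not describe its alignment pipeline, but integrating streams sampled at different rates requires mapping physiological samples onto behaviorally annotated time windows. A minimal sketch, with hypothetical sample rates, event labels, and values:

```python
from bisect import bisect_left

def window_mean(timestamps, values, t_start, t_end):
    """Mean of a physiological signal within [t_start, t_end).
    Assumes `timestamps` is sorted; returns None for empty windows."""
    i = bisect_left(timestamps, t_start)
    j = bisect_left(timestamps, t_end)
    window = values[i:j]
    return sum(window) / len(window) if window else None

# Hypothetical EDA stream sampled at 4 Hz (timestamps in seconds)
eda_t = [i * 0.25 for i in range(40)]          # 0.0 .. 9.75 s
eda_v = [0.30 + 0.02 * i for i in range(40)]   # slowly rising skin conductance

# Hypothetical behavioral annotations: (label, start, end)
events = [("robot_greets", 1.0, 3.0), ("robot_rejects", 5.0, 8.0)]

for label, t0, t1 in events:
    print(label, round(window_mean(eda_t, eda_v, t0, t1), 3))
```

The same windowing step could be applied to HRV, facial-expression scores, and speech features, yielding one aligned feature vector per annotated social cue.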
The Robotic Mirror: Simulating Social Challenges
The Furhat robot was employed as a means of presenting consistent social stimuli to participants, specifically designed to elicit responses to scenarios involving social rejection and judgment. This platform allowed for the standardized delivery of these stimuli, controlling for variations in delivery that might occur with human interaction. The robot’s capabilities enabled the presentation of nuanced social cues, ranging from subtle nonverbal behaviors to direct statements of disapproval or exclusion. These interactions were pre-programmed to ensure each participant experienced the same set of social challenges, facilitating a controlled experimental environment for analyzing behavioral and physiological responses.
The Wizard-of-Oz (WoZ) technique was implemented to facilitate precise control over the Furhat robot’s behavioral responses during social interactions with participants. This involved human operators remotely controlling the robot’s facial expressions, speech prosody, and verbal responses in real-time. By utilizing WoZ, researchers could standardize the delivery of social stimuli – including instances of rejection or judgment – ensuring consistency across all participant interactions. This method allowed for the introduction of nuanced social cues, beyond pre-programmed responses, enabling a more ecologically valid assessment of participant reactions while maintaining experimental control over the independent variables.
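The article does not detail the operator interface, but a Wizard-of-Oz setup typically maps operator inputs to a fixed script so every participant receives identical stimuli. A minimal sketch with a hypothetical keymap and a stub transport callback (in a real deployment, `send` would wrap the robot's remote API):

```python
# Hypothetical WoZ keymap: each operator key triggers a pre-scripted
# robot action so every participant hears identical stimuli.
SCRIPT = {
    "1": ("say", "I'm not sure that was a good answer."),   # judgment cue
    "2": ("say", "Let's move on to someone else."),          # rejection cue
    "3": ("gesture", "frown"),
    "4": ("gesture", "nod"),
}

def dispatch(key, send):
    """Look up the scripted action for an operator keypress and pass it
    to `send`, a transport callback. Unknown keys are ignored so the
    operator cannot produce unscripted output by mistake."""
    action = SCRIPT.get(key)
    if action is None:
        return None
    send(*action)
    return action

log = []
dispatch("2", lambda kind, payload: log.append((kind, payload)))
print(log)  # [('say', "Let's move on to someone else.")]
```

Restricting the operator to a closed action set is what preserves experimental control while still allowing human timing of the robot's responses.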
Participant responses to the standardized social stimuli delivered by the Furhat robot were captured through multiple data streams, including behavioral observations and self-report measures. These responses were time-stamped and categorized based on the specific social cue presented – instances of rejection or judgment – enabling a detailed analysis of the temporal relationship between stimulus and reaction. The resulting dataset includes quantifiable metrics of anxiety, such as scores from the Liebowitz Social Anxiety Scale (LSAS) and the Social Interaction Anxiety Scale (SIAS), correlated with specific behavioral responses observed during the interaction. This rich dataset, collected from over 70 participants, allows for the statistical examination of how individual differences in personality, as measured by the Big Five Inventory (BFI-10), and co-occurring depressive symptoms, assessed via the Beck Depression Inventory (BDI), moderate the relationship between social cues and anxiety levels.
Alongside behavioral data collected during robotic interactions, participant personality traits were assessed utilizing the Big Five Inventory (BFI-10), a 10-item questionnaire measuring Openness, Conscientiousness, Extraversion, Agreeableness, and Neuroticism. Concurrent depressive symptoms were evaluated using the Beck Depression Inventory (BDI), a 21-item self-report measure of depressive symptom severity. Data from over 70 participants was compiled to allow for correlational analyses between personality traits, depressive symptom levels, and responses to standardized social stimuli delivered by the robotic platform, enabling a more nuanced understanding of individual differences in social anxiety.
Data collection incorporated five established questionnaires to characterize the participant cohort and assess relevant psychological traits. The Liebowitz Social Anxiety Scale (LSAS) and the Social Interaction Anxiety Scale (SIAS) were used to measure social anxiety levels. Depressive symptoms were evaluated using the Beck Depression Inventory (BDI). The Positive and Negative Affect Schedule (PANAS) quantified participants’ current emotional states, and the Big Five Inventory-10 (BFI-10) assessed personality traits based on the five major dimensions: Openness, Conscientiousness, Extraversion, Agreeableness, and Neuroticism. These questionnaires provided a comprehensive baseline dataset for correlating personality, affect, and anxiety with responses to robotic social stimuli.
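Scoring these instruments generally means summing Likert responses while reverse-keying certain items (the BFI-10, for example, includes reverse-keyed items). A generic sketch; the item positions and responses below are placeholders, since each instrument defines its own reverse-keyed set:

```python
def score_likert(responses, scale_max, reverse_items=()):
    """Sum a Likert-type questionnaire, reverse-keying the listed
    1-based item positions. `scale_max` is the top of the response
    scale (e.g. 5 for a 1-5 scale); a reverse-keyed response r
    becomes (scale_max + 1 - r)."""
    total = 0
    for idx, r in enumerate(responses, start=1):
        total += (scale_max + 1 - r) if idx in reverse_items else r
    return total

# Illustrative only: ten 1-5 responses; which items are reverse-keyed
# differs per instrument, so (1, 3) here is a placeholder.
answers = [2, 4, 5, 3, 1, 4, 2, 5, 3, 4]
print(score_likert(answers, scale_max=5, reverse_items=(1, 3)))
```

The resulting scale scores can then be joined to each participant's multimodal features for the correlational analyses the protocol describes.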
Beyond Diagnosis: Towards Personalized Intervention
Recent investigations demonstrate a significant advancement in social anxiety assessment through the convergence of multimodal data and robotic platforms. Researchers are now capable of capturing a more complete picture of an individual’s anxious responses by simultaneously analyzing physiological signals – such as heart rate and skin conductance – alongside behavioral cues observed during interactions with robotic agents. This approach moves beyond the limitations of self-reported experiences, which can be subject to bias or inaccuracies, and offers a uniquely ecologically valid environment for gauging anxiety levels in simulated social scenarios. The integration of these diverse data streams not only improves the accuracy of assessments, but also provides a richer, more nuanced understanding of how social anxiety manifests in real-time, opening doors to more targeted and effective interventions.
Traditional assessments of social anxiety often rely heavily on self-report questionnaires, which are susceptible to biases like social desirability and subjective interpretation of internal states. This emerging methodology, however, offers a pathway toward more objective evaluation by directly observing and quantifying behavioral and physiological responses in realistic social interactions. Robotic platforms facilitate the creation of standardized, controlled social scenarios, allowing researchers to capture subtle cues – such as gaze aversion, speech rate, and physiological indicators like heart rate variability and skin conductance – that might go unnoticed in conventional settings. This shift from subjective accounts to quantifiable data enables a more nuanced understanding of how anxiety manifests in specific social contexts, revealing individual patterns and triggers that can inform more targeted and effective interventions. Consequently, clinicians gain access to a richer, more ecologically valid profile of each individual’s anxiety, moving beyond broad diagnoses to pinpoint specific challenges and tailor treatment accordingly.
This research establishes a foundation for interventions designed to address the unique characteristics of each individual’s social anxiety. Rather than relying on generalized treatment approaches, future therapies can be precisely tailored by considering a person’s physiological responses, behavioral patterns, and underlying personality traits as revealed through multimodal assessment. This personalized approach promises to move beyond symptom management, targeting the core mechanisms driving social anxiety for each person and, consequently, maximizing the potential for lasting positive change. By understanding the nuanced interplay of factors contributing to an individual’s experience, interventions can be developed to address specific social challenges and promote more effective coping strategies.
A comprehensive understanding of social anxiety necessitates moving beyond self-reported feelings and examining the interplay of internal states and outward behaviors. Research indicates that combining physiological signals – such as heart rate variability and skin conductance – with observed behavioral patterns and established personality traits offers a significantly richer profile of an individual’s anxiety experience. This integrated approach allows for the identification of subtle indicators of distress that might otherwise go unnoticed, enabling the development of targeted interventions. For instance, individuals exhibiting high physiological arousal coupled with avoidant behaviors and a tendency towards negative self-evaluation may benefit from interventions focusing on emotional regulation and cognitive restructuring, while those with different profiles could require alternative strategies. Ultimately, this data-driven methodology promises more effective and personalized treatment plans, maximizing the potential for mitigating the debilitating effects of social anxiety.
The pursuit of an affect-adaptive human-robot interaction, as detailed in the protocol, necessitates a deep understanding of the nuances of human emotional response. It’s a process akin to dissecting a complex system to reveal its inner workings. Ada Lovelace observed, “The Analytical Engine has no pretensions whatever to originate anything. It can do whatever we know how to order it to perform.” This sentiment resonates strongly with the current research; the robot, like the Analytical Engine, requires precise instruction, in this case an understanding of social anxiety cues, to respond appropriately. The collected multimodal dataset serves as the ‘program’ to teach the robot, allowing it to eventually recognize and adapt to subtle shifts in human affect during interaction, effectively reverse-engineering the complexities of social anxiety itself.
What’s Next?
The construction of this multimodal dataset isn’t an endpoint, but a controlled demolition of assumptions. The field of affective computing often operates on the premise of readily-decipherable emotional signals. This work implicitly acknowledges the messiness – the possibility that anxiety, as expressed in human-robot interaction, isn’t a neat set of biomarkers, but a complex, context-dependent system. The true value lies not in finding the code for social anxiety, but in refining the questions asked of it.
Future iterations should deliberately introduce noise – variations in robot behavior that aren’t strictly ‘adaptive’, but exploratory. The goal shouldn’t be to create robots that perfectly mimic empathetic responses, but to map the boundaries of human tolerance – to discover where the illusion breaks down, and what that reveals about the underlying mechanisms of social interaction. Consider, for example, systematically introducing robotic ‘errors’ in emotional recognition – a deliberately misread cue, a delayed response – and observing the human subject’s reaction.
Ultimately, this isn’t about building better robots. It’s about reverse-engineering the human operating system. Reality is open source – the code is there, distributed across billions of interactions. This dataset is simply a more rigorous attempt to read it, acknowledging that every line of code discovered will inevitably reveal more questions than answers.
Original article: https://arxiv.org/pdf/2511.13530.pdf
Contact the author: https://www.linkedin.com/in/avetisyan/
2025-11-18 15:42