Beyond Correlation: Modeling Robot Behavior to Shape Human Perception

Author: Denis Avetisyan


New research demonstrates a causal model that predicts how people perceive robots during navigation, enabling the design of more effective and trustworthy interactions.

A causal model predicts a robot’s perceived competence by analyzing environmental cues along its trajectory and, when low competence is anticipated, identifies the minimal behavioral adjustments expected to yield improved performance, acknowledging that systems evolve through iterative adaptation rather than direct construction.

This work introduces a causal Bayesian network approach to predict and improve human perceptions of robot competence in social navigation scenarios.

Predicting how humans perceive robots remains a challenge as deployments in shared spaces increase, largely due to the need for both accurate predictions from limited data and interpretable models for safe interaction. This paper, ‘A Causal Approach to Predicting and Improving Human Perceptions of Social Navigation Robots’, addresses these limitations by introducing a Causal Bayesian Network to model human perception of robot competence and intent during navigation. The proposed model not only predicts perceptions with comparable or superior performance, reaching an F1-score of 0.78 for competence, but also generates improved robot behaviors, demonstrably increasing perceived competence by 83% in user evaluations. Could this causal approach unlock more intuitive and trustworthy human-robot collaborations in complex, real-world environments?


The Illusion of Understanding: Perceiving Robotic Intent

Effective collaboration between humans and robots during navigation isn’t solely about reaching a destination; it fundamentally relies on how a robot’s movements are perceived. Research demonstrates that humans rapidly assess a robot’s competence – its ability to navigate efficiently and avoid obstacles – alongside its perceived intentionality, or the believability of its chosen path. A robot that appears both capable and purposeful fosters trust and encourages seamless teamwork. Conversely, even technically flawless navigation can be misinterpreted if the robot’s actions seem erratic or lack a clear goal, leading to decreased acceptance and potentially hindering the collaborative process. This highlights that successful human-robot interaction necessitates designing robotic behaviors that are not just efficient, but also intuitively understandable and demonstrably goal-oriented.

Initial observations of a robot’s path significantly shape how humans assess its competence and anticipate its future actions. Research indicates that individuals rapidly construct internal models of a robot’s navigational intent based solely on the observed trajectory – even subtle deviations from expected routes can trigger immediate judgments about its capabilities. This swift evaluation directly influences the level of trust a human extends to the robot, and consequently, their willingness to collaborate effectively. A perceived lack of efficiency, even if functionally correct, can erode confidence, while smooth, purposeful movement fosters a sense of reliability and encourages seamless human-robot teamwork. Consequently, designing robot navigation that prioritizes not just optimal pathfinding, but also perceived intentionality, is crucial for fostering positive and productive interactions.

Even flawlessly executed navigation can elicit negative responses if a robot’s actions are misconstrued by human observers. Research indicates that people don’t simply assess what a robot does, but also why it appears to be doing it; a direct, efficient path, for example, might be perceived as aggressive or uncaring if it lacks any visible attempt to acknowledge pedestrian traffic or environmental context. This misinterpretation stems from the human tendency to attribute intentions and mental states to other agents, and when a robot’s behavior doesn’t align with expected social cues, trust erodes. Consequently, even technically correct navigation – one that avoids obstacles and reaches the destination – can hinder effective human-robot interaction if it fails to convey appropriate intentionality, leading to discomfort, reduced collaboration, and ultimately, rejection of the robotic system.

Our user study (Sec. VI) leveraged navigation videos, showing a robot (blue arrow) and a follower (red arrow) moving toward a green goal within a 7.2 m space, to compare original, low-competence trajectories with counterfactual behaviors designed to improve navigation.

Constructing Alternatives: The Mirage of Choice

Counterfactual Behavior Generation involves computationally constructing alternative robot trajectories representing actions the robot could have taken, given the same initial conditions and environmental context. This is achieved by simulating deviations from the executed trajectory, effectively creating a set of “what if” scenarios. These generated trajectories are not intended for real-time execution; rather, they serve as a means to provide contextual information and address potential ambiguities in interpreting the robot’s actual behavior. The system does not modify the primary trajectory but instead produces these alternatives for analysis and potential presentation to observers, enabling a broader understanding of the robot’s behavioral options and underlying intent.

Counterfactual trajectory generation utilizes a Breadth-First Search (BFS) algorithm to systematically explore behavioral alternatives. Starting from the executed robot trajectory as the root node, BFS expands the search space by iteratively generating neighboring trajectories representing slight deviations in action. Each level of the search represents trajectories achievable with a fixed number of action differences from the original. This process creates a tree-like structure of possible trajectories, allowing for the identification of alternative behaviors that, while not executed, represent plausible deviations and contribute to a broader understanding of the robot’s behavioral space. The algorithm prioritizes exploring trajectories with minimal deviation before expanding to more significant alterations, ensuring efficient coverage of the behavioral landscape.
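
The search described above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the action alphabet (`"F"`, `"L"`, `"R"`) and the edit-based neighborhood are assumptions for demonstration purposes.

```python
from collections import deque

def counterfactual_bfs(trajectory, actions, max_edits=2):
    """Enumerate alternative trajectories breadth-first, expanding outward
    from the executed trajectory one action edit at a time. Level k of the
    search holds trajectories differing from the original in exactly k actions,
    so minimal deviations are always discovered first."""
    root = tuple(trajectory)
    seen = {root}
    frontier = deque([(root, 0)])
    alternatives = []
    while frontier:
        traj, edits = frontier.popleft()
        if edits > 0:
            alternatives.append((traj, edits))
        if edits == max_edits:
            continue  # do not expand beyond the allowed deviation budget
        for i in range(len(traj)):
            for action in actions:
                if action == traj[i]:
                    continue
                cand = traj[:i] + (action,) + traj[i + 1:]
                if cand not in seen:
                    seen.add(cand)
                    frontier.append((cand, edits + 1))
    return alternatives

# With a 3-step trajectory and 3 actions, a budget of one edit yields
# 2 alternatives per position, i.e. 6 candidate trajectories.
alts = counterfactual_bfs(["F", "F", "L"], actions=["F", "L", "R"], max_edits=1)
```

Because the `seen` set records each trajectory at its first discovery, every alternative is labeled with its minimal edit distance from the executed path.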

The generation of counterfactual trajectories is not intended to modify the robot’s executed path. Instead, this technique focuses on providing observable alternatives to the chosen behavior. These demonstrated options serve to communicate the robot’s understanding of the task and its capacity to perform it in multiple ways, thereby influencing human perception of its competence and intentionality. By presenting viable alternatives, the system aims to preemptively address potential misinterpretations regarding the robot’s planning or execution, reinforcing a positive assessment of its capabilities without altering the actual performed action.

Proactive behavior shaping addresses potential negative perceptions of a robot’s actions by generating demonstrative trajectories before failures occur. This approach doesn’t alter the robot’s primary course of action; instead, it showcases alternative, viable movements that could have been executed. By visibly presenting these options – such as a smoother path around an obstacle or a more deliberate grasp – the system preemptively communicates competence and intentionality. This anticipatory demonstration mitigates potential misinterpretations of clumsy or inefficient movements, effectively functioning as a failure prevention mechanism by shaping observer expectations and reducing the likelihood of negative assessment.

This Causal Bayesian Network (CBN) graph facilitates robot following by selectively propagating relevant information among its variables.

Mapping Perception: A Network of Assumptions

A Causal Bayesian Network (CBN) was implemented to explicitly model the probabilistic dependencies between robot actions and resulting human perceptions. Specifically, the network defines relationships between quantifiable robot behaviors – including Robot Position Change, Initial Robot Rotation, and Total Robot Rotation – and the subjective human assessments of Perceived Competence and Perceived Intention. This approach allows for the representation of causal influences; for example, a particular change in robot position may increase the probability of a higher competence score. The CBN framework facilitates the prediction of human perceptions given specific robot trajectories, and conversely, enables the identification of robot behaviors likely to elicit desired perceptual responses.

The Causal Bayesian Network utilized in this work is trained on the SEAN Together Dataset, a collection of human-robot interaction data comprising over 1,500 unique trajectories and over 8,000 individual human ratings. This dataset captures interactions in a collaborative task environment, featuring human participants instructing a robot to move objects to specified target locations. The SEAN Together Dataset’s scale and focus on realistic human-robot collaboration ensure that the trained network accurately reflects human perceptions in comparable interactive scenarios, providing a robust foundation for predicting perceived competence and intention based on robot behavior.

The Causal Bayesian Network utilized in this work requires discrete inputs; however, robot state variables such as position and rotation are inherently continuous. To address this, we employ a discretization process facilitated by Time-Series Clustering. This technique groups similar time-series data points, effectively binning the continuous variables into a finite number of discrete states. Specifically, Time-Series Clustering identifies representative states based on the robot’s movement patterns, allowing for the transformation of continuous values into categorical inputs suitable for the Bayesian Network’s probabilistic reasoning. This discretization is a necessary preprocessing step, enabling the model to learn relationships between robot actions and human perceptions despite the continuous nature of the robot’s state space.

Evaluation of the Causal Bayesian Network demonstrates performance gains compared to Random Forest baselines on the SEAN Together Dataset. Specifically, the network achieved an improvement of 0.047 in F1-Score for predicting Perceived Competence and 0.044 for Perceived Intention. Furthermore, the model exhibited increases in Accuracy of 0.021 for Competence and 0.029 for Intention, indicating a consistent enhancement in predictive capability across both measured perception dimensions.
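
For reference, the F1-score behind these comparisons is the harmonic mean of precision and recall over a binary label (e.g. "high" vs. "low" competence). A minimal sketch of the computation, with invented labels:

```python
def f1_score(y_true, y_pred, positive=1):
    """F1 = harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if p == positive and t != positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

# 2 true positives, 1 false positive, 1 false negative
# -> precision = recall = 2/3, so F1 = 2/3
score = f1_score([1, 1, 1, 0, 0], [1, 1, 0, 1, 0])
```

Unlike raw accuracy, F1 is insensitive to how many easy negatives are in the evaluation set, which is why the paper reports both metrics.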

The trained Causal Bayesian Network enables the prediction of human perceptions – specifically Perceived Competence and Perceived Intention – given a robot’s trajectory. This predictive capability is leveraged to generate effective counterfactuals by identifying minimal changes to a robot’s actions that would result in a desired perceptual outcome. By evaluating multiple possible trajectories through the network, we can determine which alterations to Robot Position Change, Initial Robot Rotation, and Total Robot Rotation are most likely to shift human assessment of the robot’s competence or intentionality. This contrasts with methods that rely on observing realized trajectories, allowing for proactive planning of robot behavior to optimize perceived qualities.
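
Putting the pieces together, counterfactual selection amounts to scanning candidate trajectories in order of increasing edit distance and keeping the first one the model expects to be perceived well. The predictor below is a toy stand-in for the trained network (its rule, that trajectories with too many in-place rotations read as low-competence, is an assumption for illustration).

```python
def minimal_counterfactual(candidates, predict, target="high"):
    """candidates: (trajectory, edit_distance) pairs sorted by edit distance,
    e.g. as produced by a breadth-first search. Returns the smallest change
    the model expects to shift perception to `target`, or None."""
    for trajectory, edits in candidates:
        if predict(trajectory) == target:
            return trajectory, edits
    return None

def toy_predict(trajectory):
    """Stand-in for the trained network: more than one rotate-in-place
    action ("R") is judged low-competence (invented rule)."""
    return "low" if trajectory.count("R") > 1 else "high"

candidates = [(("R", "R", "F"), 0),   # the executed, low-competence trajectory
              (("R", "F", "F"), 1),
              (("F", "F", "F"), 2)]
best = minimal_counterfactual(candidates, toy_predict)
```

Because candidates arrive sorted by deviation, the first acceptable trajectory is automatically the minimal behavioral adjustment, matching the planning use described above.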

Cluster analysis reveals distinct groupings based on changes in robot position and rotation.

The Illusion of Control: Validating Perceived Agency

To rigorously assess the impact of artificially generated behavioral responses, an online user study was undertaken with a diverse participant pool. This study presented subjects with scenarios involving a virtual agent and variations in its demonstrated actions: specifically, counterfactual behaviors showcasing what could have been done differently. Participants then evaluated the agent’s perceived competence and intentionality across these varying conditions. The design enabled researchers to move beyond subjective impressions and establish quantifiable metrics for the effectiveness of different behavioral strategies, providing empirical evidence to support the potential for proactive demonstrations in enhancing human-agent interactions. Data collected from these evaluations formed the basis for statistical analysis, revealing significant correlations between specific counterfactuals and positive shifts in user perception.

User responses to the counterfactual trajectories were rigorously analyzed through a Linear Mixed-Effect Model, a statistical technique capable of discerning subtle effects while accounting for individual differences. This approach allowed researchers to isolate the impact of specific trajectory variations – how the robot could have moved – on two key perceptual measures: perceived competence and inferred intention. By treating individual participants as random effects, the model minimized bias and enhanced the generalizability of the findings. The resulting data revealed not only that counterfactuals influenced perception, but also the magnitude of that influence, providing a quantifiable metric for evaluating the effectiveness of different behavioral demonstrations in human-robot interaction.
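 
The intuition behind the random-effects treatment can be shown with a simplified, pure-Python sketch (the ratings are invented). With one observation per participant per condition, a random-intercept model's condition effect reduces to the mean within-participant difference, which is exactly what removes each individual's rating bias; a full analysis would fit the model with a library such as statsmodels.

```python
# Per-participant competence ratings (invented) for the original trajectory
# and its counterfactual replacement.
ratings = {
    "p1": {"original": 2, "counterfactual": 4},
    "p2": {"original": 3, "counterfactual": 5},
    "p3": {"original": 2, "counterfactual": 3},
}

# Differencing within each participant cancels that participant's
# idiosyncratic baseline (the "random intercept"), isolating the
# effect of the counterfactual condition itself.
diffs = [r["counterfactual"] - r["original"] for r in ratings.values()]
effect = sum(diffs) / len(diffs)
```

A lenient rater and a harsh rater contribute equally here, because only their within-person change matters, which is the bias reduction the mixed-effects model provides in the general, unbalanced case.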

User studies reveal that strategically generated counterfactual behaviors markedly enhance human perceptions of robotic competence. Analyses employing a Linear Mixed-Effect Model demonstrate a substantial 83% increase in perceived competence when a robot corrects a human’s misinterpretation through the demonstration of an alternative action. Interestingly, even when a human already understands the correct course of action, proactively showcasing it via counterfactual behavior still yields a significant 27% improvement in perceived competence. These findings suggest that simply demonstrating possible actions, regardless of immediate necessity, powerfully shapes human judgment and fosters a greater sense of trust in robotic collaborators.

The research indicates that demonstrating potential actions, rather than simply reacting to situations, significantly enhances human trust in robotic systems and improves collaborative efforts. By proactively showcasing possible behaviors, robots can preemptively address potential misunderstandings or reinforce correct interpretations, fostering a sense of predictability and competence. This approach moves beyond reactive responses, establishing a foundation of shared understanding and allowing humans to anticipate the robot’s actions. Consequently, the study suggests that this proactive demonstration isn’t merely about correcting errors, but about building confidence through transparent intention and capability, ultimately leading to more effective and harmonious human-robot teamwork.

The pursuit of predictable systems, as demonstrated by this exploration of causal Bayesian networks in human-robot interaction, feels perpetually shadowed by an inherent irony. Researchers attempt to map perceived competence, to engineer trust through predictable navigation, yet the very act of modeling introduces a fragility. As Edsger W. Dijkstra observed, “It’s not that we need more information, but that we need less.” The drive to quantify and control, to predict human response, often obscures the simple truth: systems evolve beyond their initial design. This work, while offering a more nuanced understanding of causal relationships in robot navigation, merely refines the compromise, freezing another moment in time before the inevitable entropy sets in. The focus on counterfactual reasoning, though insightful, is but a temporary bulwark against the unpredictable currents of real-world interaction.

What Lies Ahead?

This work, in its attempt to map perception onto action, reveals a fundamental truth: prediction is not control. The causal Bayesian network successfully anticipates human judgment, yet the very act of optimizing for that judgment introduces a brittleness. Scalability is simply the word used to justify complexity, and a model perfectly attuned to present expectations will inevitably falter when faced with novel situations. The field chases competence, but rarely considers the cost of that competence in terms of adaptability.

Future efforts will likely focus on dynamic causal models – systems that not only predict but learn the structure of human expectation itself. However, a more profound challenge lies in acknowledging that the ‘ideal’ robot – one that seamlessly integrates into human spaces – is a myth, a comforting fiction. Every architectural choice is a prophecy of future failure. The pursuit of human-like navigation may be less about achieving perfect prediction and more about designing for graceful degradation – systems that are understandably imperfect.

Ultimately, the question isn’t whether a robot can appear competent, but whether it can inspire trust even when it isn’t. Everything optimized will someday lose flexibility. The true metric of success may not be measured in navigational efficiency, but in the richness of the interaction itself – even, and perhaps especially, when things go awry.


Original article: https://arxiv.org/pdf/2603.11290.pdf

Contact the author: https://www.linkedin.com/in/avetisyan/

2026-03-13 18:26