Building Bots with Stories: Democratizing Social Robot Design

Author: Denis Avetisyan

New research demonstrates how generative AI, guided by narrative-based programming, is empowering novice users to create complex behaviors for social robots.

Robo-Blocks takes a generative scaffolding approach to robot programming. It begins with narrative creation, in which large language models help define the robot's task, then progresses through goal generation and program construction with simulation, and culminates in physical deployment, translating natural language descriptions into executable robot behaviors.

This review examines the potential of generative scaffolding to simplify end-user programming of social robots, focusing on narrative-based design and the challenges of human-robot interaction.

Despite the promise of large language models to democratize robotics, simply generating code risks obscuring fundamental programming concepts and hindering skill development. This research, detailed in ‘Robo-Blocks: Generative Scaffolding in End-User Design and Programming of Social Robots’, explores how narrative-based generative scaffolding can support novice programmers in designing behaviors for social robots. Through the design and evaluation of Robo-Blocks, a block-based programming environment, the authors found that structured narratives connecting high-level ideas to executable actions shape end-user strategies and reveal emerging user personas. How can such scaffolding best be integrated to foster both creative expression and robust programming skills in the rapidly evolving landscape of social robotics?

The Challenge of Intuitive Robot Programming

The creation of truly interactive robots is often hampered by a fundamental challenge: the complexity of their programming. Traditionally, controlling a robot’s actions requires deep knowledge of robotics, software engineering, and often, specialized coding languages. This reliance on highly technical expertise effectively creates a significant barrier to entry, limiting innovation to a relatively small group of specialists. Consequently, designers, artists, or even individuals with valuable insights into human-robot interaction are often unable to directly translate their ideas into robotic behaviors. This restricts the potential for widespread creativity and hinders the development of robots that can seamlessly integrate into everyday social environments, demanding a shift towards more intuitive and accessible programming paradigms.

Crafting genuinely engaging social interactions for robots presents a considerable engineering challenge, demanding a cyclical process of iterative prototyping and remarkably precise behavioral control. Unlike programming for purely functional tasks, social robotics requires developers to anticipate and respond to a spectrum of human behaviors, necessitating constant refinement of robotic responses. Subtle cues – facial expressions, vocal intonation, body language – all contribute to successful communication, and replicating these nuances in a robotic system demands algorithms capable of dynamic adjustment. This isn’t simply about what a robot does, but how it does it; a slight miscalibration in timing or expressiveness can quickly disrupt the sense of social presence, highlighting the delicate balance required to create a truly believable and engaging interaction. The complexity lies not only in the technical implementation, but also in the subjective nature of social perception itself.

Roboticists consistently encounter the challenge of imbuing artificial agents with convincingly natural behaviors; current approaches frequently yield movements and interactions that, while technically functional, fall short of genuine expressiveness. This often stems from a reliance on pre-programmed sequences or rule-based systems that struggle to replicate the subtle nuances of human communication – the micro-expressions, variations in timing, and adaptive responses that characterize authentic social exchange. Consequently, even sophisticated robots can exhibit behaviors perceived as stilted, uncanny, or lacking in emotional intelligence, hindering effective human-robot collaboration and diminishing the potential for truly engaging social interaction. The difficulty lies not merely in replicating what humans do, but in mirroring how they do it – a level of behavioral fidelity that demands innovative approaches to motion planning, perception, and artificial intelligence.

Robo-Blocks enables users to collaboratively create narratives with an LLM agent by providing task descriptions, a chat interface for iterative refinement, milestone tracking, and robot capability details.

Robo-Blocks: A Narrative-First Design for Social Robotics

Robo-Blocks uses a four-phase workflow to help users without programming expertise create social robot behaviors. Users first describe the desired interaction as a narrative, or storyboard, which they refine in conversation with an LLM agent. The system then derives programmable goals from that narrative. Next, users build the program in a block-based interface, which abstracts away textual syntax, and test it in simulation. Finally, they deploy the program to a physical robot. This structured approach aims to lower the barrier to entry for creating sophisticated robot interactions without requiring traditional coding skills.

Narrative-Based Programming, central to the Robo-Blocks workflow, prioritizes the definition of robot behavior through storyboarding before any coding takes place. This visual scripting method allows users to map out desired interactions as a sequence of events, effectively creating a behavioral script using visual elements rather than text-based code. The storyboard serves as a blueprint, detailing the robot’s actions and responses in specific situations, and establishes the logical flow of the interaction before translation into executable commands. This approach aims to reduce the cognitive load associated with traditional programming by focusing on what the robot should do, rather than how it should do it.

A block-based programming interface handles the translation of storyboards into executable robot behaviors within Robo-Blocks. This visual approach abstracts away traditional text-based coding, allowing users to assemble functional code by connecting graphical blocks that represent actions, conditions, and data. A user study with 14 participants assessed the usability and effectiveness of this interface. It gathered quantitative data on task completion times and error rates, and qualitative data from interviews about participants' perceptions of the system and its effect on their ability to create robot behaviors without prior programming expertise.
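To make the idea of blocks for actions, conditions, and data concrete, here is a minimal sketch of how such a program could be represented and simulated. It is not the Robo-Blocks implementation: the `Action` and `IfBlock` classes, the `run` function, and the action names `say` and `move_head` are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Union


@dataclass
class Action:
    """A leaf block: one robot action plus its parameters (hypothetical)."""
    name: str
    params: Dict[str, object] = field(default_factory=dict)


@dataclass
class IfBlock:
    """A condition block: runs `then` if the condition holds, else `otherwise`."""
    condition: str
    then: List["Block"] = field(default_factory=list)
    otherwise: List["Block"] = field(default_factory=list)


Block = Union[Action, IfBlock]


def run(blocks: List[Block], conditions: Dict[str, bool]) -> List[str]:
    """Simulate a block program and return a trace of the actions it would run."""
    trace: List[str] = []
    for block in blocks:
        if isinstance(block, Action):
            args = ", ".join(f"{key}={value!r}" for key, value in block.params.items())
            trace.append(f"{block.name}({args})")
        else:
            # Unknown conditions are treated as false in this sketch.
            holds = conditions.get(block.condition, False)
            trace.extend(run(block.then if holds else block.otherwise, conditions))
    return trace


# A two-block program: greet, then react to whether a person is present.
program: List[Block] = [
    Action("say", {"text": "Hello!"}),
    IfBlock(
        condition="person_detected",
        then=[Action("move_head", {"pitch": -10})],
        otherwise=[Action("say", {"text": "Is anyone there?"})],
    ),
]

if __name__ == "__main__":
    for line in run(program, {"person_detected": True}):
        print(line)
```

Returning a trace rather than driving hardware mirrors the simulate-before-deploy step described in the article: the same block tree can be inspected in simulation first and only later mapped onto real robot commands.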

Participants in the Robo-Blocks study completed a sequence of narrative creation, goal generation, programming, and deployment phases, each with an introductory tutorial, and provided feedback through post-task and post-study surveys, along with semi-structured interviews.

Generative Scaffolding: Augmenting Design with AI Assistance

Robo-Blocks utilizes Generative Scaffolding, a technique wherein the system proactively offers programming suggestions and guidance directly within the user’s workflow. This real-time assistance is designed to reduce development friction by anticipating potential needs and providing code snippets, function recommendations, or parameter suggestions as the user works. The system doesn’t simply offer pre-defined templates; instead, it generates suggestions dynamically based on the current program context and the user’s ongoing actions, effectively acting as a continuously available, context-aware assistant during the programming process.

The Generative Scaffolding system in Robo-Blocks relies on LLM Prompt Engineering to facilitate interaction with the large language models (LLMs) that drive its functionality. This involves the careful design and refinement of text-based prompts sent to the LLM, specifying the desired task and providing necessary context. Effective prompt engineering is critical for eliciting accurate, relevant, and useful responses from the LLM, as the quality of the prompt directly impacts the quality of the generated suggestions and guidance. The process includes iterative testing and optimization of prompt structure, keywords, and parameters to maximize performance and minimize ambiguity in communication with the AI model.
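As a rough illustration of what assembling such a prompt might look like, the sketch below fills a fixed template with a task description, a list of robot capabilities, and the narrative so far. The template wording, the `build_prompt` function, and the capability names are assumptions made for this example; the actual prompts used by Robo-Blocks are not given in this article, and no LLM call is made here.

```python
from string import Template
from typing import List

# Hypothetical prompt template; the wording is illustrative only.
PROMPT_TEMPLATE = Template(
    "You are helping a novice design a social robot behavior.\n"
    "Task description: $task\n"
    "Robot capabilities: $capabilities\n"
    "Narrative so far: $narrative\n"
    "Suggest the next step using only the listed capabilities."
)


def build_prompt(task: str, capabilities: List[str], narrative: str) -> str:
    """Fill the template with the context the LLM needs for one suggestion."""
    return PROMPT_TEMPLATE.substitute(
        task=task.strip(),
        capabilities=", ".join(capabilities),
        # An explicit marker avoids sending a blank field when nothing is written yet.
        narrative=narrative.strip() or "(empty)",
    )


if __name__ == "__main__":
    print(build_prompt(
        task="Greet a visitor at the front desk",
        capabilities=["speak", "move_head", "change_expression"],
        narrative="",
    ))
```

Keeping the capability list inside the prompt is one simple way to steer suggestions toward actions the robot can actually perform, which is the kind of context the paragraph above describes.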

Robo-Blocks’ adaptive scaffolding system analyzes user actions in real time to provide relevant suggestions and assistance, minimizing the amount of information a designer must actively manage during programming. This context-aware approach reduces cognitive load by proactively offering code completions, identifying potential errors, and suggesting optimized solutions based on the current design state. Evaluation using the System Usability Scale (SUS) yielded a score of 73.85, which is above the commonly cited average of 68 and falls within the range generally considered acceptable.
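For readers unfamiliar with how a SUS figure is computed, the sketch below applies the standard scoring rule to one respondent's ten answers on a 1 to 5 scale: odd-numbered items contribute the response minus 1, even-numbered items contribute 5 minus the response, and the sum is multiplied by 2.5 to give a score from 0 to 100. A study-level figure such as the one reported above is the mean of these per-participant scores. The function name and the sample responses are made up for illustration and are not data from the study.

```python
from statistics import mean
from typing import Sequence


def sus_score(responses: Sequence[int]) -> float:
    """Score one participant's ten SUS answers (each 1..5) on a 0..100 scale."""
    if len(responses) != 10:
        raise ValueError("SUS requires exactly 10 responses")
    if any(r < 1 or r > 5 for r in responses):
        raise ValueError("each response must be between 1 and 5")
    total = 0
    for index, response in enumerate(responses):
        if index % 2 == 0:
            # Items 1, 3, 5, 7, 9 are positively worded.
            total += response - 1
        else:
            # Items 2, 4, 6, 8, 10 are negatively worded.
            total += 5 - response
    return total * 2.5


if __name__ == "__main__":
    # Illustrative responses only; not taken from the Robo-Blocks study.
    participants = [
        [4, 2, 4, 2, 4, 2, 4, 2, 4, 2],
        [5, 1, 4, 2, 5, 2, 4, 1, 5, 2],
    ]
    scores = [sus_score(p) for p in participants]
    print(scores, mean(scores))
```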

Robo-Blocks allows users to define tasks narratively, which are then automatically translated into programmable goals by an LLM agent.

From Design to Reality: Deploying Interactions on a Physical Platform

Robo-Blocks functions as a streamlined interface for translating programmed behaviors into actions on the Misty Robot, a widely adopted platform in social robotics. This robot’s inherent versatility – encompassing expressive movement, speech capabilities, and an array of integrated sensors – makes it an ideal testing ground for complex interactions. Robo-Blocks leverages this potential by providing a system through which developers and researchers can directly implement and evaluate their designed narratives, facilitating a rapid cycle of prototyping and refinement. The platform’s compatibility with the Misty Robot aims to bridge the gap between theoretical designs and tangible robotic behaviors, ultimately accelerating innovation in the field of human-robot interaction.

Effective deployment of robotic behaviors hinges on a critical process known as Capability Alignment. This ensures that the programmed actions don’t exceed the physical limitations of the robotic platform, preventing failed movements or unrealistic expectations during interaction. Researchers found that simply designing a compelling narrative isn’t enough; the intended behaviors must be carefully mapped onto the robot’s actual degrees of freedom, motor capabilities, and sensor ranges. Without this alignment, even a well-conceived interaction can appear clumsy or unresponsive, diminishing the user experience. Therefore, successful implementation requires a feedback loop where desired actions are constantly evaluated and adjusted to remain within the bounds of what the robot can reliably achieve, fostering a seamless and believable performance.
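One way to picture the alignment check described above is as a validation pass over the planned steps before anything is sent to the robot. The sketch below is an assumption-laden illustration: the `Step` class, the `check_alignment` function, and the numeric limits in `CAPABILITIES` are hypothetical and are not Misty's actual specifications or API.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Step:
    """One planned robot action with its parameters (hypothetical structure)."""
    action: str
    params: Dict[str, object] = field(default_factory=dict)


# Hypothetical capability table: action -> parameter -> (min, max).
# These numbers are placeholders, not real hardware limits.
CAPABILITIES: Dict[str, Dict[str, Tuple[float, float]]] = {
    "move_head": {"pitch": (-40.0, 25.0), "yaw": (-80.0, 80.0)},
    "say": {},
}


def check_alignment(
    steps: List[Step],
    capabilities: Dict[str, Dict[str, Tuple[float, float]]],
) -> List[str]:
    """Return a list of human-readable problems; an empty list means the plan fits."""
    problems: List[str] = []
    for index, step in enumerate(steps):
        limits = capabilities.get(step.action)
        if limits is None:
            problems.append(f"step {index}: unsupported action '{step.action}'")
            continue
        for name, value in step.params.items():
            if name not in limits:
                continue  # Parameter has no numeric range in this sketch.
            low, high = limits[name]
            if not low <= value <= high:
                problems.append(
                    f"step {index}: {step.action}.{name}={value} outside [{low}, {high}]"
                )
    return problems


if __name__ == "__main__":
    plan = [
        Step("say", {"text": "Welcome!"}),
        Step("move_head", {"pitch": 60.0}),
        Step("fly", {}),
    ]
    for problem in check_alignment(plan, CAPABILITIES):
        print(problem)
```

Reporting every problem at once, rather than stopping at the first, suits the feedback loop the paragraph describes: the designer can revise the narrative or the program with the full list of mismatches in view.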

The system supports the creation of compelling robotic interactions through continuous refinement: both the narrative storyline and the underlying program are iteratively adjusted to maximize engagement while remaining within the physical limitations of the robot platform. This approach keeps interactions imaginative yet reliably executable in the real world. User evaluations point to a positive reception of this iterative design process. On the USE questionnaire, scores averaged 4.62 for usefulness, 5.14 for ease of use, 5.85 for ease of learning, and 5.03 for overall satisfaction, suggesting that the system is both effective and accessible for users seeking to develop nuanced and dependable robotic behaviors.

Robo-Blocks provides a unified interface enabling users to program robots visually using drag-and-drop blocks (a, b), simulate and test their programs with integrated speech output (c, d), and deploy them to a physical robot (f) while referencing previously defined goals (e).

The research illuminates a fundamental principle of system design: seemingly simple interventions can yield complex outcomes. This echoes a remark attributed to John von Neumann: “There’s no describing the future.” The study’s exploration of generative scaffolding in robotic behavior demonstrates precisely this unpredictability. While narrative-based programming offers an accessible entry point for novice users, the emergent behaviors of the social robots, shaped by the generative AI, reveal a dynamic system where initial inputs do not guarantee predictable results. Understanding these systemic effects, and the boundaries within which they operate, is crucial for crafting robust and intuitive human-robot interactions, and it underscores the value of a holistic view of the system.

Future Constructs

The promise of generative scaffolding, as demonstrated by this work, rests not in automating design, but in shifting the locus of control. If the system survives on duct tape – patching prompts to elicit desired behaviors – it’s likely overengineered. The real challenge isn’t building more sophisticated large language models, but understanding how humans integrate these tools into their own creative processes. Current approaches often treat the robot’s behavior as the primary output, neglecting the user’s evolving mental model of the system itself. A truly elegant solution will prioritize transparency – allowing the user to interrogate the generative process, not merely accept its conclusions.

Modularity, frequently touted as a design principle, is an illusion of control without context. Simply providing pre-built blocks, even those generated dynamically, does not address the fundamental problem of compositional complexity. The research field must move beyond surface-level interactions, focusing instead on how narrative-based programming influences a user’s ability to reason about, predict, and debug robot behavior over extended periods.

Ultimately, the success of end-user robotics hinges on recognizing that the robot is not merely a machine to be programmed, but a medium for expression. The next iteration of this work should explore how generative scaffolding can support not just what the robot does, but how it communicates – its affect, its timing, its ability to forge a meaningful connection with a human partner. The scaffolding, if successful, will fade into the background, leaving only the emergent narrative.

Original article: https://arxiv.org/pdf/2605.28154.pdf

Contact the author: https://www.linkedin.com/in/avetisyan/

2026-05-28 15:07