Author: Denis Avetisyan
New research reveals how designers dynamically shift roles with AI during the earliest stages of creative design, moving fluidly between leading, being led, and co-creating.

This study presents a taxonomy of human-AI interaction modes during sketch-based design ideation with multimodal large language models, emphasizing the need for flexible collaboration paradigms.
While artificial intelligence promises to augment creative processes, understanding how designers actually collaborate with these tools remains a crucial open question. This research, "A Taxonomy of Human–MLLM Interaction in Early-Stage Sketch-Based Design Ideation," investigates this dynamic through a user study, revealing that designers fluidly shift between leading, following, and co-evolving with multimodal large language models during early-stage ideation. These findings demonstrate that successful human-AI collaboration isn't about fixed roles, but rather a flexible interplay of agency. How can interactive systems be designed to better support, and even anticipate, these shifting patterns of creative responsibility?
The Evolving Sketch: Bridging Intuition and Iteration
The initial phases of design ideation are frequently characterized by a reliance on sketching as a swift means of visually exploring concepts; however, this process often remains a largely individual pursuit. Designers commonly begin by rapidly generating numerous sketches to capture a wide range of potential solutions, but this can inadvertently limit the diversity of ideas considered. While sketching allows for quick visualization and iteration, it traditionally lacks the inherent collaborative elements found in later design stages. This solitary nature can lead to a narrowing of perspectives, as designers may become fixated on their own initial concepts without benefiting from the input and critique of peers, potentially hindering the development of truly innovative solutions. Consequently, bridging this gap between rapid individual sketching and broader collaborative feedback represents a key challenge in maximizing the effectiveness of early-stage design exploration.
The core of effective design ideation lies in repeated refinement, and sketching, as a primary method, necessitates strategies for navigating this iterative process. Designers frequently encounter moments where initial concepts stall, requiring techniques to break through creative blocks and generate novel alternatives. Efficient exploration isn’t simply about producing more sketches, but about intelligently varying existing ideas – tweaking parameters, combining disparate elements, and rapidly prototyping different solutions. This demands methods that go beyond pure visual thinking, potentially incorporating randomness, constraint-based generation, or computational tools to systematically explore the design space and prevent stagnation, ultimately leading to a richer and more diverse set of possibilities.
Historically, the design process has been fundamentally tactile, relying on a designer's hand to quickly visualize and iterate on concepts through sketching. However, this traditional workflow presents limitations in fully exploring the potential of those initial ideas; computational tools haven't been seamlessly integrated to augment creativity. While sketching excels at rapid ideation, refining and evolving designs often requires time-consuming manual adjustments and can hinder the exploration of complex variations. Recent advancements seek to bridge this gap by leveraging computational power – through techniques like generative algorithms and interactive digital interfaces – to not only digitize sketches but also to intelligently suggest refinements, explore design permutations, and ultimately accelerate the transition from initial concept to a fully realized design.
![Analysis of system logs reveals that participants engage in ideation through varying interaction modes (human-only, human-lead, AI-lead, and co-evolution) characterized by sequences of user input (e.g., sketch, text prompt) and AI output (e.g., text, image generation), as shown in the proportional distribution for each participant.](https://arxiv.org/html/2602.22171v1/figures/class.png)
SketchLLM: Augmenting, Not Replacing, the Creative Hand
SketchLLM is a research prototype developed to integrate human creative input with the analytical strengths of artificial intelligence during the initial stages of design. The system is not intended as a finished product, but rather as a platform for investigating how AI can augment, rather than replace, the designer's intuitive process. It aims to address the limitations of both wholly manual and fully automated design workflows by providing a conversational interface where users can sketch ideas and receive AI-driven suggestions and variations, effectively combining qualitative and quantitative approaches to ideation. This is achieved through the interplay between user-generated sketches and AI responses, allowing for iterative refinement and exploration of design concepts.
SketchLLM facilitates conversational idea development by integrating a digital sketching module with a multimodal AI chatbot based on the Gemini 2.0 Flash model. This combination allows users to input sketches which are then interpreted by the AI, enabling a back-and-forth dialogue where the chatbot can respond to visual input and generate related textual suggestions or image variations. The sketching module functions as a primary input method, while Gemini 2.0 Flash provides the natural language processing and image generation capabilities necessary to sustain a conversational design process, allowing iterative refinement of ideas through combined visual and textual interaction.
SketchLLM utilizes Multimodal Large Language Models (MLLMs) to facilitate interactive design ideation. These models process both sketched visual input and textual prompts, enabling the system to understand the designer's intent from freehand drawings. Following sketch interpretation, the MLLM generates image variations based on the initial input, allowing for rapid exploration of different design options. Concurrently, the system provides textual suggestions – such as design alternatives, feature refinements, or relevant considerations – further augmenting the ideation process and offering designers a broadened range of possibilities directly within the sketching interface.
Decoding the Dialogue: Patterns of Human-AI Collaboration
System logs from the SketchLLM platform recorded complete sequences of user interactions, detailing every input from the designer and corresponding output from the AI. These logs captured not only the content of each interaction – sketches, text prompts, and AI-generated responses – but also the timing, order, and specific parameters used in each step. The granularity of these records allowed for the reconstruction of complete design sessions, enabling analysis of how control dynamically shifted between the human designer and the AI throughout the iterative design process. Data points included the specific tools used, the level of detail in prompts, and the acceptance or rejection of AI suggestions, providing a comprehensive dataset for understanding human-AI collaboration patterns.
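A minimal representation of such a log might look like the following; the field names and action vocabulary are assumptions for illustration, since the paper's actual log schema is not reproduced here.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class LogEvent:
    timestamp: float   # seconds since session start
    actor: str         # "human" or "ai"
    action: str        # e.g. "sketch", "text_prompt", "image_gen", "accept", "reject"
    payload: str       # content reference (sketch id, prompt text, image id)

def session_events(log: list[LogEvent]) -> list[LogEvent]:
    """Replay a session in chronological order, as the analysis
    reconstructs complete interaction sequences."""
    return sorted(log, key=lambda e: e.timestamp)

# Events may arrive out of order from different subsystems.
events = [
    LogEvent(12.0, "ai", "image_gen", "img_01"),
    LogEvent(3.5, "human", "sketch", "sk_01"),
    LogEvent(8.2, "human", "text_prompt", "make it rounder"),
]
replay = session_events(events)
```

Ordering by timestamp recovers who acted when, which is the raw material for classifying each stretch of a session into an interaction mode.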
Analysis of SketchLLM system logs revealed four distinct human-AI interaction modes characterized by varying degrees of creative control. In Human-Lead interactions, the designer initiated and directed the majority of design decisions, with the AI responding to specific prompts. Conversely, AI-Lead interactions saw the AI proactively suggesting design elements and the designer primarily reacting to those proposals. Co-Evolution represented a more balanced exchange, with iterative refinement occurring through continuous contributions from both the designer and the AI. Finally, Human-Only interactions involved the designer completing the design process without utilizing AI assistance. These modes demonstrate a dynamic distribution of creative responsibility, reflecting the fluid nature of collaboration within the SketchLLM environment.
Analysis of system logs from SketchLLM indicates that the predominant interaction mode was Co-Evolution, representing 34.8% of all interactions. This was closely followed by AI-Lead interactions at 30.6% and Human-Lead interactions at 28.5%. The relatively low frequency of Human-Only interactions, comprising only 6.2% of the total, suggests that designers frequently engaged the AI as a collaborative partner throughout the ideation process. These proportions demonstrate a dynamic distribution of creative control, with responsibility shifting between the human designer and the AI model during the design workflow.
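Proportions like those reported can be computed mechanically once each interaction segment is labeled with a mode. The labels below are a toy example, not the study's data.

```python
from collections import Counter

MODES = ("human_lead", "ai_lead", "co_evolution", "human_only")

def mode_proportions(labels: list[str]) -> dict[str, float]:
    """Fraction of interaction segments falling into each mode."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {m: counts.get(m, 0) / total for m in MODES}

# Toy sequence of 20 labeled segments (illustrative only).
labels = (["co_evolution"] * 7 + ["ai_lead"] * 6
          + ["human_lead"] * 6 + ["human_only"] * 1)
props = mode_proportions(labels)
# props["co_evolution"] == 0.35, props["human_only"] == 0.05
```

Aggregated per participant, the same computation yields the proportional distributions shown in the figure above.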
The Future of Design: A Symbiotic Relationship
The evolution of design tools is increasingly defined by a shift towards flexible interaction modes, allowing practitioners to fluidly alternate between direct manual control and AI-assisted generation. This capability moves beyond simple automation; it fosters a collaborative dynamic where designers retain creative agency while leveraging the speed and exploratory potential of artificial intelligence. Such systems aren't intended to replace skilled artistry, but rather to amplify it, enabling rapid prototyping and the exploration of a wider design space than previously possible. This seamless transition, from meticulously crafting individual elements to prompting the AI for variations or entirely new concepts, represents a fundamental advancement, promising tools that adapt to the designer's intent and augment the creative workflow with unprecedented versatility.
SketchLLM's capabilities highlight a crucial shift in the role of Generative AI within the design process – it is not intended as a replacement for human creativity, but rather as a powerful extension of it. The system facilitates rapid ideation by allowing designers to quickly explore a multitude of variations on initial sketches, effectively acting as a collaborative partner that can translate ambiguous concepts into concrete visual forms. This augmentation is achieved through a nuanced interaction where designers retain full control over the overall direction, while the AI handles the computationally intensive tasks of refinement and exploration, ultimately accelerating the creation of innovative designs and freeing designers to focus on higher-level conceptual thinking and artistic expression. The tool's success lies in its ability to seamlessly blend human intention with AI-driven generation, fostering a synergistic relationship that unlocks new creative possibilities.
Detailed observation of how designers fluidly switch between directing SketchLLM and assuming full control reveals crucial insights for building future AI-assisted design tools. The study highlights a need for interfaces that not only support diverse interaction styles but also clearly communicate the level of AI involvement – ensuring designers remain aware of when the system is generating content versus simply executing commands. This understanding informs the development of tools prioritizing intuitive control, allowing designers to effortlessly blend their creative intent with AI's generative capabilities, and fostering a collaborative dynamic where the technology amplifies, rather than dictates, the design process. Ultimately, successful integration hinges on building tools that feel like an extension of the designer's skillset, providing power without sacrificing creative agency.
The study illuminates a dynamic interplay, where designers aren't simply directing tools, but rather engaging in a fluid dance with AI, oscillating between leadership and collaborative co-evolution. This echoes Alan Turing's sentiment: "It seems probable that once the machine thinks at all, it will think differently from humans." The research demonstrates this difference isn't a barrier, but a potential strength; the shifting interaction modes – Human-Lead, AI-Lead, and Co-Evolution – suggest a system adapting to the needs of the design process, not dictating it. Architecture without history, or in this case, interaction, is indeed fragile; the successful ideation relies on acknowledging and adapting to the evolving relationship between human and machine.
What Lies Ahead?
The observed fluidity in human-MLLM interaction – the shifting between lead roles and co-evolution – suggests that the pursuit of a singular "ideal" collaboration paradigm is fundamentally misguided. Systems, like organisms, learn to age gracefully; forcing a rigid structure onto this dynamic risks premature decay. The research demonstrates not how to make these systems collaborative, but how they already are when afforded the space to be. The challenge, then, isn't optimization, but understanding the contours of this natural variance.
A critical limitation remains the difficulty in isolating the impact of the MLLM from the designer's inherent creative process. Untangling causality is a perennial struggle, but focusing solely on quantifiable "outputs" misses the point. Sometimes observing the process – the subtle negotiations of influence, the moments of unexpected synergy – is better than trying to speed it up. Future work should prioritize longitudinal studies, mapping these interaction modes over extended design sessions, and investigating the long-term effects on creative responsibility.
Ultimately, the field faces a choice. It can pursue ever more sophisticated algorithms designed to "solve" design problems, or it can focus on building systems that respond to the messy, unpredictable reality of human creativity. The former promises incremental gains; the latter, a deeper understanding of the co-evolutionary dance at the heart of design itself.
Original article: https://arxiv.org/pdf/2602.22171.pdf
Contact the author: https://www.linkedin.com/in/avetisyan/
2026-02-26 20:01