Author: Denis Avetisyan
Researchers are increasingly leveraging artificial intelligence to accelerate discovery, but a systematic approach to integrating AI into the research process is often lacking.

This paper introduces the SHAPR framework: a structured methodology for human-AI collaborative research, emphasizing knowledge capture, reproducibility, and iterative development through Action Design Research.
Despite growing interest in human-AI collaboration, a lack of structured methodologies hinders rigorous and reproducible research in AI-assisted environments. This paper introduces ‘SHAPR: Operationalising Human-AI Collaborative Research Through Structured Knowledge Generation’, a framework designed to integrate human-centred decision-making with AI capabilities through iterative cycles of exploration, development, and evaluation. Central to SHAPR is the concept of Structured Knowledge Units (SKUs), modular representations of research insights that facilitate knowledge accumulation and transparency. Could this approach not pave the way for scalable, knowledge-centred research practices and a new era of systematic refinement in AI-assisted discovery?
The Fragility of Progress: Reproducibility in AI Research
A significant impediment to validating artificial intelligence research lies within the practices of software development itself. Historically, academic and research code has not prioritized the meticulous version control and detailed documentation common in professional software engineering. This often results in a fragmented history of changes, making it difficult to pinpoint the exact conditions under which specific results were achieved. Without robust traceability – the ability to follow a result back to its originating code, data, and computational environment – independent verification becomes a substantial challenge. Consequently, replicating published findings can be surprisingly difficult, even for experts in the field, contributing to concerns about the reliability and cumulative progress of AI research. The lack of standardized practices effectively creates a barrier to building upon prior work, slowing innovation and potentially leading to wasted resources as researchers inadvertently revisit already explored paths.
A growing concern within the artificial intelligence community centers on a burgeoning ‘credibility crisis’, stemming from difficulties in validating published findings and extending existing research. The complex interplay of intricate algorithms, massive datasets, and nuanced experimental setups often lacks sufficient documentation for independent verification. This isn’t simply about replicating a single result; it’s about ensuring the foundations of AI research are robust and trustworthy. Without transparent methodologies and accessible resources, researchers struggle to confidently build upon prior innovations, leading to wasted effort, duplicated studies, and a slowing of progress. The inability to reliably verify and extend previous work undermines the scientific process and raises questions about the overall reliability of AI’s rapidly evolving landscape.
The advancement of artificial intelligence is increasingly hampered not by a lack of novel algorithms, but by the difficulty in transferring the nuanced, often unwritten, expertise – known as tacit knowledge – from researchers to replicable systems. Current research practices frequently prioritize the publication of final results, overlooking the iterative experimentation, hyperparameter tuning, and data preprocessing steps that are critical for achieving those results. This creates a significant barrier to verification, as simply possessing the code and data is often insufficient; the reasoning behind specific choices, the troubleshooting performed, and the insights gained during the development process remain largely undocumented. Consequently, even with access to the stated methods, independent researchers struggle to recreate published findings, limiting the ability to build upon prior work and slowing the pace of innovation; the true value of research resides not just in what is known, but in how it is known, and that knowledge is frequently lost in translation.

SHAPR: A Framework for Knowledge-Driven AI Advancement
The SHAPR framework builds upon Action Design Research (ADR) by providing a formalized methodology for incorporating generative AI into the software development lifecycle. While ADR traditionally focuses on understanding and influencing human behavior through iterative design, SHAPR extends this approach to encompass the collaborative dynamic between researchers and AI agents. This expansion necessitates a structured process to manage the integration of AI-driven tools, ensuring research insights are systematically captured and leveraged. SHAPR formalizes this through the SHAPR Cycle and the resulting creation of SHAPR Knowledge Units (SKUs), allowing for repeatable, auditable, and knowledge-driven AI assistance in software development tasks.
Human-AI collaboration within the SHAPR framework centers on utilizing generative AI tools to extend the capabilities of researchers throughout the software development lifecycle. These tools are not intended to replace human researchers, but rather to augment their abilities in areas such as code generation, test case creation, documentation, and analysis of research data. Specifically, AI assists in automating repetitive tasks, accelerating the exploration of design spaces, and providing insights from large datasets, allowing researchers to focus on higher-level problem-solving and validation of results. This collaborative approach aims to improve both the efficiency and the quality of AI-assisted software development by combining human expertise with the computational power of generative AI.
The SHAPR Cycle is an iterative process designed to facilitate continuous improvement in AI-assisted software development. It consists of five sequential phases: Explore, where research questions are formulated and initial data is gathered; Build, focused on developing AI-supported artifacts or prototypes; Use, involving the application of these artifacts in real-world contexts; Evaluate, where the effectiveness of the artifacts is assessed through data collection and analysis; and Learn, which centers on refining the process and incorporating insights gained from evaluation into subsequent iterations. Completion of the Learn phase returns the cycle to the Explore phase, establishing a continuous loop of refinement and knowledge accumulation.
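The five-phase loop described above can be sketched as a minimal state machine. The phase names come from the text; the `next_phase` helper and ordering structure are illustrative, not part of the published framework:

```python
from enum import Enum

class Phase(Enum):
    """The five sequential phases of the SHAPR Cycle."""
    EXPLORE = "explore"
    BUILD = "build"
    USE = "use"
    EVALUATE = "evaluate"
    LEARN = "learn"

# Sequential order; completing Learn returns the cycle to Explore.
ORDER = [Phase.EXPLORE, Phase.BUILD, Phase.USE, Phase.EVALUATE, Phase.LEARN]

def next_phase(current: Phase) -> Phase:
    """Return the phase that follows `current`, wrapping Learn back to Explore."""
    i = ORDER.index(current)
    return ORDER[(i + 1) % len(ORDER)]
```

The modulo wrap-around encodes the "continuous loop" property: there is no terminal phase, only a return to Explore with accumulated insights.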
Structured Knowledge Units (SKUs) are formalized, reusable components generated throughout the SHAPR Cycle. These SKUs encapsulate research insights derived from each stage – Explore, Build, Use, and Evaluate – and are structured to include a problem statement, proposed solution, implementation details, usage context, and evaluation metrics. The consistent format allows for efficient storage, retrieval, and comparison of different approaches, facilitating knowledge sharing and preventing redundant effort. SKUs are not limited to code; they can also include qualitative data, design rationales, and identified limitations, offering a holistic representation of research findings for both human researchers and AI agents within the framework.
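Given the fields listed above, an SKU could be represented as a simple record. The field names follow the text; the dataclass representation and JSON serialisation are assumptions chosen for illustration, not the paper's specified format:

```python
from dataclasses import dataclass, field, asdict
import json

@dataclass
class KnowledgeUnit:
    """One Structured Knowledge Unit (SKU): a modular record of a research insight."""
    problem_statement: str
    proposed_solution: str
    implementation_details: str
    usage_context: str
    evaluation_metrics: dict
    limitations: list = field(default_factory=list)  # qualitative notes, design rationales

    def to_json(self) -> str:
        """Serialise to JSON so both human researchers and AI agents can consume the unit."""
        return json.dumps(asdict(self), indent=2)
```

A machine-readable serialisation like this is what makes storage, retrieval, and comparison of SKUs mechanical rather than manual.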

Building a Reproducible Ecosystem: SHAPR’s Infrastructure
The SHAPR Repository Workspace employs Version Control Systems (VCS) to maintain a complete audit trail of all research artefacts. These systems, including but not limited to Git, record modifications to files over time, allowing for the reconstruction of any previous state. Each artefact – encompassing data, code, models, and documentation – is subject to versioning, facilitating reproducibility and collaborative development. VCS functionality includes branching and merging, enabling parallel exploration of research directions without compromising the integrity of the primary codebase. Furthermore, the system supports attribution of changes to specific researchers, ensuring accountability and facilitating peer review of the research process.
The SHAPR Interaction Workspace is a purpose-built environment designed to facilitate researcher exploration and AI collaboration. This workspace provides tools for iterative experimentation, allowing researchers to test hypotheses and refine approaches with integrated AI assistance. Functionality includes direct access to LLMs for real-time feedback and analysis, as well as features supporting the generation and evaluation of research outputs. The environment is structured to enable a dynamic interplay between human insight and AI capabilities, accelerating the research process and fostering novel discoveries. Data generated within the Interaction Workspace is version controlled and can be seamlessly integrated into the broader SHAPR ecosystem.
LLM Workspaces within SHAPR provide integrated access to Large Language Models (LLMs) to facilitate multiple stages of the research process. These models support reasoning tasks by enabling researchers to explore hypotheses and analyze data. Prompt engineering is directly supported, allowing for iterative refinement of LLM inputs to achieve desired outputs. Furthermore, LLMs are utilized to automatically generate and maintain documentation related to experiments, datasets, and methodologies, ensuring comprehensive record-keeping and reproducibility throughout the SHAPR Cycle. This integration streamlines workflows and enhances the overall efficiency of research activities.
Artefact Evolution within SHAPR is implemented through a system of iterative refinement and persistent storage of all generated outputs. Each iteration of the SHAPR Cycle doesn’t overwrite previous work; instead, new artefacts are explicitly linked to their predecessors, creating a traceable lineage. This ensures that all intermediate results, including models, prompts, datasets, and analyses, are preserved and readily accessible. The accumulated artefacts form a continuously expanding knowledge base, facilitating reproducibility, enabling comparative analysis of different approaches, and providing a foundation for future research and development. This approach contrasts with traditional workflows where only final results are typically retained, leading to a loss of valuable information regarding the research process.
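The predecessor-linking scheme described above can be sketched as artefacts carrying an explicit back-reference, so lineage is recovered by walking the chain. The class and function names here are illustrative, not taken from the framework:

```python
from dataclasses import dataclass

@dataclass
class Artefact:
    """A versioned research output (model, prompt, dataset, or analysis)."""
    name: str
    cycle_iteration: int
    predecessor: "Artefact | None" = None  # explicit link; nothing is overwritten

def lineage(artefact: Artefact) -> list[str]:
    """Walk predecessor links back to the origin and return the chain, oldest first."""
    chain = []
    node = artefact
    while node is not None:
        chain.append(f"{node.name} (iteration {node.cycle_iteration})")
        node = node.predecessor
    return list(reversed(chain))
```

Because every artefact keeps its predecessor, any result can be traced back to the iteration that produced it, which is the traceability property the workspace is built around.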

Towards AI-Executable Research: The Future of Knowledge Capture
The SHAPR framework is designed to move beyond simply presenting research findings to actively enabling AI systems to execute research protocols. This is achieved by structuring knowledge in a way that machines can interpret, allowing them to automate tasks ranging from data collection and analysis to hypothesis testing and even experimental design. Instead of requiring researchers to manually translate findings into code or algorithms, SHAPR provides a standardized, machine-readable format for capturing the core logic of a study. This capability promises to significantly accelerate the research lifecycle, enabling rapid iteration and exploration of complex scientific questions, and ultimately fostering a collaborative relationship between human researchers and artificial intelligence in the pursuit of discovery.
The automated synthesis of insights and generation of novel hypotheses is now increasingly achievable through knowledge extraction techniques, particularly those leveraging Structured Knowledge Units (SKUs). These SKUs function as discrete, standardized representations of research findings, allowing artificial intelligence systems to not merely access information, but to actively process and connect it in meaningful ways. By decomposing complex research into these granular units, algorithms can identify patterns, contradictions, and gaps in existing knowledge that might elude human observation. This capability moves beyond simple data aggregation, enabling the AI to formulate testable predictions and propose entirely new research directions, effectively accelerating the cycle of scientific discovery and potentially unlocking breakthroughs across diverse fields.
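As a toy illustration of how granular SKUs make contradictions machine-detectable, the sketch below groups SKUs by problem statement and flags problems whose reported metric values diverge beyond a tolerance. The function, field names, and tolerance threshold are all assumptions for illustration only:

```python
from collections import defaultdict

def find_contradictions(skus: list[dict], metric: str, tolerance: float = 0.1) -> list[str]:
    """Flag problem statements whose SKUs report widely divergent values for `metric`."""
    by_problem = defaultdict(list)
    for sku in skus:
        value = sku.get("evaluation_metrics", {}).get(metric)
        if value is not None:
            by_problem[sku["problem_statement"]].append(value)
    flagged = []
    for problem, values in by_problem.items():
        # Two SKUs addressing the same problem but reporting very different
        # results are a candidate contradiction (or an undocumented confound).
        if len(values) > 1 and max(values) - min(values) > tolerance:
            flagged.append(problem)
    return flagged
```

Even this trivial check is only possible because the SKU format standardises where the problem statement and evaluation metrics live; free-text papers offer no such hook.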
The SHAPR framework dramatically accelerates innovation by moving beyond simple data storage to a structured capture of research knowledge itself. This isn’t merely digitizing papers, but meticulously representing the underlying logic – the methods, assumptions, and relationships between findings – in a format readily accessible to both humans and machines. By explicitly defining the components of a study and their interconnections, SHAPR enables automated analysis, validation, and extension of existing research. This allows for rapid synthesis of information across disparate fields, identification of knowledge gaps, and ultimately, the generation of novel hypotheses with increased efficiency. The framework therefore doesn’t just speed up individual experiments, but fosters a continuous cycle of learning and discovery, significantly compressing the timeline from initial inquiry to impactful innovation.
The envisioned future of scientific inquiry, powered by frameworks like SHAPR, transcends simple automation; it anticipates a collaborative synergy between artificial intelligence and human researchers. Rather than merely processing data, these systems are designed to actively participate in the research cycle – formulating hypotheses, designing experiments, analyzing results, and even identifying previously unseen connections within complex datasets. This isn’t about replacing scientists, but rather augmenting their capabilities, allowing them to focus on the most creative and strategic aspects of discovery while AI handles the computationally intensive and time-consuming tasks. Such a paradigm shift promises not only an acceleration of the scientific process but also the potential to unlock insights that might remain hidden through traditional methods, ultimately driving a new era of innovation and expanding the boundaries of human knowledge.

The SHAPR framework, as detailed in the research, prioritizes a systematic approach to knowledge generation, mirroring a commitment to minimizing unnecessary complexity. This echoes Barbara Liskov’s sentiment: ‘Programs must be correct and understandable.’ The framework’s emphasis on meticulously documenting each stage of AI-assisted development, from initial Action Design Research through iterative refinement, directly addresses the need for understandable and reproducible results. By structuring research around discrete Knowledge Units and prioritizing human-centered decision-making, SHAPR actively combats the potential for opacity that can plague AI-driven projects. The pursuit of clarity in process directly yields clarity in outcome, aligning with a core principle of efficient and reliable software development.
Further Refinements
The SHAPR framework, while addressing critical needs for rigor in human-AI collaborative research, does not dissolve the fundamental problem. Traceability, even when meticulously maintained, reveals only how decisions were made, not whether those decisions were, in retrospect, optimal. The framework offers a map of the process, but not a guarantee of destination. Future work must address methods for evaluating the quality of knowledge units themselves – a taxonomy of flawed knowledge, if you will – and mechanisms for automated refinement.
Current iterations emphasize iterative development, a virtue. Yet, iteration, without clear termination criteria, risks infinite loops. The field requires formalization of ‘sufficient’ knowledge – a pragmatic threshold beyond which further refinement yields diminishing returns. This is not merely a technical challenge, but a philosophical one. What constitutes ‘knowing enough’ in a domain increasingly mediated by opaque algorithms?
Ultimately, the value of SHAPR – or any such framework – resides not in its complexity, but in its capacity for subtraction. The goal is not to capture everything, but to identify and discard that which is irrelevant, inaccurate, or actively misleading. Clarity is, after all, the minimum viable kindness.
Original article: https://arxiv.org/pdf/2603.25660.pdf
Contact the author: https://www.linkedin.com/in/avetisyan/