DIY Autonomy: Building a Self-Driving Lab with Off-the-Shelf Parts

Author: Denis Avetisyan


Researchers demonstrate a low-cost, internet-connected platform for automated physics experiments, paving the way for hands-on machine learning education and autonomous discovery.

An affordable, closed-loop system iteratively refines voltage commands to an LED array – guided by traversal search, Bayesian optimization with a probabilistic surrogate model [latex]\mu(x)[/latex], or a deep neural network – and compares the resulting discrete spectrum, detected by a multichannel light sensor, to a user-defined target, effectively creating a self-driving optical experiment.

This work details the construction of an affordable, IoT-based laboratory integrating machine learning algorithms for closed-loop control of optical spectroscopy, employing techniques like Bayesian optimization and deep learning.

Despite the growing prominence of machine learning in modern physics, hands-on educational experiences remain hindered by cost and complexity. This work, ‘Building an Affordable Self-Driving Lab: Practical Machine Learning Experiments for Physics Education Using Internet-of-Things’, introduces a low-cost, open-source Internet-of-Things platform enabling closed-loop control of optical experiments and facilitating practical training in foundational machine learning algorithms. We demonstrate the system’s efficacy through comparisons of traversal methods, Bayesian inference, and deep learning, highlighting the latter’s superior performance in capturing nonlinear relationships within optical datasets. Could such accessible, self-driving laboratories empower a new generation of physicists and engineers to seamlessly integrate advanced machine learning techniques into their research?


The Emergence of Spectral Control

The creation of desired light spectra using LED arrays presents a significant engineering challenge, stemming from the necessity of individually controlling the voltage supplied to each diode. Unlike single-source illumination, achieving a specific color mix demands meticulous calibration; each LED’s output is highly sensitive to even minor fluctuations in voltage. Because the intensity and wavelength of emitted light are directly correlated to the current flowing through the semiconductor material, precisely regulating this current, and thus the voltage, across dozens or even hundreds of LEDs is crucial. This isn’t merely a matter of setting a single overall brightness, but of fine-tuning the contribution of each diode to sculpt the final spectral power distribution, demanding sophisticated control algorithms and highly stable power delivery systems to ensure both accuracy and repeatability.

Conventional spectral matching techniques frequently rely on iterative adjustments of light sources or extensive calibration procedures, proving cumbersome and time-consuming when the desired spectrum needs to change rapidly. These methods often involve sequentially modifying individual components – filters, diffusers, or even entire light sources – to approximate the target spectrum, a process that struggles to keep pace with dynamic applications like hyperspectral imaging or real-time display adjustments. The inherent limitations of these approaches become particularly evident in scenarios demanding high spectral resolution and refresh rates, where the latency associated with traditional calibration and adjustment cycles can significantly degrade performance and introduce noticeable artifacts. Consequently, achieving efficient and precise spectral control in these evolving fields requires a paradigm shift toward more responsive and computationally agile methodologies.

The demand for precise spectral control extends beyond theoretical pursuits, becoming increasingly vital for technologies reliant on light manipulation. Advanced display systems, for instance, require nuanced color reproduction and dynamic adjustment of emitted spectra to achieve realistic imagery and expanded color gamuts. Simultaneously, scientific instrumentation – encompassing areas like spectroscopy, microscopy, and flow cytometry – depends on the ability to rapidly and accurately tune light sources for optimal signal acquisition and analysis. These applications necessitate not only the creation of specific wavelengths, but also the swift and repeatable modification of spectral output, pushing the boundaries of current light source technology and control systems to deliver improved performance and unlock new capabilities in diverse fields.

A dataset of 100,000 voltage-spectrum pairs was created by measuring individual LED spectral responses, generating synthetic composite spectra from randomized voltage vectors, and normalizing both input voltages and spectral outputs to accelerate deep learning controller training.
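The dataset-generation recipe in the caption can be sketched in a few lines of NumPy. This is illustrative only: the Gaussian per-LED spectral bands and the linear (voltage-weighted) superposition are assumptions standing in for the measured single-LED responses, and the sample count is reduced from 100,000 to 1,000 to keep the sketch fast.

```python
import numpy as np

rng = np.random.default_rng(0)
N_LEDS, N_CHANNELS, N_SAMPLES = 8, 8, 1000  # reduced from 100,000 for the sketch

# Hypothetical per-LED spectral responses: one Gaussian band per LED across
# the sensor's 8 channels (a stand-in for the measured responses).
centers = np.linspace(0, N_CHANNELS - 1, N_LEDS)
channels = np.arange(N_CHANNELS)
basis = np.exp(-0.5 * ((channels[None, :] - centers[:, None]) / 1.0) ** 2)

# Randomized voltage vectors -> synthetic composite spectra (assumes the
# composite spectrum is a voltage-weighted sum of single-LED responses).
voltages = rng.uniform(0.0, 5.0, size=(N_SAMPLES, N_LEDS))
spectra = voltages @ basis

# Normalize both inputs and outputs to [0, 1] to speed up network training.
v_norm = voltages / 5.0
s_norm = spectra / spectra.max()
```

Normalizing inputs and targets to a common scale is a standard trick that keeps gradient magnitudes balanced across channels during training.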

Building Blocks for Spectral Autonomy

The IoT platform consists of an LED array functioning as a programmable light source, a multispectral sensor for light analysis, and an Arduino microcontroller serving as the central processing unit. The Arduino facilitates communication between the sensor and the LED array, enabling closed-loop control. Data acquired by the multispectral sensor, representing the emitted light’s spectral characteristics, is processed by the Arduino. Based on this analysis, the Arduino adjusts the LED array’s output, creating a fully integrated system capable of autonomous spectral optimization and data logging. This combination of hardware components allows for real-time spectral adjustments and provides a foundation for applications requiring precise light control and spectral analysis.
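The closed loop described above can be sketched host-side in plain Python. Everything here is a stand-in: `plant` replaces the Arduino-plus-sensor hardware with a toy linear response, and the random-perturbation update is a placeholder for the search algorithms compared later in the article.

```python
import random

N = 4  # LEDs / sensor channels in this toy example

def plant(voltages):
    # Hypothetical plant: each channel responds linearly to one LED.
    # In the real system this is "send voltages over serial, read the sensor".
    return [0.8 * v for v in voltages]

def spectral_error(measured, target):
    # Sum-of-squares mismatch between measured and target spectra.
    return sum((m - t) ** 2 for m, t in zip(measured, target))

def closed_loop(target, iters=200, step=0.05, seed=0):
    rng = random.Random(seed)
    v = [rng.uniform(0, 5) for _ in range(N)]            # random starting voltages
    best_v, best_e = v[:], spectral_error(plant(v), target)
    for _ in range(iters):
        # Perturb the best-known voltages, clamped to the 0-5 V range.
        cand = [min(5.0, max(0.0, x + rng.gauss(0, step * 5))) for x in best_v]
        e = spectral_error(plant(cand), target)
        if e < best_e:                                   # keep only improvements
            best_v, best_e = cand, e
    return best_v, best_e

v, e = closed_loop(target=[1.0, 2.0, 0.5, 3.0])
```

The loop's structure (command, measure, compare, update) is the same whichever optimizer fills the "update" step.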

Precise spectral manipulation is achieved through the implementation of a Digital-to-Analog Converter (DAC) which regulates the voltage supplied to individual LEDs within the array. This DAC allows for discrete voltage levels to be applied, effectively modulating the intensity of light emitted at specific wavelengths. The resolution of the DAC (specifically, the number of addressable voltage steps) directly correlates to the granularity of spectral control achievable; a higher resolution DAC enables finer adjustments to the emitted spectrum. By varying the voltage applied to each LED, the platform facilitates experimentation with customized spectral profiles for applications requiring specific wavelength compositions and intensities.
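The resolution-to-granularity relationship is easy to make concrete. The sketch below assumes a 12-bit DAC on a 5 V reference (e.g. an MCP4725, a common Arduino companion; the article does not name the actual part), so the finest voltage step is one least-significant bit, about 1.2 mV.

```python
def dac_code(voltage, v_ref=5.0, bits=12):
    """Map a requested voltage to the nearest DAC code (clamped to range)."""
    levels = 2 ** bits
    code = round(voltage / v_ref * (levels - 1))
    return max(0, min(levels - 1, code))

def dac_voltage(code, v_ref=5.0, bits=12):
    """Actual output voltage produced by a given DAC code."""
    return code / (2 ** bits - 1) * v_ref

# One LSB is the finest spectral adjustment available:
lsb = dac_voltage(1)   # ~1.2 mV per step at 12 bits / 5 V
```

Doubling the bit depth to 16 would shrink the step to roughly 76 µV, which is the sense in which DAC resolution sets the granularity of spectral control.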

The developed IoT platform lowers the barrier to entry for research and development in spectral optimization due to its cost-effective design and adaptable architecture. Utilizing readily available components – an LED array, multispectral sensor, and Arduino microcontroller – the system provides a complete, functional testbed at a fraction of the cost of traditional laboratory equipment. This affordability enables wider accessibility for academic institutions, hobbyists, and small businesses to explore advanced techniques such as spectral shaping, dynamic lighting control, and plant-specific illumination strategies. Furthermore, the platform’s open-source nature and modular design facilitate customization and extension, allowing users to tailor the system to specific experimental requirements and integrate it with other IoT devices and data analysis pipelines.

This closed-loop IoT platform enables self-driving optical experiments by iteratively adjusting LED array voltages [latex]\text{V}[/latex] to generate spectra, sensing the resulting discrete bands with an AS7341 sensor, and comparing the measured response (F1-F8) to a target spectrum defined on a host computer, all managed by an Arduino microcontroller.

Algorithms in Search of Spectral Convergence

The research evaluated three algorithms – Traversal, Bayesian Optimization, and Deep Learning – to determine their effectiveness in optimizing spectral matching. The Traversal algorithm implemented a systematic, iterative search of the parameter space. Bayesian Optimization employed Gaussian Process Regression to model the objective function and guide the search, aiming to minimize the number of required iterations. Finally, Deep Learning utilized Convolutional Neural Networks trained on a synthetic dataset comprising 100,000 voltage-spectrum pairs to learn the mapping between input voltage and target spectrum. Comparative analysis was performed to assess the convergence speed, accuracy, and data requirements of each approach.
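One plausible reading of the "systematic, iterative" traversal method is a coordinate sweep: scan each LED's voltage over a coarse grid while holding the others fixed, then repeat. The sketch below is a guess at that structure, not the paper's exact procedure, and uses a toy separable objective in place of the measured spectral error.

```python
def traversal(objective, n_leds=4, grid=None, passes=2):
    """Coordinate-wise grid sweep: optimize one LED voltage at a time."""
    grid = grid or [i * 0.5 for i in range(11)]        # 0.0 .. 5.0 V in 0.5 V steps
    v = [2.5] * n_leds                                 # start mid-range
    for _ in range(passes):
        for i in range(n_leds):
            # Sweep LED i over the grid with the other voltages held fixed.
            v[i] = min(grid, key=lambda g: objective(v[:i] + [g] + v[i + 1:]))
    return v

# Toy separable objective: squared distance to a known optimum
# (stands in for the measured spectral mismatch).
target = [1.0, 3.5, 0.5, 2.0]
err = lambda v: sum((a - b) ** 2 for a, b in zip(v, target))
result = traversal(err)
```

Such a sweep is informative but expensive: its cost grows with grid density times the number of LEDs per pass, which is exactly why the smarter optimizers below need fewer measurements.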

Bayesian Optimization, employing Gaussian Process Regression, exhibited faster convergence compared to the Traversal Algorithm during spectral matching optimization. This efficiency stems from the Gaussian Process’s ability to model the objective function and intelligently select subsequent sampling points, reducing the number of iterations required to locate optimal parameters. Specifically, the algorithm leverages probabilistic modeling to balance exploration of the search space with exploitation of promising regions, leading to a more efficient search strategy than the Traversal Algorithm’s systematic, but less informed, approach. The observed reduction in iterations directly translates to decreased computational cost and faster optimization times.
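The explore/exploit balance described above can be demonstrated end-to-end in a one-dimensional toy. The hand-rolled Gaussian process (RBF kernel) and lower-confidence-bound acquisition below are a minimal sketch, not the paper's implementation, and the scalar objective stands in for the multi-voltage spectral error.

```python
import numpy as np

def rbf(a, b, ls=0.5):
    # Squared-exponential kernel between two 1-D point sets.
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ls ** 2)

def gp_posterior(X, y, Xs, noise=1e-4):
    # Standard GP regression equations: posterior mean mu(x) and std dev.
    K = rbf(X, X) + noise * np.eye(len(X))
    Ks, Kss = rbf(X, Xs), rbf(Xs, Xs)
    Kinv = np.linalg.inv(K)
    mu = Ks.T @ Kinv @ y
    var = np.diag(Kss - Ks.T @ Kinv @ Ks)
    return mu, np.sqrt(np.maximum(var, 0.0))

f = lambda x: (x - 2.0) ** 2                 # toy "spectral error", minimum at x = 2
grid = np.linspace(0, 5, 201)
X = np.array([0.5, 4.5])                     # two initial measurements
y = f(X)

for _ in range(15):
    mu, sd = gp_posterior(X, y, grid)
    # Lower confidence bound: low predicted error OR high uncertainty.
    x_next = grid[np.argmin(mu - 2.0 * sd)]
    X, y = np.append(X, x_next), np.append(y, f(x_next))

best = float(X[np.argmin(y)])
```

The acquisition term `mu - 2*sd` is what makes the search "informed": early iterations chase large `sd` (unexplored voltages), later ones chase small `mu` (promising ones), which is the mechanism behind the faster convergence reported here.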

Deep Learning models, specifically Convolutional Neural Networks (CNNs), exhibited superior performance in spectral reproduction compared to other tested algorithms. This accuracy was achieved through training on a synthetically generated dataset comprising 100,000 paired voltage and corresponding spectral data points. However, the CNN’s effectiveness is directly correlated to the volume of training data; the substantial dataset was a prerequisite for achieving high reproduction accuracy, indicating a data-intensive characteristic for this approach. Performance gains were not observed with smaller datasets, highlighting the need for extensive data to properly train the network and avoid underfitting.
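To make the CNN's role concrete, the sketch below implements one valid-mode 1-D convolution layer with ReLU over an 8-channel spectrum in plain NumPy. It is illustrative only: the paper's networks stack deeper filter banks (32-64-128 or 64-128-256) with dense output layers, and the random weights here would of course need training on the voltage-spectrum dataset.

```python
import numpy as np

def conv1d(x, kernels, bias):
    """Valid-mode 1-D convolution + ReLU: x (channels,), kernels (n_filters, k)."""
    k = kernels.shape[1]
    n_out = len(x) - k + 1
    out = np.empty((kernels.shape[0], n_out))
    for f in range(kernels.shape[0]):
        for i in range(n_out):
            # Each output is a dot product of a k-wide window with filter f.
            out[f, i] = x[i:i + k] @ kernels[f] + bias[f]
    return np.maximum(out, 0.0)              # ReLU activation

rng = np.random.default_rng(0)
spectrum = rng.random(8)                     # one normalized 8-channel reading
feats = conv1d(spectrum, rng.normal(size=(4, 3)), np.zeros(4))
# feats has shape (4, 6): 4 filters over 6 valid window positions
```

Sliding the same small filters across neighboring channels lets the network learn local spectral shapes (peaks, slopes) that are reused everywhere along the spectrum, which is why convolutions suit this data better than treating the eight channels as unrelated inputs.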

Bayesian optimization progressively refines a surrogate model [latex]\mu(x)[/latex] of the spectrum-voltage relationship, demonstrated by a reduction in predictive uncertainty and convergence of experimentally obtained spectra (solid lines) to the desired target spectra (dashed lines) over 400 iterations.

The System Validated: Precision in Spectral Output

The core of the system’s functionality rested on the successful implementation of closed-loop control across all three tested algorithms – Deep Learning, Traversal, and Bayesian optimization. Each algorithm demonstrated the ability to iteratively adjust parameters and converge toward the desired target spectra, effectively manipulating the light emitted by the system. While all achieved this fundamental control, the degree of accuracy varied considerably between methods; some algorithms exhibited faster convergence and tighter alignment with the target, while others required more iterations and yielded comparatively broader spectral deviations. This successful convergence, even with differing levels of precision, validated the foundational framework of the closed-loop control system and established a basis for comparative analysis of algorithm performance in capturing complex spectral relationships.

Spectral data acquisition relied on the AS7341, an 8-channel multispectral sensor capable of discerning subtle variations in light across the visible spectrum. This compact sensor simultaneously measures light intensity in eight distinct bands – channels F1 through F8, centered at 415, 445, 480, 515, 555, 590, 630, and 680 nm – providing a detailed spectral ‘fingerprint’ of the illuminated surface. The AS7341’s integration facilitated real-time, non-destructive analysis, proving crucial for the closed-loop control algorithms as it enabled rapid feedback on spectral characteristics. By capturing these discrete wavelengths, the system moved beyond simple RGB color detection to more nuanced spectral analysis, laying the foundation for precise spectral matching and correction.
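A minimal representation of the sensor's output, using the F1-F8 channel centers from the AS7341 datasheet, together with the kind of sum-of-squares mismatch a closed-loop controller would minimize. The error function itself is an assumption; the article does not specify the exact metric used.

```python
# F1-F8 center wavelengths (nm) per the AS7341 datasheet.
AS7341_NM = {"F1": 415, "F2": 445, "F3": 480, "F4": 515,
             "F5": 555, "F6": 590, "F7": 630, "F8": 680}

def spectral_error(measured, target):
    """Sum of per-channel squared errors between two normalized readings."""
    assert set(measured) == set(target) == set(AS7341_NM)
    return sum((measured[ch] - target[ch]) ** 2 for ch in AS7341_NM)

# Example: a flat spectrum vs. one with a boosted green (F5, 555 nm) channel.
flat = {ch: 0.5 for ch in AS7341_NM}
peaked = dict(flat, F5=1.0)
mismatch = spectral_error(flat, peaked)   # 0.25, all from the F5 channel
```

Eight discrete bands are coarse compared with a lab spectrometer, but they are exactly the resolution at which the controller's target spectra are defined, so the comparison is apples-to-apples.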

The study revealed a significant performance advantage for the deep learning algorithm when compared to both traversal and Bayesian optimization methods in achieving precise spectral control. This superiority stems from the deep learning model’s capacity to effectively model the complex, nonlinear relationships inherent in the spectral data. While traversal and Bayesian approaches demonstrated functional closed-loop control, they struggled to capture the intricacies needed for high-accuracy spectral convergence; the deep learning model consistently achieved a lower error rate and faster convergence, validating its efficacy as a robust solution for applications demanding precise spectral manipulation and highlighting its potential for advanced spectral control systems.

A deep-learning controller utilizing a convolutional neural network with either a 32-64-128 or 64-128-256 filter configuration successfully optimizes LED voltages in a closed loop to achieve target spectra, with the deeper network demonstrating significantly improved accuracy and generalization.

The pursuit of autonomous experimentation, as detailed in this work, echoes a fundamental principle: order manifests through interaction, not control. This platform, leveraging IoT and machine learning for closed-loop optical experiments, doesn’t impose a rigid structure but allows it to emerge from the interplay of sensors, algorithms, and physical processes. As Sergey Sobolev noted, “The most effective control is often the art of letting things be.” This sentiment aligns perfectly with the system’s Bayesian optimization and deep learning components, which learn and adapt to the experiment’s dynamics rather than dictating them. Sometimes inaction, allowing the system to explore and refine its parameters, is the most potent tool in achieving robust and insightful results.

Emergent Systems and the Future of Experimentation

The presented work, while a practical demonstration of affordable automation, subtly highlights a deeper shift in how experimentation itself is conceived. The system is a living organism where every local connection matters; the elegance isn’t in imposing a grand design, but in allowing intelligent behavior to emerge from the interplay of inexpensive sensors and algorithms. Current limitations – the scope of optical spectroscopy, the specifics of Bayesian optimization – are less critical than the revealed principle: control is an illusion, influence is real. The focus will inevitably move towards platforms that aren’t programmed to discover, but evolve towards understanding.

Future iterations will likely abandon the notion of a ‘self-driving lab’ as a monolithic entity. Instead, one anticipates a proliferation of specialized, interconnected ‘nodes’ – each optimizing a narrow experimental parameter – communicating and competing to refine a broader model. The true challenge isn’t building more powerful algorithms, but designing systems that gracefully handle the inherent noise and ambiguity of real-world data, and that can learn from their failures as effectively as their successes.

Top-down control often suppresses creative adaptation. The promise of this approach lies in its potential to foster a more decentralized, resilient, and ultimately, more insightful form of scientific inquiry – one where the experiment guides the researcher, rather than the other way around. The next step isn’t simply automating existing procedures, but allowing the system to suggest, and even design, its own experiments.


Original article: https://arxiv.org/pdf/2604.13139.pdf

Contact the author: https://www.linkedin.com/in/avetisyan/


2026-04-16 23:44