Author: Denis Avetisyan
This review details the integrated design and development of an autonomous delivery robot leveraging AI-driven perception and real-time control systems.
A unified approach to robotic design combining AI path planning, embedded systems, ROS 2, and heterogeneous computing architectures for robust autonomous navigation and package delivery.
Achieving truly autonomous navigation demands seamless integration of high-level intelligence with precise, real-time control, a traditionally challenging synthesis. This paper details ‘A Unified AI, Embedded, Simulation, and Mechanical Design Approach to an Autonomous Delivery Robot’, presenting a fully integrated platform leveraging heterogeneous computing: ROS 2 for AI-driven perception and path planning coupled with a FreeRTOS-based ESP32 for deterministic motor control. The resulting system demonstrates robust navigation and payload delivery capabilities through optimized mechanical design, rigorous memory management, and a failsafe architecture monitored via AWS IoT. Could this unified, multi-disciplinary methodology represent a scalable blueprint for future autonomous robotic systems operating in complex, real-world environments?
The Inevitable Convergence of Commerce and Automation
The escalating demands of modern commerce and the rise of e-commerce have created a significant bottleneck in last-mile delivery – the final leg of the journey from a transportation hub to the customer’s doorstep. This phase represents a disproportionately high percentage of overall shipping costs and is plagued by inefficiencies such as failed deliveries, traffic congestion, and a shortage of human drivers. Consequently, there is increasing pressure to find scalable and cost-effective solutions, and robotic delivery systems are emerging as a pivotal technology to address these challenges. The need isn’t simply about automating existing processes; it’s about fundamentally rethinking logistics to handle the increasing volume of packages with greater reliability and reduced environmental impact, positioning autonomous robots as crucial components of future supply chains.
Existing autonomous delivery systems frequently struggle when faced with the unpredictable nature of real-world scenarios. Many designs prove brittle when encountering obstacles not explicitly programmed, or fail to adjust to shifting pedestrian traffic and evolving urban landscapes. Current robotic platforms often rely on meticulously mapped environments and pre-defined routes, limiting their operational scope and necessitating constant human intervention when deviations occur. This lack of robustness stems from difficulties in sensor data interpretation, inadequate algorithms for dynamic path planning, and limited capacity to learn from unexpected events – hindering the widespread adoption of autonomous delivery in truly complex, dynamic environments.
The newly developed Autonomous Delivery Robot represents a significant advancement in logistical automation, achieved through a tightly integrated hardware and software system. This design moves beyond the limitations of existing robotic delivery platforms by combining a robust, all-terrain chassis with sophisticated perception and navigation algorithms. The robot utilizes a suite of sensors – including LiDAR, cameras, and ultrasonic detectors – to build a detailed map of its surroundings and dynamically adjust its path to avoid obstacles. Crucially, the software architecture emphasizes modularity and scalability, allowing for easy adaptation to varying delivery scenarios and future feature enhancements. This cohesive approach not only enables reliable navigation in complex, real-world environments, but also lays the foundation for a cost-effective and versatile delivery solution.
The design of this autonomous delivery robot centers on principles of safety, efficiency, and scalability to facilitate practical implementation in varied environments. Rigorous testing confirms the robot’s ability to reliably transport a 15 kg payload, representing a substantial capacity for common delivery applications. This achievement is enabled by a cohesive integration of advanced hardware and intelligent software, ensuring stable navigation and obstacle avoidance even in dynamic, unpredictable settings. Furthermore, the system is architected for scalability, allowing for fleet expansion and adaptation to diverse logistical demands, positioning it as a viable solution for last-mile delivery challenges.
Strategic Heterogeneity: A Computational Decomposition
The system architecture utilizes a heterogeneous computing approach by integrating the Raspberry Pi 5 and the ESP32 microcontroller. This design strategically assigns tasks to each processor based on their respective capabilities; the Raspberry Pi 5, with its ARM Cortex-A76 processor, manages complex computations, while the ESP32 handles time-critical operations. Communication between the two processors is facilitated via a serial interface, enabling data exchange and coordinated control. This division of labor allows for efficient resource allocation and optimizes the system for both high-level cognitive tasks and precise, low-latency physical control.
The Raspberry Pi 5 is utilized for computationally intensive tasks including AI-powered perception and path planning due to its significant processing capabilities. These functions require substantial floating-point operations and parallel processing, areas where the Raspberry Pi 5’s quad-core ARM Cortex-A76 processor excels. Specifically, the Pi 5 handles sensor data fusion, object recognition from camera input, and the generation of optimal navigation trajectories. Offloading these tasks from the ESP32 frees the microcontroller to focus on critical, time-sensitive operations and contributes to a more efficient system architecture.
The ESP32 microcontroller is utilized for direct, real-time control of robotic actuators and data acquisition due to its inherent deterministic behavior. This ensures predictable timing for critical functions like motor command execution and sensor readings, independent of higher-level software processes running on the Raspberry Pi 5. The ESP32’s architecture minimizes latency and jitter, allowing for precise control loops and responsive system behavior. Specifically, its capabilities facilitate high-frequency sampling of sensor data and rapid adjustments to motor outputs, critical for maintaining stability and accuracy in dynamic environments. This dedicated hardware control loop operates separately from the main processing unit, preventing delays caused by operating system scheduling or computationally intensive tasks.
The system architecture distributes computational load to optimize both power usage and performance. The ESP32 is dedicated to executing a deterministic control loop, specifically a Proportional-Integral-Derivative (PID) controller, at a high frequency. This ensures precise and timely responses to sensor data and enables real-time motor control. By offloading this critical, time-sensitive task from the Raspberry Pi 5, the overall system reduces latency and power draw, as the Raspberry Pi 5 can then focus on less time-critical, higher-level processing functions like perception and path planning. This division of labor allows each processor to operate more efficiently within its designated role.
Real-Time Kinematics and Environmental Mapping
The ESP32 microcontroller, operating under the FreeRTOS real-time operating system, is responsible for direct motor control via Proportional-Integral-Derivative (PID) control loops. This implementation allows for precise regulation of motor speed and position by continuously calculating the error between a desired setpoint and the actual motor output. The PID controller adjusts the motor’s power based on proportional, integral, and derivative terms of this error, minimizing overshoot and oscillation and ensuring smooth and accurate movements. FreeRTOS task scheduling prioritizes these low-level control functions, guaranteeing responsiveness and deterministic behavior crucial for robotic applications.
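As an illustration of that structure, the following minimal sketch shows what a fixed-rate PID task under FreeRTOS could look like on the ESP32. The gains, the 100 Hz loop rate, and the helper functions readEncoderSpeed() and setMotorPwm() are hypothetical placeholders, not values or interfaces taken from the paper.

```cpp
// Minimal sketch of a fixed-rate PID motor task under FreeRTOS on the ESP32.
// readEncoderSpeed() and setMotorPwm() are hypothetical hardware helpers;
// gains and the 100 Hz rate are illustrative, not values from the paper.
#include "freertos/FreeRTOS.h"
#include "freertos/task.h"

static volatile float g_setpoint = 0.0f;   // target wheel speed, written by the comms task

extern float readEncoderSpeed();           // hypothetical: measured wheel speed (m/s)
extern void  setMotorPwm(float command);   // hypothetical: clamped PWM output

static void pidTask(void *arg) {
    const float kp = 1.2f, ki = 0.4f, kd = 0.02f;    // illustrative gains
    const float dt = 0.01f;                          // 100 Hz control loop
    const TickType_t period = pdMS_TO_TICKS(10);
    float integral = 0.0f, prevError = 0.0f;
    TickType_t lastWake = xTaskGetTickCount();

    for (;;) {
        float error = g_setpoint - readEncoderSpeed();
        integral += error * dt;
        float derivative = (error - prevError) / dt;
        prevError = error;

        setMotorPwm(kp * error + ki * integral + kd * derivative);

        // vTaskDelayUntil keeps the period fixed regardless of loop-body jitter.
        vTaskDelayUntil(&lastWake, period);
    }
}

void startControlTask() {
    // High priority so the control loop preempts non-critical tasks; pinned to core 1.
    xTaskCreatePinnedToCore(pidTask, "pid", 4096, nullptr,
                            configMAX_PRIORITIES - 1, nullptr, 1);
}
```

Using vTaskDelayUntil rather than a plain delay keeps the sample period constant, which is what makes the discrete integral and derivative terms meaningful.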
A UART serial link provides communication between the Raspberry Pi 5 and the ESP32 microcontroller. This connection utilizes the Universal Asynchronous Receiver/Transmitter protocol, enabling full-duplex, asynchronous data transmission. The chosen serial interface prioritizes simplicity and reliability for transmitting control signals from the Raspberry Pi 5 to the ESP32, and for receiving sensor data and status updates in return. Data transfer rates were optimized to maintain real-time responsiveness while ensuring data integrity between the two processing units.
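The exact frame format and baud rate are not specified here, so the sketch below assumes a simple newline-terminated ASCII command such as "V <left> <right>" carrying wheel-speed targets, parsed on the ESP32 with the ESP-IDF UART driver (driver installation and pin configuration are omitted).

```cpp
// Hypothetical line-based frame from the Pi 5, e.g. "V 0.35 -0.10\n"
// (left/right wheel speeds in m/s). The real frame format is an assumption;
// this only illustrates the parsing pattern on the ESP32 side.
#include <cstdio>
#include "driver/uart.h"

static const uart_port_t UART_PORT = UART_NUM_1;   // assumes uart_driver_install() done elsewhere

void pollUartFrame(float &left, float &right) {
    static char line[64];
    static size_t len = 0;
    uint8_t byte;
    // Read one byte at a time; a production implementation would use the UART event queue.
    while (uart_read_bytes(UART_PORT, &byte, 1, 0) == 1) {
        if (byte == '\n') {
            line[len] = '\0';
            len = 0;
            float l, r;
            if (std::sscanf(line, "V %f %f", &l, &r) == 2) {
                left = l;
                right = r;
            }
        } else if (len < sizeof(line) - 1) {
            line[len++] = static_cast<char>(byte);
        } else {
            len = 0;   // frame too long: drop it and resynchronize on the next newline
        }
    }
}
```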
The Raspberry Pi 5 implements Simultaneous Localization and Mapping (SLAM) using the GMapping algorithm to construct maps and determine its position within an environment. This is achieved by fusing data from a LiDAR sensor, which provides range measurements, and an Inertial Measurement Unit (IMU), which measures angular velocity and linear acceleration. The ROS 2 framework manages the data processing and algorithm execution, enabling robust localization and mapping capabilities even in dynamic or visually sparse environments. The integrated sensor data allows for accurate pose estimation and the creation of detailed 2D maps for navigation and environmental understanding.
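On the Raspberry Pi side, the sensor inputs to SLAM arrive as standard ROS 2 messages. The minimal rclcpp node below only subscribes to a LiDAR scan topic and an IMU topic to show the data plumbing; the topic names are assumptions, and the GMapping node itself is not reproduced.

```cpp
// Minimal rclcpp node subscribing to the LiDAR and IMU streams consumed by the
// SLAM pipeline. Topic names ("scan", "imu/data") are assumptions; the SLAM
// algorithm runs in its own node and is not shown here.
#include <rclcpp/rclcpp.hpp>
#include <sensor_msgs/msg/laser_scan.hpp>
#include <sensor_msgs/msg/imu.hpp>

class SensorBridge : public rclcpp::Node {
public:
  SensorBridge() : Node("sensor_bridge") {
    scan_sub_ = create_subscription<sensor_msgs::msg::LaserScan>(
        "scan", rclcpp::SensorDataQoS(),
        [this](const sensor_msgs::msg::LaserScan &msg) {
          RCLCPP_DEBUG(get_logger(), "scan with %zu ranges", msg.ranges.size());
        });
    imu_sub_ = create_subscription<sensor_msgs::msg::Imu>(
        "imu/data", rclcpp::SensorDataQoS(),
        [this](const sensor_msgs::msg::Imu &msg) {
          RCLCPP_DEBUG(get_logger(), "yaw rate %.3f rad/s", msg.angular_velocity.z);
        });
  }

private:
  rclcpp::Subscription<sensor_msgs::msg::LaserScan>::SharedPtr scan_sub_;
  rclcpp::Subscription<sensor_msgs::msg::Imu>::SharedPtr imu_sub_;
};

int main(int argc, char **argv) {
  rclcpp::init(argc, argv);
  rclcpp::spin(std::make_shared<SensorBridge>());
  rclcpp::shutdown();
  return 0;
}
```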
The Raspberry Pi 5 achieves environmental perception by concurrently running DepthAnything 2 and YOLOv11, utilizing data streams from the Astra Pro RGB-D camera. This system integrates point cloud data from LiDAR, inertial measurements from the IMU, and RGB-D imagery to create a comprehensive understanding of the surrounding environment. A low-latency communication link was implemented between the ROS 2 framework on the Raspberry Pi 5 and the ESP32-based embedded controller, ensuring timely data exchange for coordinated control and perception tasks.
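One representative fusion step is turning a 2D detection into a range estimate using the aligned depth image. The sketch below assumes model inference (YOLOv11 bounding boxes, depth in metres from DepthAnything 2 or the RGB-D camera) happens upstream, and simply takes the median depth inside the box, which is robust to pixels that bleed past the object's edges.

```cpp
// Sketch of the RGB-D fusion step only: given a detector bounding box and an
// aligned depth image (metres), estimate the obstacle's range as the median
// depth inside the box. Model inference is assumed to happen upstream.
#include <opencv2/core.hpp>
#include <algorithm>
#include <vector>

float estimateObstacleRange(const cv::Mat &depthMetres, const cv::Rect &box) {
    // Clip the box to the image so out-of-frame detections do not crash the loop.
    cv::Rect roi = box & cv::Rect(0, 0, depthMetres.cols, depthMetres.rows);
    std::vector<float> samples;
    samples.reserve(static_cast<size_t>(roi.area()));
    for (int y = roi.y; y < roi.y + roi.height; ++y) {
        const float *row = depthMetres.ptr<float>(y);
        for (int x = roi.x; x < roi.x + roi.width; ++x) {
            if (row[x] > 0.0f) samples.push_back(row[x]);   // drop invalid depth pixels
        }
    }
    if (samples.empty()) return -1.0f;                      // no valid depth in the box
    std::nth_element(samples.begin(),
                     samples.begin() + samples.size() / 2, samples.end());
    return samples[samples.size() / 2];                     // median range estimate
}
```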
Robustness Through Redundancy and Persistent Operation
The robot’s power system employs a dual-battery architecture specifically designed to enhance operational stability and minimize electronic noise. This configuration segregates the sensitive microcontroller and sensor circuitry from the high-current demands of the motors, preventing voltage drops and interference that could compromise performance or data accuracy. By isolating the motor circuit, the system ensures a clean and consistent power supply to critical components, leading to more reliable operation and precise control. This design choice not only improves the robot’s responsiveness but also safeguards against potential damage caused by electrical fluctuations, contributing to a more robust and dependable platform.
To guarantee operational safety, the robotic system incorporates multiple failsafe mechanisms designed to immediately shut down motor function in the event of communication disruption or systemic failure. This isn’t simply a pause in operation; rather, a complete and sustained cessation of power to the motors is initiated. This protective measure relies on dedicated hardware monitoring the communication link with the control system; should this link be severed, or if critical system errors are detected, the motors are instantly de-energized. Furthermore, a watchdog timer independently monitors the system’s core functions, triggering a shutdown if the software becomes unresponsive, preventing uncontrolled movement or potentially hazardous behavior. This redundancy ensures the robot remains stable and safe, even in unforeseen circumstances, prioritizing the prevention of damage or injury.
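A minimal version of the communication failsafe could look like the sketch below: a low-rate FreeRTOS task latches the motors off whenever no valid command has arrived within a timeout. The 250 ms timeout and the stopAllMotors() helper are illustrative, not the paper's values, and the ESP32's hardware task watchdog can additionally reset the chip if the firmware itself stalls.

```cpp
// Sketch of a communication-loss failsafe: if no valid command arrives from
// the Pi 5 within a timeout, the motors are de-energized and held off until
// the link recovers. Timeout and stopAllMotors() are illustrative placeholders.
#include "freertos/FreeRTOS.h"
#include "freertos/task.h"

extern void stopAllMotors();                      // hypothetical: cuts PWM, disables drivers

static volatile TickType_t g_lastCmdTick = 0;

// Called by the UART task whenever a well-formed frame is parsed.
void noteCommandReceived() { g_lastCmdTick = xTaskGetTickCount(); }

static void failsafeTask(void *arg) {
    const TickType_t timeout = pdMS_TO_TICKS(250); // illustrative link timeout
    for (;;) {
        if (xTaskGetTickCount() - g_lastCmdTick > timeout) {
            stopAllMotors();                       // latched safe state until new commands arrive
        }
        vTaskDelay(pdMS_TO_TICKS(50));             // check the link 20 times per second
    }
}
```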
To guarantee dependable, continuous operation, the robot’s core programming utilizes static memory allocation on its ESP32 microcontroller. Unlike dynamic allocation, where memory is assigned and released during runtime, static allocation pre-assigns a fixed amount of memory for all essential functions. This preemptive approach effectively eliminates the risk of memory fragmentation – a common issue in long-running applications where repeated allocation and deallocation create scattered, unusable memory blocks. By reserving memory upfront, the system avoids performance degradation and potential crashes that can occur when attempting to allocate larger blocks of memory from a fragmented heap, thereby significantly improving the robot’s long-term reliability and suitability for extended deployments without interruption.
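In FreeRTOS terms, this corresponds to creating tasks with statically reserved stacks and control blocks instead of drawing them from the heap. The sketch below is a generic example of xTaskCreateStatic usage, not the paper's actual task layout; it assumes configSUPPORT_STATIC_ALLOCATION is enabled.

```cpp
// Static task creation under FreeRTOS: the stack and task control block are
// fixed at compile time, so no heap allocation (and no fragmentation) occurs
// at runtime. Requires configSUPPORT_STATIC_ALLOCATION.
#include "freertos/FreeRTOS.h"
#include "freertos/task.h"

static StackType_t  controlStack[4096];   // stack reserved at compile time
static StaticTask_t controlTcb;           // task control block, also static

extern void controlLoop(void *);          // e.g. the PID task shown earlier

void startControlStatic() {
    xTaskCreateStatic(controlLoop, "control",
                      sizeof(controlStack) / sizeof(controlStack[0]),
                      nullptr, configMAX_PRIORITIES - 1,
                      controlStack, &controlTcb);
}
```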
This robotic system extends beyond localized operation through integration with Amazon Web Services (AWS) IoT, establishing a robust platform for remote oversight and data-driven insights. Continuous telemetry, encompassing sensor readings and operational status, is transmitted to the AWS cloud for real-time monitoring and historical analysis. This data logging capability not only facilitates performance evaluation and fault diagnosis but also unlocks possibilities for predictive maintenance and iterative design improvements. Furthermore, the AWS IoT infrastructure provides a scalable foundation for future advancements, including over-the-air updates, remote control functionalities, and, crucially, the potential to manage and coordinate fleets of these robots as a cohesive, interconnected system – paving the way for collaborative robotics and automated task allocation.
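As a rough illustration of the telemetry path, the sketch below publishes a small JSON payload to AWS IoT Core over MQTT, assuming an Arduino-style ESP32 build with the WiFiClientSecure and PubSubClient libraries. The endpoint, topic, client ID, and certificate handling are placeholders, and the paper does not state which processor hosts the cloud link.

```cpp
// Hedged sketch of periodic telemetry publication to AWS IoT Core over MQTT.
// Assumes an Arduino-style ESP32 build with WiFiClientSecure + PubSubClient;
// endpoint, topic, and certificate setup are placeholders.
#include <cstdio>
#include <WiFiClientSecure.h>
#include <PubSubClient.h>

static WiFiClientSecure net;     // TLS transport (device certificates set elsewhere)
static PubSubClient mqtt(net);

void publishTelemetry(float batteryVolts, float speedMps) {
    if (!mqtt.connected()) {
        mqtt.setServer("example-ats.iot.us-east-1.amazonaws.com", 8883); // placeholder endpoint
        mqtt.connect("delivery-robot-01");                               // placeholder client ID
    }
    char payload[96];
    std::snprintf(payload, sizeof(payload),
                  "{\"battery_v\":%.2f,\"speed_mps\":%.2f}", batteryVolts, speedMps);
    mqtt.publish("robot/telemetry", payload);                            // placeholder topic
    mqtt.loop();                                                         // service the MQTT client
}
```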
Adaptive Intelligence and the Trajectory of Autonomous Logistics
The robot’s navigational prowess stems from an artificial intelligence built upon the principles of Reinforcement Learning. This allows the system to move beyond pre-programmed routes and instead learn the most efficient paths through complex environments. Through repeated trials and a system of rewards for successful navigation – and penalties for collisions or delays – the AI refines its ‘policy’ for movement. Crucially, this isn’t simply memorization; the robot adapts in real-time to dynamic changes, such as moving obstacles or altered pedestrian traffic. By continuously learning from experience, the system optimizes for speed, safety, and efficiency, paving the way for truly autonomous operation in unpredictable, real-world settings.
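To make the reward-and-penalty idea concrete, the sketch below shows a generic tabular Q-learning update and an epsilon-greedy action choice over a discretised grid. This illustrates the learning rule only; the paper's actual state representation, reward shaping, and algorithm are not reproduced, and all sizes and hyperparameters here are arbitrary.

```cpp
// Illustrative tabular Q-learning for grid-based path planning: rewards for
// reaching the goal and penalties for collisions or delays drive the update.
// State/action sizes and hyperparameters are arbitrary examples.
#include <array>
#include <algorithm>
#include <random>

constexpr int kStates = 1024;    // e.g. discretised grid cells
constexpr int kActions = 4;      // up, down, left, right

using QTable = std::array<std::array<float, kActions>, kStates>;

void qUpdate(QTable &q, int s, int a, float reward, int sNext,
             float alpha = 0.1f, float gamma = 0.95f) {
    float bestNext = *std::max_element(q[sNext].begin(), q[sNext].end());
    // Temporal-difference update toward reward + discounted best next value.
    q[s][a] += alpha * (reward + gamma * bestNext - q[s][a]);
}

int epsilonGreedy(const QTable &q, int s, float eps, std::mt19937 &rng) {
    std::uniform_real_distribution<float> coin(0.0f, 1.0f);
    if (coin(rng) < eps) {
        return std::uniform_int_distribution<int>(0, kActions - 1)(rng);  // explore
    }
    return static_cast<int>(std::distance(
        q[s].begin(), std::max_element(q[s].begin(), q[s].end())));       // exploit
}
```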
The system’s design prioritizes adaptability through a modular software architecture built upon a robust hardware platform. This allows for the seamless integration of novel sensors – such as lidar, ultrasonic detectors, or even advanced cameras – without requiring a complete overhaul of the existing framework. Furthermore, new functionalities, including improved object recognition, predictive path planning, or enhanced safety protocols, can be readily incorporated as software modules. This scalability ensures the robot isn’t confined to its current capabilities; instead, it’s positioned as a continually evolving platform capable of addressing increasingly complex logistical challenges and benefiting from ongoing advancements in robotics and artificial intelligence.
The successful development of this autonomous delivery robot signifies a potential paradigm shift in last-mile logistics. Current delivery systems, heavily reliant on human drivers and fixed routes, often face challenges related to cost, efficiency, and scalability. This robotic platform addresses these issues by offering a flexible, on-demand delivery solution capable of navigating complex urban environments. Through automation, it promises to reduce delivery times, lower operational costs, and minimize the environmental impact associated with traditional transportation methods. Further refinement and wider deployment could unlock substantial improvements in supply chain management, ultimately enhancing convenience for consumers and driving economic growth by optimizing the movement of goods.
The current research represents a foundational step, and subsequent efforts are geared towards translating this success into practical application through system scaling and real-world deployment. This involves rigorous testing in increasingly complex environments, expanding the robot’s operational range beyond controlled laboratory settings, and addressing challenges inherent in unpredictable public spaces – such as varied terrain, pedestrian traffic, and adverse weather conditions. Researchers are also investigating strategies for fleet management, including optimized route planning for multiple robots, efficient charging infrastructure, and robust remote monitoring capabilities. Ultimately, the goal is to move beyond proof-of-concept and establish a reliable, cost-effective autonomous delivery solution capable of transforming last-mile logistics and enhancing urban accessibility.
The pursuit of a fully autonomous delivery robot, as detailed in this work, necessitates a rigorous approach to information processing. It demands not merely a system that functions in controlled environments, but one predicated on demonstrable correctness. As Claude Shannon famously stated, “The most important thing in a good piece of communication is that it is understood.” This principle extends directly to the robot’s sensor fusion and AI path planning; the system must accurately interpret its surroundings and translate that data into reliable action. The elegance of the heterogeneous computing architecture lies in its potential to minimize ambiguity, ensuring the robot’s ‘communication’ with the physical world is clear, concise, and free from contradiction, ultimately leading to robust and predictable behavior. The aim is a provably correct system, not just one that passes tests.
What’s Next?
The presented work, while demonstrating a functional integration of perception, planning, and control, merely scratches the surface of true autonomy. The reliance on ROS 2, a framework predicated on message passing, introduces inherent latency – a practical compromise, certainly, but one which obscures the pursuit of provably correct real-time systems. Future iterations must address this architectural limitation, perhaps through exploration of formally verified control software executed directly on embedded hardware. The asymptotic complexity of the path planning algorithm, currently reliant on heuristic search, demands scrutiny; optimality guarantees, not merely empirically observed performance, represent the logical next step.
Furthermore, the sensor fusion architecture, while robust in controlled environments, lacks a formal treatment of uncertainty propagation. Noise models are, at present, largely ad hoc. A rigorous Bayesian framework, incorporating both sensor noise and model inaccuracies, is essential to establish bounds on localization error – and, consequently, to define the operational domain within which truly safe autonomous delivery is possible. The current reliance on reactive obstacle avoidance, while effective, is fundamentally limited; predictive collision avoidance, grounded in a complete and accurate world model, remains a significant challenge.
Ultimately, the pursuit of autonomous systems is not an engineering problem, but a mathematical one. The field requires a shift in emphasis – from ‘making it work’ to ‘proving it correct’. The elegance of a solution lies not in its empirical performance, but in the demonstrable truth of its invariants. The current work provides a foundation, but the true measure of success will be the ability to express the entire system as a formally verifiable entity.
Original article: https://arxiv.org/pdf/2512.22408.pdf
Contact the author: https://www.linkedin.com/in/avetisyan/