Building Better Collisions: A New Approach to Differentiable Physics

Author: Denis Avetisyan


This review details a novel method for generating smooth, vectorizable contact manifolds, crucial for enabling gradient-based optimization in physics simulations.

The contact manifold demonstrates the range of stable states achievable through controlled interaction, defining the boundaries within which consistent contact can be maintained.

The paper introduces algorithms for differentiable collision detection utilizing extruded plane-superquadric intersections and signed distance function-based contact manifold construction.

Despite advances in robot learning, generating robust and efficient behaviors in contact-rich environments remains challenging due to the difficulty of obtaining accurate gradients for dynamics-based planning. This paper, ‘Novel Algorithms for Smoothly Differentiable and Efficiently Vectorizable Contact Manifold Construction’, addresses a key bottleneck, collision detection, by introducing a novel approach to constructing smoothly differentiable and efficiently vectorizable contact manifolds. The core contribution combines highly expressive extruded plane-superquadric intersection (XPSQ) primitives with a mesh-SDF generation routine, enabling more effective use of first- and second-order optimization methods. Will this approach pave the way for significantly faster and more stable robot control in complex, real-world scenarios?


The Core Challenge: Representing Contact with Fidelity

The fidelity of physics simulations and the reliable operation of robotic systems hinge fundamentally on accurately representing physical contact. These applications, spanning virtual reality, game development, and industrial automation, demand a precise understanding of how objects interact when they touch, including the location, magnitude, and distribution of forces. Inaccurate contact representation introduces errors that propagate through the simulation or control loop, leading to unrealistic behavior, instability, and potentially even physical damage in real-world robotic deployments. Consequently, significant research effort is dedicated to developing methods that can efficiently and reliably model contact between objects, even those with intricate geometries and complex material properties, because a robust understanding of these interactions is paramount for achieving believable and functional results.

Historically, simulating physical interactions relied heavily on breaking down complex shapes into simpler, convex components, a process known as convex decomposition. However, this approach encounters significant difficulties when dealing with objects possessing intricate, non-convex geometries, such as those with concavities or internal features. The proliferation of these simpler shapes introduces inaccuracies in contact detection and force calculations, ultimately diminishing the realism of the simulation. Furthermore, the computational cost of managing a large number of convex approximations scales rapidly with complexity, creating performance bottlenecks that limit the feasibility of simulating highly detailed environments or real-time robotic interactions. Consequently, researchers are actively exploring alternative methods to represent contact more efficiently and accurately, bypassing the limitations inherent in traditional convex decomposition techniques.

Sphere tracing determines intersection points [latex]p_{I}[/latex] and [latex]p_{II}[/latex] between the signed distance function (SDF) of one surface and an edge of another, iteratively moving candidate points along the edge tangent toward the zero level set of the SDF by [latex]\pm \llbracket \phi(x) > 0 \rrbracket\, \phi(x)\, e_{t}[/latex], while preventing movement if either edge vertex is already inside the surface.

Constructing Contact: From Distance Fields to Manifolds

Signed Distance Functions (SDFs) represent a surface as a scalar field where the value at any given point in space is the shortest signed distance to the surface; negative values indicate points inside the surface, positive values indicate points outside, and zero values lie directly on the surface. This allows for efficient proximity queries because the distance value directly provides the shortest distance to the geometry without requiring complex geometric calculations like ray-triangle intersection tests. The SDF representation is particularly advantageous for collision detection and physics simulations, as determining penetration depth and contact normals can be achieved by evaluating the SDF at nearby points. Furthermore, SDFs can represent complex shapes, including those with sharp features and concavities, and are well-suited for implicit surface modeling and procedural generation.
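The sign convention and the cheapness of proximity queries can be seen in a minimal sketch. The function name `sdf_sphere` and the use of a sphere (rather than the paper's mesh-SDFs) are illustrative choices, not part of the original method:

```python
import numpy as np

def sdf_sphere(p, center, radius):
    """Signed distance to a sphere: negative inside, zero on the surface, positive outside."""
    return np.linalg.norm(np.asarray(p, float) - np.asarray(center, float)) - radius

# A proximity query reduces to a single function evaluation, with no
# ray-triangle intersection tests:
inside  = sdf_sphere([0.0, 0.0, 0.0], center=[0.0, 0.0, 0.0], radius=1.0)  # -1.0
surface = sdf_sphere([1.0, 0.0, 0.0], center=[0.0, 0.0, 0.0], radius=1.0)  #  0.0
outside = sdf_sphere([2.0, 0.0, 0.0], center=[0.0, 0.0, 0.0], radius=1.0)  #  1.0
```

The same query shape works for any SDF, which is why more expressive primitives (such as the XPSQ described later) slot into the same collision pipeline.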

The Edge-SDF Collision Routine utilizes sphere tracing in conjunction with Signed Distance Fields (SDFs) to efficiently detect potential contact points between objects. The method marches candidate points along each edge, stepping by the SDF value at the current point; a value of zero indicates a surface hit. By leveraging the SDF’s ability to report the distance to the nearest surface, the routine can rapidly narrow down potential contact locations without the computationally expensive triangle intersection tests common in mesh-based collision detection. This approach demonstrably improves performance, particularly in scenarios with complex geometry or high collision frequencies, as SDF evaluation is significantly faster than traditional methods.
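A minimal sketch of this marching loop, following the update rule from the figure caption (step along the unit edge tangent by the SDF value until the surface is reached). The helper name `sphere_trace_edge` and the unit-sphere test SDF are assumptions for illustration, not the paper's implementation:

```python
import numpy as np

def sphere_trace_edge(phi, a, b, max_iters=64, tol=1e-6):
    """March a candidate point from edge vertex a toward b, stepping by the
    SDF value phi(x) along the unit edge tangent e_t, until phi(x) ~ 0.
    Returns the first intersection point, or None if the edge misses."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    length = np.linalg.norm(b - a)
    e_t = (b - a) / length               # unit edge tangent
    x, traveled = a.copy(), 0.0
    for _ in range(max_iters):
        d = phi(x)
        if abs(d) < tol:                 # on the surface: intersection found
            return x
        x = x + d * e_t                  # safe step: cannot overshoot the surface
        traveled += d
        if traveled > length:            # stepped past the far vertex: no hit
            return None
    return x

# Edge from (-2,0,0) to (2,0,0) against a unit sphere at the origin:
phi = lambda x: np.linalg.norm(x) - 1.0
hit = sphere_trace_edge(phi, [-2.0, 0.0, 0.0], [2.0, 0.0, 0.0])  # ~ (-1, 0, 0)
```

The step size equals the distance to the nearest surface, so each step is guaranteed not to tunnel through geometry, which is what makes sphere tracing safe without explicit intersection tests.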

The Contact Manifold represents a precise discretization of the contacting surfaces following the identification of initial contact candidates. This manifold is constructed by evaluating the penetration depth at numerous points along the potential contact region and iteratively refining the surface normals and contact locations. The resulting data structure consists of a set of connected polygons, each representing a portion of the actual contact surface, and provides accurate information regarding the area, normal direction, and depth of penetration at each contact point. This detailed representation is crucial for accurate constraint resolution and stable physical simulations, enabling the calculation of impulse and friction forces acting at the contact interface.
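The per-point quantities the manifold stores (location, penetration depth, normal) can be sketched directly from an SDF; the normal is the SDF gradient at the contact point. The function `contact_manifold`, the finite-difference gradient, and the sphere SDF are illustrative stand-ins, not the paper's routine:

```python
import numpy as np

def contact_manifold(phi, candidates, eps=1e-4):
    """For each candidate contact point, keep those that penetrate (phi < 0) and
    record penetration depth and contact normal (finite-difference SDF gradient)."""
    contacts = []
    for p in np.asarray(candidates, float):
        depth = -phi(p)                  # positive depth means penetration
        if depth <= 0.0:
            continue
        # central-difference gradient of the SDF gives the outward surface normal
        g = np.array([(phi(p + eps * e) - phi(p - eps * e)) / (2 * eps)
                      for e in np.eye(3)])
        contacts.append({"point": p,
                         "depth": depth,
                         "normal": g / np.linalg.norm(g)})
    return contacts

phi = lambda x: np.linalg.norm(x) - 1.0            # unit-sphere SDF
pts = [[0.0, 0.0, 0.9], [0.0, 0.0, 1.5]]           # one penetrating, one outside
cm = contact_manifold(phi, pts)                    # single contact, depth ~ 0.1
```

A real implementation would refine these samples iteratively and connect them into polygons as the text describes; the point here is only that depth and normal both fall out of SDF evaluations.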

The signed distance function for the XPSQ primitive is computed by projecting a point [latex]\bm{x}[/latex] onto a spline [latex]p(t)[/latex], transforming the projection to align with three points, and then finding the smooth minimum of the resulting signed distance functions.

Achieving Stability: Smoothing for Robust Gradient Estimation

Stable contact resolution in simulation relies heavily on accurate gradient calculations for witness point positions, which define the locations where contact forces are applied. Direct computation of these gradients, however, often yields noisy results due to the discrete nature of collision detection and the sensitivity to small perturbations in geometry. This noise can lead to instability in the simulation, manifesting as jittering or unrealistic behavior. Consequently, techniques to reduce gradient variance are essential for robust contact handling; these methods aim to provide a more reliable signal for optimization algorithms that adjust witness point locations to maintain contact.

Gradient estimation for witness point positions can be unstable due to noise; therefore, approximation techniques are employed. Randomized Smoothing approximates gradients by adding small, random perturbations to the input and averaging the resulting gradients. In contrast, Analytical Smoothing formulates the gradient calculation as a convex optimization problem, solved using techniques like constrained optimization. This approach leverages the properties of convex functions to yield a more stable and accurate gradient estimate, though it typically requires more computational resources than Randomized Smoothing. Both methods aim to reduce variance in the gradient calculation without significantly altering the underlying optimization process.
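Randomized smoothing in its generic form can be sketched in a few lines: perturb the input with Gaussian noise and average, which estimates the gradient of the smoothed function even where the original is non-differentiable. The antithetic (paired ±) sampling shown here is a standard variance-reduction trick, assumed for illustration rather than taken from the paper:

```python
import numpy as np

def randomized_smoothing_grad(f, x, sigma=0.05, n_samples=1024, seed=0):
    """Monte-Carlo estimate of the gradient of the Gaussian-smoothed
    function E_u[f(x + sigma*u)], using antithetic sample pairs to
    reduce variance of the score-function estimator."""
    rng = np.random.default_rng(seed)
    u = rng.standard_normal((n_samples, np.size(x)))
    df = np.array([f(x + sigma * ui) - f(x - sigma * ui) for ui in u])
    return (df[:, None] * u).mean(axis=0) / (2.0 * sigma)

# |x| is non-smooth at 0, but its smoothed gradient is well-behaved;
# at x = 0.3 the true (sub)gradient is 1.0 and the estimate lands nearby.
g = randomized_smoothing_grad(lambda x: np.abs(x).sum(), np.array([0.3]))
```

The estimate is noisy by construction, which is precisely the variance that analytical smoothing trades extra computation to avoid.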

The Log-Barrier Coefficient, denoted as ρ, is a crucial parameter within the analytical smoothing process used for gradient estimation. It controls the strength of the logarithmic barrier function added to the contact constraint, effectively trading off constraint satisfaction against smoothness of the optimization landscape. A higher ρ value enforces stronger constraint adherence but can lead to increased sensitivity to perturbations and potentially unstable optimization. Conversely, a lower value promotes a smoother, more stable optimization process at the cost of potentially violating contact constraints. Optimal performance requires careful tuning of ρ based on the specific contact scenario and solver characteristics, balancing accuracy and stability for robust gradient estimation.
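The accuracy/smoothness trade-off governed by ρ can be seen on a toy constrained problem. This is a generic interior-point illustration (minimize a quadratic subject to x ≥ 0 via a log barrier weighted by 1/ρ), not the paper's contact formulation; `barrier_min` and its parameters are assumptions for the sketch:

```python
def barrier_min(rho, x0=1.0, lr=1e-3, iters=20000):
    """Minimize (x+1)^2 subject to x >= 0 via the log-barrier objective
    (x+1)^2 - (1/rho) * log(x), using plain gradient descent.
    Larger rho pushes the minimizer toward the true constrained optimum
    x = 0, at the cost of a stiffer optimization landscape."""
    x = x0
    for _ in range(iters):
        grad = 2.0 * (x + 1.0) - 1.0 / (rho * x)
        x = max(x - lr * grad, 1e-9)   # clamp to stay strictly feasible
    return x

x_soft  = barrier_min(rho=1.0)    # ~ 0.366: smooth landscape, loose constraint
x_sharp = barrier_min(rho=100.0)  # ~ 0.005: near the constraint boundary, stiff
```

The closed-form minimizer here is [latex]x^{*} = \tfrac{1}{2}(\sqrt{1 + 2/\rho} - 1)[/latex], which tends to 0 as ρ grows, matching the text: higher ρ means tighter constraint adherence but larger curvature near the boundary.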

The SoftJAX library provides differentiable approximations of common functions which are essential for implementing smoothing techniques used in gradient estimation. Specifically, the [latex]\text{Sigmoid Function}[/latex] and [latex]\text{Softplus Function}[/latex] offer smooth, differentiable alternatives to the step function and ReLU, respectively, enabling gradient-based optimization where these functions would otherwise be non-differentiable. The [latex]\text{Logsumexp Function}[/latex] is utilized to maintain numerical stability during the computation of logarithmic sums, a frequent operation in convex program solving and smoothing formulations. These approximations allow for efficient computation of gradients through automatic differentiation, simplifying the implementation of randomized and analytical smoothing methods without requiring manual derivation of adjoints or complex numerical schemes.
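The underlying mathematics of these three functions is library-independent and fits in a few lines. The NumPy versions below are a sketch of the same approximations, not the SoftJAX API itself:

```python
import numpy as np

def sigmoid(x):
    """Smooth step: a differentiable surrogate for the 0/1 indicator."""
    return 1.0 / (1.0 + np.exp(-x))

def softplus(x):
    """Smooth ReLU: log(1 + e^x), always >= max(0, x), computed stably."""
    return np.logaddexp(0.0, x)

def logsumexp(xs):
    """Numerically stable smooth maximum: subtracting the max before
    exponentiating avoids overflow in the sum."""
    m = np.max(xs)
    return m + np.log(np.sum(np.exp(xs - m)))

# Scaling by k and dividing back recovers the hard function as k grows:
k = 50.0
smooth_relu = softplus(k * 2.0) / k                           # ~ max(0, 2) = 2
smooth_max  = logsumexp(k * np.array([1.0, 3.0, 2.0])) / k    # ~ max(...) = 3
```

A smooth minimum, as used for blending the XPSQ sub-distances, follows the same pattern with negated arguments: [latex]-\tfrac{1}{k}\,\mathrm{logsumexp}(-k x)[/latex].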

Realizing Impact: Performance and Scalability Gains

The methodology achieves notable performance improvements by integrating Signed Distance Field (SDF)-based collision detection with smoothed gradient estimation. Traditional collision detection often relies on complex geometric representations and computationally expensive calculations to determine intersections between objects; SDFs, by contrast, provide a continuous representation of shape, enabling efficient distance queries and penetration depth calculations. Crucially, combining this with smoothed gradient estimation (a technique that refines the accuracy of gradient calculations used in physics simulations) mitigates instability and allows for larger simulation steps. This pairing dramatically reduces the computational burden associated with both collision resolution and constraint enforcement, resulting in a more responsive and efficient simulation framework capable of handling complex scenes and interactions with greater stability and speed.

Scalability within complex physical simulations is significantly improved through batched vectorization, controlled by the ‘Vectorization Batch Size’, which leverages the power of parallel processing. This approach moves beyond sequentially solving individual contact problems (instances where objects collide and interact) and instead formulates them as a collective batch. By processing multiple contact scenarios simultaneously, the system capitalizes on the inherent parallelism of modern hardware, particularly GPUs. This dramatically reduces the overall computation time, as the overhead associated with setting up and solving each problem individually is amortized across the entire batch. The efficiency gains are substantial, allowing simulations to handle increasingly complex scenes and a greater number of interacting objects without sacrificing performance, ultimately enabling real-time or near-real-time interactive experiences.
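The shape of such a batched query can be sketched with NumPy broadcasting: one row per contact problem, resolved in a single vectorized call. The batched sphere SDF is an illustrative stand-in for the paper's primitives; in a JAX setting the same effect is typically obtained with `jax.vmap` over the single-pair routine:

```python
import numpy as np

def sdf_sphere_batch(points, centers, radii):
    """Evaluate many sphere SDFs at once, one row per contact problem.
    Batching amortizes per-call overhead and maps directly onto
    SIMD lanes on CPU or threads on GPU."""
    return np.linalg.norm(points - centers, axis=-1) - radii

# Four independent contact queries resolved in one vectorized call:
points  = np.array([[0., 0., 0.], [2., 0., 0.], [0., 1., 0.], [0., 0., 3.]])
centers = np.zeros((4, 3))
radii   = np.array([1.0, 1.0, 1.0, 1.0])
depths  = sdf_sphere_batch(points, centers, radii)   # [-1., 1., 0., 2.]
```

Negative entries flag penetrating pairs, so a whole scene's broad-phase result can be filtered with a single comparison over the batch.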

The efficiency of simulating complex physical interactions hinges on representing geometry in a computationally tractable manner. This work introduces the XPSQ primitive, a novel approach that blends the strengths of superquadrics and quadratic splines to achieve an optimal balance between geometric accuracy and processing speed. Unlike traditional methods, such as convex decomposition, which require a substantial number of primitives to approximate a shape (the 36 needed to represent a simple cup, for instance), the XPSQ primitive achieves comparable visual fidelity with significantly fewer components; a cup can be accurately modeled with only 3 primitives. This reduction in geometric complexity directly translates to lower computational costs in collision detection and rendering, offering a substantial performance advantage, as demonstrated by the method’s ability to represent an armadillo with just 18 primitives.

Performance benchmarks reveal a substantial advancement in simulation efficiency through this novel approach. Compared to the established MJX Simulator, the presented methodology achieves an order-of-magnitude speed improvement: a tenfold increase in processing capability. This gain isn’t simply computational; it’s also reflected in geometric representation. Complex objects require dramatically fewer primitives for accurate modeling; for instance, a cup is rendered using only three primitives, a significant reduction from the 36 required by convex decomposition with CoACD. Similarly, an armadillo, a notoriously challenging shape for simulation, is accurately represented with just 18 superquadric primitives, demonstrating a powerful ability to balance fidelity and computational cost.

XPSQs demonstrate expressivity by successfully executing a diverse range of challenging manipulation tasks.

The pursuit of efficient collision detection, as detailed in this work, mirrors a fundamental principle: abstractions age, principles don’t. This paper’s focus on vectorizable contact manifold construction using XPSQ primitives and mesh-SDF routines isn’t about adding complexity, but distilling the process to its core. Andrey Kolmogorov observed, “The shortest path between two points is a straight line.” Similarly, the algorithms presented aim for a direct route to accurate and efficient collision handling. The elegance lies in removing unnecessary layers, achieving performance gains through focused mathematical foundations. Every complexity needs an alibi; here, simplicity is the justification.

Where to Next?

The presented work offers a reduction: a streamlining of differentiable collision detection. Yet elegance in algorithm design merely highlights the messiness of the underlying problem. The reliance on extruded plane-superquadrics, while computationally advantageous, introduces a geometric constraint. Future investigations must address the inherent limitations of this simplification, exploring alternative primitive sets that balance computational tractability with representational fidelity. The question is not whether more complex shapes can be modeled, but whether the added complexity justifies the cost.

True progress lies not in proliferating parameters, but in minimizing them. Vectorization, while beneficial, remains a proxy for genuine parallelization. The current approach implicitly assumes a certain level of geometric regularity in the scene. The next challenge is to extend this efficiency to scenarios with high geometric diversity, perhaps through learned representations or adaptive primitive selection. A truly robust system will not strive to capture all complexity, but to ignore what is irrelevant.

Ultimately, the pursuit of perfect collision detection is a fool’s errand. Contact, at the level of real-world interaction, is inherently noisy and imprecise. The goal, then, should not be to eliminate error, but to embrace it – to design algorithms that are resilient to uncertainty and capable of extracting meaningful information from imperfect data. Simplicity, in this context, is not a constraint, but a form of respect – toward both the machine and the observer.


Original article: https://arxiv.org/pdf/2604.17538.pdf

Contact the author: https://www.linkedin.com/in/avetisyan/


2026-04-21 16:56