Decoding AI Regulations: Can Explainable AI Pass the EU Test?

Author: Denis Avetisyan


A new framework maps the requirements of the EU AI Act to the capabilities of explainable AI methods, offering a path towards demonstrable regulatory compliance.

The framework assesses explainable AI methods by aligning their inherent properties with the legal requirements of the AI Act, ultimately establishing a compliance ranking based on demonstrable explanatory power.

This review assesses model-agnostic XAI techniques against the legal standards outlined in the EU AI Act, providing a quantifiable compliance scoring approach.

Despite growing interest in Explainable AI (XAI), a clear link between XAI techniques and emerging legal standards remains elusive. This paper, ‘Assessing Model-Agnostic XAI Methods against EU AI Act Explainability Requirements’, addresses this gap by evaluating commonly used, model-agnostic XAI methods against the specific requirements outlined in the EU AI Act. We propose a novel scoring framework that translates qualitative expert assessments of XAI properties – including interpretability, faithfulness, and robustness – into quantifiable measures of regulatory compliance. By bridging this divide, can we establish a robust, standardized approach to demonstrating the legal defensibility of AI systems?


The Inevitable Demand for Algorithmic Transparency

As artificial intelligence systems become increasingly integrated into critical decision-making processes – from loan applications and healthcare diagnoses to criminal justice and autonomous vehicles – the need for transparency and justification in their outputs grows paramount. This isn’t simply a matter of ethical consideration, but a fundamental requirement for building trust and ensuring accountability. The ‘black box’ nature of many AI algorithms – particularly deep learning models – obscures the reasoning behind their conclusions, making it difficult to identify biases, errors, or unfair outcomes. Consequently, stakeholders – including individuals affected by AI decisions, regulators, and developers themselves – demand insight into why a particular outcome was reached, fostering a shift towards AI systems that are not only accurate but also interpretable and demonstrably fair. This push for explainability isn’t merely about understanding the technical workings of an algorithm; it’s about establishing a clear chain of reasoning that connects inputs to outputs, allowing for scrutiny, validation, and ultimately, responsible deployment of these powerful technologies.

The burgeoning field of artificial intelligence is increasingly subject to legal scrutiny, prompting a demand for explainable AI (XAI) systems. Landmark regulations, including the European Union’s AI Act, the General Data Protection Regulation (GDPR), and the Medical Device Regulation (MDR), are now establishing a legal requirement for transparency in automated decision-making processes. These laws aren’t simply asking for descriptions of how an AI arrives at a conclusion, but rather demanding justification that is auditable and understandable, particularly when those decisions impact individuals’ rights or safety. Failure to comply with these regulations can result in substantial fines and legal repercussions, incentivizing developers to prioritize XAI not merely as a technical feature, but as a fundamental aspect of responsible AI deployment and legal adherence.

Explainable AI (XAI) isn’t simply about generating some rationale for an AI’s decision; legal frameworks increasingly demand explanations possessing specific qualities. Faithfulness ensures the explanation accurately reflects the true reasoning process within the AI, avoiding misleading post-hoc rationalizations. Robustness dictates that these explanations remain consistent even with minor variations in input data, preventing spurious justifications. Crucially, explanations must also be understandable – presented in a way that is accessible and meaningful to the intended audience, whether legal professionals, regulators, or the individuals directly impacted by the AI’s output. Meeting these combined demands – faithfulness, robustness, and understandability – is no longer merely a technical challenge but a legal prerequisite for deploying AI systems responsibly and avoiding potential liabilities.

Sensitivity analysis reveals that compliance scores for faithfulness and robustness – as required by Articles 86 and 13–14 of the AI Act – are key metrics for evaluating AI system adherence to regulatory standards.

Deciphering the Core Properties of Explainable Systems

Faithfulness in Explainable AI (XAI) refers to the degree to which an explanation accurately reflects the internal logic of the machine learning model. It is distinct from plausibility; an explanation can seem reasonable to a human observer without actually representing how the model arrived at its prediction. Evaluating faithfulness involves verifying that the features or reasoning highlighted in the explanation genuinely influenced the model’s output, often through techniques like ablation studies or sensitivity analysis. A lack of faithfulness can mislead users, leading to inappropriate trust or incorrect debugging of the model. Crucially, high faithfulness does not guarantee interpretability; an explanation can be accurate but still too complex for a human to readily understand.
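The ablation check mentioned above can be sketched in a few lines. Everything here is illustrative: the toy linear model, the `predict` and `ablation_faithfulness` names, and the zero-baseline ablation are assumptions for the sketch, not the paper's method.

```python
# Minimal sketch of an ablation-based faithfulness check (illustrative).

def predict(x):
    # Toy linear model: the weights ARE the model's true reasoning.
    weights = [0.7, 0.2, 0.1]
    return sum(w * xi for w, xi in zip(weights, x))

def ablation_faithfulness(x, importance):
    """Ablate features in order of claimed importance and record the
    prediction change; a faithful explanation ranks features whose
    removal causes the largest drops first."""
    baseline = predict(x)
    drops = []
    order = sorted(range(len(x)), key=lambda i: -importance[i])
    for i in order:
        x_ablated = list(x)
        x_ablated[i] = 0.0          # ablate by zeroing the feature
        drops.append(abs(baseline - predict(x_ablated)))
    return drops

x = [1.0, 1.0, 1.0]
faithful = ablation_faithfulness(x, [0.7, 0.2, 0.1])    # matches the model
unfaithful = ablation_faithfulness(x, [0.1, 0.2, 0.7])  # plausible but wrong
print(faithful)    # non-increasing drops indicate a faithful ranking
print(unfaithful)  # increasing drops expose the unfaithful ranking
```

The second call illustrates the faithfulness/plausibility distinction from the text: an importance ranking can look reasonable yet fail the ablation test.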

Robustness in Explainable AI (XAI) refers to the consistency of explanations when subjected to small perturbations in the input data or the model itself. A robust explanation will not drastically change with minor variations; this stability is critical to prevent adversarial manipulation, where malicious actors could alter inputs to generate desired explanations without affecting the model’s actual prediction. Assessing robustness typically involves measuring the sensitivity of explanation features – such as feature importance scores or highlighted input regions – to these small changes. Techniques to improve robustness include regularization methods during model training and post-hoc explanation smoothing, which aim to minimize fluctuations in explanations caused by input noise or model instability.
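Measuring explanation sensitivity to small input changes, as described above, can be sketched as follows; the toy model, the finite-difference attribution, and the L-infinity stability measure are illustrative choices, not a standard from the paper.

```python
import random

# Sketch: explanation robustness as sensitivity of finite-difference
# feature attributions to small input perturbations (illustrative).

def predict(x):
    # Toy nonlinear model.
    return x[0] ** 2 + 0.5 * x[1]

def importance(x, eps=1e-4):
    # Local gradient-style attribution via finite differences.
    base = predict(x)
    scores = []
    for i in range(len(x)):
        xp = list(x)
        xp[i] += eps
        scores.append((predict(xp) - base) / eps)
    return scores

def robustness(x, sigma=0.01, trials=100, seed=0):
    """Worst-case change in the explanation over random Gaussian
    perturbations of the input; smaller means more stable."""
    rng = random.Random(seed)
    ref = importance(x)
    worst = 0.0
    for _ in range(trials):
        noisy = [xi + rng.gauss(0.0, sigma) for xi in x]
        diff = max(abs(a - b) for a, b in zip(importance(noisy), ref))
        worst = max(worst, diff)
    return worst

print(robustness([1.0, 2.0]))  # small value: explanation is stable here
```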

Explanation complexity directly impacts human comprehension and usability. While a complete account of a model’s decision-making process may be theoretically desirable, explanations exceeding a user’s cognitive capacity are ineffective. Research indicates that simpler explanations, even if they omit some detail, are more readily understood and trusted, leading to better decision-making by the end user. The optimal level of complexity is contingent on the user’s expertise and the specific application; however, prioritizing conciseness and clarity is essential for maximizing the benefit of explainable AI (XAI) systems. Excessive detail, technical jargon, or convoluted logic hinders understanding and diminishes the value of the explanation.

Evaluating the quality of Explainable AI (XAI) necessitates assessing faithfulness, robustness, and complexity not as isolated characteristics, but as interconnected properties. A highly faithful explanation, accurately reflecting model behavior, may be rendered useless if it fluctuates unpredictably with minor input perturbations, demonstrating a lack of robustness. Conversely, a robust but unfaithful explanation provides a stable but misleading account of the model’s decision-making process. Similarly, even faithful and robust explanations are ineffective if their complexity exceeds human comprehension limits. Therefore, a complete evaluation framework must consider the interplay between these properties; optimizing for one in isolation can negatively impact others, and a truly effective XAI system requires a balanced approach to all three.

A Comparative Assessment of XAI Methodologies

Local approximation methods, including SHAP (SHapley Additive exPlanations) and LIME (Local Interpretable Model-agnostic Explanations), strive to provide faithfulness by explaining individual predictions based on a simplified, locally linear model of the complex underlying function. However, these methods demonstrate limited robustness due to their sensitivity to small perturbations in the input data. Specifically, minor changes to an instance can result in significantly different local approximations and, consequently, varying feature importance scores. This instability arises from the reliance on sampling or perturbation techniques to estimate the local model, meaning explanations are not necessarily consistent across similar inputs and may not accurately reflect global model behavior.
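The sampling-induced instability described above can be demonstrated with a bare-bones LIME-style sketch: fit a local linear surrogate around an instance from random neighbours and watch the fitted weight vary with the sampling seed. The black-box function, neighbourhood width, and one-dimensional least-squares fit are all simplifying assumptions; real LIME also uses proximity weighting.

```python
import random

# Sketch of a LIME-style local approximation in one dimension
# (illustrative; real LIME weights samples by proximity).

def black_box(x):
    # Toy nonlinear model being explained.
    return x ** 3 - x

def lime_slope(x0, n=50, width=0.5, seed=0):
    """Sample neighbours of x0 and fit a least-squares line; the
    slope plays the role of the local explanation's feature weight."""
    rng = random.Random(seed)
    xs = [x0 + rng.uniform(-width, width) for _ in range(n)]
    ys = [black_box(x) for x in xs]
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var = sum((x - mx) ** 2 for x in xs)
    return cov / var

slopes = [lime_slope(1.0, seed=s) for s in range(5)]
print(slopes)  # the slope varies run to run: limited robustness
```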

Rule-based explanation methods, including Anchors and RuleSHAP, function by identifying sufficient conditions – or ā€œrulesā€ – that reliably lead to a specific prediction. These methods deliberately prioritize the generation of human-understandable rules, often limiting the complexity of the identified conditions to enhance interpretability. However, this emphasis on simplicity can result in a trade-off with faithfulness; the identified rules may not fully capture the complete reasoning process of the original model, and therefore may not accurately represent all factors contributing to the prediction. The resulting explanations, while easy to comprehend, may be approximations that omit nuanced or complex interactions within the model.

Counterfactual explanations, generated by methods like CEM (Contrastive Explanation Method) and DiCE (Diverse Counterfactual Explanations), identify the minimal changes to an input feature set that would alter a model’s prediction to a desired outcome. While useful for understanding decision drivers, these explanations are susceptible to noise in the data or model, potentially leading to unrealistic or unstable counterfactuals. Interpretation requires caution, as seemingly minor perturbations identified as crucial may not represent genuine causal relationships, or may be specific to the instance being analyzed. Furthermore, the search for counterfactuals can be computationally expensive and may not always yield plausible or actionable insights, particularly in high-dimensional feature spaces.
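The core idea can be sketched with a deliberately naive greedy search; the toy scoring model, the single-feature step, and the names `classify` and `counterfactual` are illustrative. Real methods such as CEM and DiCE add sparsity, diversity, and plausibility terms that this sketch omits.

```python
# Sketch of a minimal counterfactual search: greedily nudge one
# feature until the model's decision flips (illustrative only).

def classify(x, threshold=1.0):
    # Toy scoring model: approve when the weighted sum clears a threshold.
    score = 0.8 * x[0] + 0.4 * x[1]
    return score >= threshold

def counterfactual(x, step=0.05, max_iter=200):
    """Increase the most influential feature in small steps until the
    prediction flips; returns the modified input, or None on failure."""
    cf = list(x)
    for _ in range(max_iter):
        if classify(cf):
            return cf
        cf[0] += step          # move along the heaviest-weighted feature
    return None

x = [0.5, 0.5]                 # currently rejected
cf = counterfactual(x)
print(cf)                      # a nearby input that flips the decision
```

Even in this toy setting, the returned point depends on the step size and search order, which mirrors the stability caveats noted above.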

Partial Dependence Plots (PDP) and Individual Conditional Expectation (ICE) plots are utilized to understand the marginal effect of one or more features on the predicted outcome of a machine learning model. PDPs display the average predicted outcome across the entire population for different values of the feature(s) of interest, effectively summarizing the overall relationship. ICE plots, conversely, show the predicted outcome for each individual instance as the feature(s) vary, revealing heterogeneous effects not captured by the average in PDPs. However, both methods do not provide a complete explanation for individual predictions, as they only isolate the feature(s) of interest and assume independence from other features; interactions between features are not directly accounted for, and the contribution of other features remains unaddressed, limiting their ability to fully explain any single prediction.
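The difference between ICE curves and their PDP average, and the way averaging can mask interactions, can be shown concretely. The two-instance dataset and interaction-bearing toy model below are illustrative assumptions.

```python
# Sketch: ICE curves versus their PDP average for one feature
# of a toy model with an interaction (illustrative).

def model(x0, x1):
    # x0's effect depends on the sign of x1: a deliberate interaction.
    return x0 * (1.0 if x1 >= 0 else -1.0)

data = [(0.0, 1.0), (0.0, -1.0)]        # two instances (x0, x1)
grid = [0.0, 0.5, 1.0]                  # grid over the feature of interest

# ICE: one curve per instance, varying x0 while holding x1 fixed.
ice = [[model(g, x1) for g in grid] for (_, x1) in data]

# PDP: the pointwise average of the ICE curves.
pdp = [sum(curve[i] for curve in ice) / len(ice) for i in range(len(grid))]

print(ice)  # opposite slopes per instance
print(pdp)  # the average cancels to zero, hiding the interaction
```

The flat PDP here is exactly the failure mode the text warns about: heterogeneous per-instance effects, visible in the ICE curves, vanish in the average.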

Towards a Standardized Framework for XAI Compliance

The Mixed-Methods Scoring Framework utilizes both quantitative metrics and qualitative assessment to provide a comprehensive evaluation of Explainable AI (XAI) methods. Quantitative evaluation leverages measurable properties such as fidelity, stability, and computational cost, providing objective scores for each XAI technique. This is then supplemented by qualitative assessment, conducted by domain experts, to evaluate aspects like intelligibility, actionability, and trustworthiness of the explanations produced. Combining these approaches addresses the limitations of relying solely on automated metrics, which may not fully capture the nuanced requirements for human understanding and responsible AI deployment. The resulting composite score reflects a holistic evaluation, considering both the technical performance and the human-centered qualities of each XAI method.

The Compliance Score, a normalized value between 0 and 1, quantitatively assesses the degree to which an Explainable AI (XAI) method aligns with key tenets of relevant legal frameworks including the AI Act, General Data Protection Regulation (GDPR), and Medical Device Regulation (MDR). A score of 1 indicates full alignment, demonstrating the XAI method satisfies the requirements for transparency and explainability as defined within these regulations; conversely, a score of 0 signifies no alignment. This metric is derived from a weighted evaluation of faithfulness, robustness, and complexity characteristics of the XAI method, providing a single, interpretable indicator of its regulatory compliance potential. The score facilitates objective comparison between different XAI techniques and supports documentation required for demonstrating due diligence in AI system deployment.
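A weighted evaluation of this kind reduces to a short computation. The property values and weights below are invented for illustration; the paper's actual weighting scheme is not reproduced here.

```python
# Sketch of a normalized Compliance Score as a weighted mean of
# property scores in [0, 1] (illustrative values and weights).

def compliance_score(props, weights):
    """props and weights map property name -> value; the result is a
    weighted mean, normalized to [0, 1] by the total weight."""
    total = sum(weights.values())
    return sum(props[k] * weights[k] for k in weights) / total

props = {"faithfulness": 0.8, "robustness": 0.6, "complexity": 0.9}
weights = {"faithfulness": 0.5, "robustness": 0.3, "complexity": 0.2}
score = compliance_score(props, weights)
print(round(score, 3))  # a single interpretable indicator in [0, 1]
```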

The XAI scoring framework evaluates explanation quality through three primary dimensions: faithfulness, robustness, and complexity. Faithfulness assesses the degree to which an explanation accurately reflects the model’s decision-making process, quantifying the alignment between explanation and model behavior. Robustness measures the stability of explanations to minor perturbations in the input data, indicating the reliability of the explanation under realistic conditions. Finally, complexity quantifies the cognitive load required to understand an explanation, considering factors such as the number of features highlighted and the intricacy of the explanatory logic; lower complexity is generally preferred for usability and interpretability. By evaluating these three dimensions, the framework provides a holistic assessment of explanation quality beyond single metrics, acknowledging that a high-quality explanation must be both accurate, stable, and understandable.

The scoring framework’s evaluation encompassed over 100 distinct XAI algorithms, ensuring broad coverage of the current XAI landscape. This analysis was informed by a systematic review of 30 comprehensive surveys of XAI techniques, published between 2017 and 2023. The selected surveys were identified through searches of academic databases including IEEE Xplore, ACM Digital Library, and arXiv, prioritizing those providing detailed taxonomies or comparative analyses of XAI methods. Algorithms included in the assessment represent a range of techniques, including model-agnostic methods such as LIME and SHAP, as well as model-specific approaches applicable to deep neural networks and tree-based models. This broad scope aims to provide a representative evaluation of XAI performance across diverse algorithmic implementations.

Sensitivity analysis of the Compliance Score revealed a maximum variation of ±20% when legal strength factors – representing the weight assigned to specific regulatory requirements within the scoring framework – were altered. This indicates a substantial degree of robustness in the generated scores; despite changes in the relative importance of individual legal provisions, the overall compliance assessment remained largely consistent. The analysis involved systematically adjusting the weighting of factors derived from the AI Act, GDPR, and MDR, and observing the resulting impact on calculated Compliance Scores across the analyzed set of over 100 XAI algorithms. This level of stability suggests the framework provides a reliable and consistent evaluation, even with potential shifts in the interpretation or prioritization of legal requirements.
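A sensitivity check of this shape can be sketched by perturbing each weight by ±20% and observing the spread of the resulting score. The property values and baseline weights are illustrative, not the paper's data, so the computed spread is only indicative of the procedure.

```python
from itertools import product

# Sketch: sensitivity of a weighted compliance score to ±20%
# perturbations of the legal-strength weights (illustrative data).

def score(props, weights):
    total = sum(weights)
    return sum(p * w for p, w in zip(props, weights)) / total

props = [0.8, 0.6, 0.9]        # faithfulness, robustness, complexity
base = [0.5, 0.3, 0.2]         # baseline legal-strength weights

scores = []
for factors in product([0.8, 1.0, 1.2], repeat=3):   # ±20% per weight
    perturbed = [w * f for w, f in zip(base, factors)]
    scores.append(score(props, perturbed))

baseline = score(props, base)
spread = (max(scores) - min(scores)) / baseline
print(round(spread, 3))   # relative variation stays modest
```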

A standardized evaluation process for Explainable AI (XAI) methods is crucial for responsible AI deployment and regulatory adherence. The proposed framework offers a consistent methodology for assessing XAI techniques against legal requirements such as the AI Act, GDPR, and MDR, mitigating ambiguity in compliance assessments. This standardization reduces the risk of subjective interpretation during audits and facilitates clearer documentation for regulatory bodies. By providing a quantifiable Compliance Score, organizations can demonstrably prove alignment with legal standards, streamline the certification process, and proactively address potential legal challenges associated with AI system deployments. Furthermore, a consistent evaluation approach allows for comparative analysis of different XAI methods, enabling informed decisions regarding the selection of techniques best suited for specific applications and risk profiles.

The pursuit of quantifiable XAI compliance, as detailed in this assessment framework, inherently acknowledges the transient nature of any system attempting to meet static regulatory demands. Like all complex creations, both the AI models and the interpretability techniques employed are subject to the inevitable accrual of ‘technical debt’ in the face of evolving legal landscapes. As John von Neumann observed, ā€œThe best way to predict the future is to invent it.ā€ This rings true; rather than passively awaiting changes in the EU AI Act, the framework proactively addresses the challenge of aligning dynamic AI systems with fixed legal requirements – a necessary act of invention to ensure graceful aging within a complex regulatory environment. The framework doesn’t merely test; it actively shapes the future of compliant AI.

What Lies Ahead?

The mapping of legal statute to algorithmic property, as undertaken in this work, reveals less a solution and more a persistent translation problem. Every commit is a record in the annals, and every version a chapter – the EU AI Act, like any complex system, will accrue amendments, interpretations, and unforeseen edge cases. A compliance scoring, however precise at this juncture, is therefore a snapshot, not a guarantee. The temptation to treat it as such – to optimize for a static regulatory landscape – is a tax on ambition.

Future iterations of this framework must account for the decay inherent in both law and technique. Robustness, a key metric assessed herein, isn’t simply resistance to adversarial attack, but resilience to legal drift. Moreover, the very notion of ‘faithfulness’ – aligning explanation with model reasoning – demands continuous reevaluation as models themselves evolve, becoming ever more opaque. The field should turn toward methods that don’t merely explain decisions, but actively quantify the cost of non-compliance – a preemptive accounting of potential legal exposure.

Ultimately, the pursuit of ‘explainable AI’ isn’t about achieving perfect transparency – an illusion, perhaps – but about building systems that age gracefully within a shifting legal and technological environment. The true challenge lies not in demonstrating compliance today, but in designing for adaptability tomorrow.


Original article: https://arxiv.org/pdf/2604.09628.pdf

Contact the author: https://www.linkedin.com/in/avetisyan/
