XAI, Sentience & Safety

New tools and studies for interpreting complex AI decisions

Making Sense of Black-Box AI

This cluster showcases a wave of work turning opaque models into explainable systems across domains like healthcare, energy, cybersecurity, and language. Researchers propose new explanation methods (e.g., archetypal analysis, hybrid counterfactual/feature attributions, multi-scale entropy for transformers, Bayesian uncertainty estimation) and apply them to tasks such as early Alzheimer’s and diabetes detection, sepsis decision support, and household demand flexibility. Several items examine human and societal dimensions: how clinicians understand SHAP outputs, whether counterfactual metrics align with user needs, how explanations affect trust, and how explanation mechanisms themselves can be attacked. Together, these efforts frame explainable AI as both a technical and cognitive “translation” layer between machine computation and human reasoning, one that is especially critical in high-stakes and foundation-model settings.

Sources (18)
Updated Mar 18, 2026