XAI, Sentience & Safety

New tools and studies for interpreting complex AI decisions

Making Sense of Black-Box AI

This cluster showcases a wave of work turning opaque models into explainable systems across domains like healthcare, energy, cybersecurity, and language. Researchers propose new explanation methods (e.g., archetypal analysis, hybrid counterfactual/feature attributions, multi-scale entropy for transformers, Bayesian uncertainty estimation) and apply them to tasks such as early Alzheimer’s and diabetes detection, sepsis decision support, and household demand flexibility. Several items examine human and societal dimensions: how clinicians understand SHAP outputs, whether counterfactual metrics align with user needs, how explanations affect trust, and how explanation mechanisms themselves can be attacked. Together, these efforts frame explainable AI as both a technical and cognitive “translation” layer between machine computation and human reasoning, one that is especially critical in high-stakes and foundation-model settings.

Sources (18)
Updated Mar 18, 2026