XAI, Sentience & Safety · Apr 15 Daily Digest
New XAI Papers
- 🔥 Interpretable MIL for Hematologic Diagnosis: CAREMIL framework with DeepHeme encoder achieves AUROCs of 0.999 for AML, 0.891...

Curated XAI papers, AI consciousness discussions, neuromorphic breakthroughs, and safety policy news
Hot take from @mattshumer_: Even OpenAI is dramatically underestimating how much inference compute will be needed in coming years.
New paper "Towards Long-horizon Agentic Multimodal Search" advances agentic multimodal search for long-horizon planning, pushing benchmarks toward real-world multimodal scenarios.
Emerging XAI tools turn foundation models into explainable medical predictors.
Divergent AI safety paths emerge:
Mythos shines in multistep attacks but faces quick counters and its own flaws:
Deaf-led advocacy advances equitable AI safety: CoSET's SAFE AI Task Force toolkit evaluates automated sign language tools for quality, safety, and...
Post-hoc XAI methods—feature attribution, counterfactuals, and natural language rationales—enhance trust by helping users understand hate speech moderation decisions. These complement ante-hoc approaches in stakeholder-focused explainability.
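Feature attribution, one of the post-hoc methods mentioned above, can be illustrated with a minimal sketch. The classifier below is a hypothetical toy stand-in (a blocklist-based scorer), not any system from these items; the attribution technique shown is simple leave-one-out token ablation.

```python
# Minimal sketch of post-hoc feature attribution for a toy
# moderation classifier, via leave-one-out token ablation.
# Assumption: `toxicity_score` is a hypothetical stand-in model.

def toxicity_score(tokens):
    """Toy scorer: fraction of tokens on a hypothetical blocklist."""
    blocklist = {"idiot", "stupid"}
    return sum(t in blocklist for t in tokens) / max(len(tokens), 1)

def leave_one_out_attribution(tokens):
    """Credit each token with the score drop caused by removing it.

    A large positive value means the token pushed the text toward
    the 'toxic' decision; negative means it diluted the score.
    """
    base = toxicity_score(tokens)
    return {
        tok: base - toxicity_score(tokens[:i] + tokens[i + 1:])
        for i, tok in enumerate(tokens)
    }

attributions = leave_one_out_attribution(["you", "are", "stupid"])
# The blocklisted token receives the highest attribution,
# giving the user a concrete reason for the moderation decision.
```

Real systems replace the toy scorer with a trained model and use gradient- or perturbation-based attribution (e.g. SHAP-style methods), but the user-facing idea is the same: which inputs drove this decision.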
A new paper repositions XAI by arguing that justifiability, explored through several ways an algorithm can be justified, supersedes mere explainability as the goal for ethical AI.
Explainable AI (XAI) is framed as both a technology and a form of law in financial advisory, grounded in foundational principles.
Anthropic's Long-Term Benefit Trust has appointed Vas Narasimhan, CEO of Novartis with over two decades in medicine and global health, to its Board of Directors. Key move to infuse biomed expertise into AI safety governance.
Cloud credit cycles expose a hidden financial loop: Microsoft invests $13B in OpenAI mostly as Azure credits; OpenAI trains models, boosting...
Rising interpretability techniques target model internals:
AI tools accelerate literature search, experimentation, and drafting, sparking a structural explosion in submissions at AI conferences like NeurIPS and ICLR. A position paper urges: let papers flow.
AI chatbots misdiagnose over 80% of early medical cases, per an @FT study reposted by Gary Marcus. Stark evidence of LLM limitations in critical applications, and a case for urgent advances in explainability for AI safety.