Micah Stubbs' Weblog

1 post tagged “llm-benchmarks”

2026

Building a functional consciousness eval suite for LLMs

I spent last night at the AGI House Engineering Consciousness Hackathon in San Francisco building an eval harness that tries to answer a question I find genuinely hard to let go of: when an LLM says “I’m uncertain about this,” does anything …