The AGI Evaluation Framework

Beyond Narrow Benchmarks

The Central Thesis

AGI is a Human-Level Architectural Challenge.

The term "AGI" is often used ambiguously. We define it rigorously by grounding it in Cattell-Horn-Carroll (CHC) theory, an established model of human intelligence. On this view, general intelligence is not monolithic but a composition of distinct, interdependent abilities.

1. The Ten Pillars of Cognition

The AGI Scorecard (0-100%)

K - General Knowledge

Commonsense, Science, History

RW - Reading & Writing

Decoding, Comprehension

M - Mathematical

Arithmetic, Algebra, Calculus

R - Reasoning

Fluid Intelligence, Deduction

WM - Working Memory

Active Attention (Short-Term)

MS - Long-Term Memory Storage

Continual Learning

MR - Long-Term Memory Retrieval

Fluency & Precision

V - Visual Processing

Perception & Reasoning

A - Auditory Processing

Speech & Sound

S - Speed

Reaction Time & Fluency
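
The scorecard above can be represented as a small data structure. The sketch below is illustrative only. It assumes each of the ten pillars contributes at most 10 points to the 0-100% total, an equal weighting that fits ten pillars on a 100-point scale but is not stated in this summary. The names PILLARS, MAX_POINTS_PER_PILLAR, and total_score are hypothetical.

```python
# Assumption: equal weighting, so each pillar is worth at most 10 points.
MAX_POINTS_PER_PILLAR = 10

# Pillar codes and names as listed in the scorecard.
PILLARS = {
    "K": "General Knowledge",
    "RW": "Reading & Writing",
    "M": "Mathematical",
    "R": "Reasoning",
    "WM": "Working Memory",
    "MS": "Long-Term Memory Storage",
    "MR": "Long-Term Memory Retrieval",
    "V": "Visual Processing",
    "A": "Auditory Processing",
    "S": "Speed",
}


def total_score(pillar_points: dict[str, float]) -> float:
    """Sum per-pillar points into a 0-100 total after validating the input."""
    if set(pillar_points) != set(PILLARS):
        raise ValueError("expected exactly one score per pillar code")
    for code, points in pillar_points.items():
        if not 0 <= points <= MAX_POINTS_PER_PILLAR:
            raise ValueError(f"{code} must be between 0 and {MAX_POINTS_PER_PILLAR}")
    return sum(pillar_points.values())


# Hypothetical profile (not a measured result): full marks on every pillar
# except long-term memory storage.
profile = {code: MAX_POINTS_PER_PILLAR for code in PILLARS}
profile["MS"] = 0
print(total_score(profile))  # 90
```

Under this assumed weighting, a single pillar at zero caps the total at 90% no matter how strong the other nine are, which is why a missing ability shows up directly in the score.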

2. The Critical Deficits

Why Current AI is Not AGI

Current LLMs exhibit a "jagged" cognitive profile: capable in some abilities, absent in others. Under this framework, they are still Narrow AI.

Critical Flaw 1: Amnesia in Long-Term Storage

MS Score: 0%

Problem: Models cannot stably acquire or store new experiential information. They lack plasticity.

Critical Flaw 2: Capability Contortions

Costly workarounds that create an illusion of general intelligence.

Massive Context Windows

vs. Lack of Long-Term Memory (MS)

RAG (Retrieval-Augmented Generation)

vs. Lack of Retrieval Precision (MR)

3. The "Jagged" AI Profile

Uneven Progress & Critical Bottlenecks

Applying this framework to contemporary models reveals a highly uneven cognitive profile: rapid progress overall, but critical bottlenecks in specific abilities.

Model          K/RW/M   R/WM/MR   V/A/S   Total AGI Score
GPT-4 (2023)   18%      6%        3%      27%
GPT-5 (2025)   29%      15%       13%     57%
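
The arithmetic behind the table can be checked directly. The sketch below copies the group subtotals and totals from the table and verifies that the three groups sum to each total. The table has no MS column, so the sketch assumes MS contributes 0 points, matching the MS score of 0% reported in section 2. The names TABLE, groups_match_total, and gains are hypothetical.

```python
GROUPS = ("K/RW/M", "R/WM/MR", "V/A/S")

# Values copied from the table, in percentage points.
TABLE = {
    "GPT-4 (2023)": {"K/RW/M": 18, "R/WM/MR": 6, "V/A/S": 3, "Total": 27},
    "GPT-5 (2025)": {"K/RW/M": 29, "R/WM/MR": 15, "V/A/S": 13, "Total": 57},
}

# Assumption: MS is absent from the table's groups and scores 0 for both models.
MS_POINTS = 0


def groups_match_total(row: dict[str, int]) -> bool:
    """Return True if the group subtotals plus MS add up to the reported total."""
    return sum(row[group] for group in GROUPS) + MS_POINTS == row["Total"]


def gains(old: dict[str, int], new: dict[str, int]) -> dict[str, int]:
    """Per-column change in percentage points between two table rows."""
    return {key: new[key] - old[key] for key in (*GROUPS, "Total")}


for model, row in TABLE.items():
    print(model, groups_match_total(row))
# GPT-4 (2023) True
# GPT-5 (2025) True

print(gains(TABLE["GPT-4 (2023)"], TABLE["GPT-5 (2025)"]))
# {'K/RW/M': 11, 'R/WM/MR': 9, 'V/A/S': 10, 'Total': 30}
```

The 30-point gain is spread fairly evenly across the three groups, while the MS contribution stays at zero in both rows.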