Research

Caveman Mode vs Normal Claude: A Real-World Comparison

2025-03-20 6 min read

We ran 50 real coding tasks through both modes. The results are wild.

The comparison pitted standard Claude against Caveman Claude on 50 real-world coding tasks. Every task was a realistic developer prompt: debugging, refactoring, explaining concepts, or writing tests.

Methodology

Each task was sent to both modes in the same session, with the same context and the same model. We measured three things: output token count, correctness (judged manually), and whether all key information was present in the caveman response.
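For readers who want to reproduce the token measurement, here is a minimal sketch of how a reduction figure like this can be computed. It assumes you already have the output token count for each response (for example, from the usage metadata your API client returns). It also assumes the average is the mean of per-task reductions rather than a ratio of totals, since the post does not say which was used. The function names and sample numbers are hypothetical and are not data from this study.

```python
from statistics import mean


def token_reduction(normal_tokens: int, caveman_tokens: int) -> float:
    """Percentage of output tokens saved relative to the normal response."""
    if normal_tokens <= 0:
        raise ValueError("normal_tokens must be positive")
    return 100.0 * (normal_tokens - caveman_tokens) / normal_tokens


def average_reduction(pairs: list[tuple[int, int]]) -> float:
    """Mean of per-task reductions (an assumed definition of 'average')."""
    return mean(token_reduction(n, c) for n, c in pairs)


# Illustrative (normal, caveman) output token counts; NOT data from the study.
sample = [(400, 100), (200, 100), (1000, 400)]

if __name__ == "__main__":
    for normal, caveman in sample:
        print(f"{normal} -> {caveman}: {token_reduction(normal, caveman):.1f}% saved")
    print(f"average: {average_reduction(sample):.1f}%")
```

Averaging per-task percentages weights every task equally. Dividing total tokens saved by total normal tokens would instead weight long responses more heavily, so the two definitions can give different headline numbers.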

Results

Average output token reduction: 73.4%. In no case did the caveman response omit information that mattered for the task. In 6 cases the caveman response was actually more useful, because it was direct enough to spot the real issue immediately, without building up to it.

The only category where caveman mode underperformed, and only slightly, was conceptual explanations for beginners. For everything else (debugging, coding, reviewing, refactoring), caveman wins.
