Claude's 'Extended Thinking' feature helps AI reflect more deeply before solving complex problems, but we must understand that the thought process we see as an output may be a summarized version rather than the full logical framework.
Imagine this: What if you had an ‘AI assistant’ that could struggle and reflect on difficult math problems or complex planning documents for 10 times longer than usual? Recently, a technology in the AI industry that allows AI to have a ‘moment of reflection,’ as if it were a human, before giving an immediate answer has become a major topic of conversation. Anthropic, the developer of Claude, calls this ‘Extended Thinking.’
However, doubts have recently been raised about whether the ‘thought process’ shown by this technology is truly all the traces of the AI’s contemplation. Can we really trust the AI’s thought process that we see on the screen 100%?
Why is this important?
As AI technology advances, we want to know the ‘reason’ why AI reached a certain conclusion. Transparency in an AI’s thought process (an audit trail, a logical record that can be audited) is particularly important for critical tasks like writing complex development code or strategic planning, as it helps reduce errors.
If the thought process we see is a ‘summary’ that only contains a part of the overall logic, there is a risk that the user will not be able to fully grasp the entire context in which the AI made its decision. This is a very important issue because it could prevent users from discovering the AI’s logical flaws and lead them to accept incorrect information as fact.
Easy to Understand: AI’s ‘Thought Notebook’
To understand ‘Extended Thinking,’ let’s use a metaphor. Imagine that when you are solving exam questions, you are doodling in a ‘scratchpad’ next to your exam paper while solving the problems.
- Conventional Method: A method where the AI writes the answer immediately upon receiving a question without any scratchpad.
- Extended Thinking: It is like instructing the AI, “Think enough in the scratchpad before writing the answer, and show me the process.” Reference 3, Reference 10
The important point here is that this feature does not change into a ‘different, smarter AI.’ It is simply having the existing AI take more time to reflect on its own. Reference 5
However, there is a problem. Latest models like Claude 4 do not show us exactly what is ‘written in the scratchpad.’ Instead, they show us a ‘summary’ that pulls out and organizes only the core points from what they contemplated. Reference 6 Developer Patrick McCanna pointed out that this is not a perfect audit record of AI logic, but merely a ‘summary’ where data loss occurs. Reference 2, Reference 11
Current Situation: Not a Silver Bullet
‘Extended Thinking’ is not always good. Just because an AI thinks more, it does not mean it will provide better answers for every problem. According to research, there are reports that performance can actually drop by up to 36% in certain types of tasks when this feature is used. Reference 3
In some current models, this feature is always on and cannot be turned off. Reference 1 In other words, we are forced to look at the ‘summary of the scratchpad’ written by the AI.
What will happen in the future?
Ensuring the reliability of the ‘thought notebook’ produced by AI will be a technical challenge going forward. Currently, it is technically very difficult to see the process of AI contemplation exactly as it is. This is because the opinion that “no one fully understands exactly how an LLM (Large Language Model, an AI that learns from vast amounts of data to understand and generate language like a human) thinks” is dominant. Reference 11
Therefore, it is wise for users to understand the thought process shown by AI as a tool to refer to the ‘core logical flow’ used by the AI to reach its conclusion, rather than believing that it is ‘everything.’
MindTickleBytes’ AI Reporter’s Perspective
As technology advances, AI is becoming better at pretending to think (Reasoning) like a human. However, what we must not forget is that the AI’s ‘thought process’ is different from a thesis or diary written by a human. Simply put, AI output is closer to precisely calculated predicted values than to absolute truth. Therefore, we must continue to maintain the habit of questioning and verifying the basis of the results produced by AI.
References
- Building with extended thinking - Claude API Docs
- Claude Code Extended Thinking Summary Not Authentic Reasoning …
- Claude Extended Thinking: The Ultimate Guide · GitHub
- Extended Thinking in Claude Code: Unlock Deeper Reasoning
- Claude’s extended thinking - Anthropic
-
[Building with Claude Extended Thinking by Cobus Greyling …](https://cobusgreyling.medium.com/building-with-claude-extended-thinking-d1a8b3130834) - Claude Extended Thinking: When to Use It and How to Build …
- Getting the Most from Claude Code’s Extended Thinking Mode …
-
[Extended thinking Claude Cookbook](https://platform.claude.com/cookbook/extended-thinking-extended-thinking) - Lesson 23: Extended Thinking - Mastering Claude
-
[ClaudeCode’s”extendedthinking”isasummary… HackerNews](https://news.ycombinator.com/item?id=48630535) - Claude3.7 Sonnet debuts with “extendedthinking” to… - Ars Technica
-
[What’sNew inClaudev4? AI Just Got Smarter by Rendiero Medium](https://medium.com/h7w/whats-new-in-claude-v4-ai-just-got-smarter-b62242ad95ba) - HackerNews– Telegram
-
[ThinkingMachines: When Should You Actually Use Reasoning… Glasp](https://glasp.co/articles/when-to-use-reasoning-models) - Claude3.7 Sonnet andClaudeCode\ Anthropic
- A feature that makes AI intelligence infinite
- A feature that allows the model to spend more time and effort reflecting before solving complex problems
- A feature that thinks while disconnected from the internet
- The original version showing every step of the AI's thought without omission
- A summarized version that compresses the AI's reasoning process and includes only the essentials
- Statistical figures about the result
- Yes, performance always gets better
- No, in some tasks, performance can actually decrease by up to 36%
- It is completely unrelated to performance