"Also notable: 4.7 now defaults to NOT including a human-readable reasoning token summary in the output, you have to add "display": "summarized" to get that"
I did not follow all of this, but wasn't there something about those reasoning tokens not representing the internal reasoning, but rather being a rough approximation that can be rather misleading about what the model actually does?
The reasoning is the secret sauce. They don't output that. But to give you some feedback about what is going on, they pass this reasoning through another model that generates a human-friendly summary (and that summarization deliberately destroys the signal that could otherwise be copied by competitors).
My assumption is the model no longer actually thinks in tokens, but in internal tensors. This is advantageous because it doesn't have to collapse each decision to a single token and can simultaneously propagate many concepts per context position.
I would expect to see a significant wall-clock improvement if that were the case - Meta's Coconut paper was ~3x faster than token-space chain-of-thought because latents carry a lot more information than individual tokens.
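To make the Coconut idea concrete, here is a toy sketch of latent chain-of-thought: instead of sampling a token and re-embedding it at each reasoning step, the final hidden state is fed straight back in as the next position's input embedding, and only the final answer is collapsed to a token. The architecture, dimensions, and step count below are stand-ins, not anyone's production model.

```python
# Toy latent ("Coconut"-style) chain of thought. Everything here is a placeholder
# meant to show the mechanism, not a real model.
import torch
import torch.nn as nn

VOCAB, DIM, STEPS = 1000, 64, 8

embed = nn.Embedding(VOCAB, DIM)
layer = nn.TransformerEncoderLayer(d_model=DIM, nhead=4, batch_first=True)
trunk = nn.TransformerEncoder(layer, num_layers=2)
unembed = nn.Linear(DIM, VOCAB)

prompt = torch.randint(0, VOCAB, (1, 16))          # pretend prompt tokens
seq = embed(prompt)                                 # (1, 16, DIM)

# "Thinking" happens in latent space: no argmax/sampling, no token bottleneck.
for _ in range(STEPS):
    hidden = trunk(seq)                             # run the trunk over the sequence
    latent_thought = hidden[:, -1:, :]              # last position's hidden state
    seq = torch.cat([seq, latent_thought], dim=1)   # append it as the next input

# Only the final answer is collapsed back to tokens.
logits = unembed(trunk(seq)[:, -1, :])
answer_token = logits.argmax(dim=-1)
print(answer_token)
```

The wall-clock argument follows from this loop: each latent step replaces what would otherwise be many decoded tokens, since a hidden state can carry far more information per position than one vocabulary item.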
Separately, I think Anthropic are probably the least likely of the big 3 to release a model that uses latent-space reasoning, because it's a clear step down in the ability to audit CoT. There has even been some discussion that they accidentally "exposed" the Mythos CoT to RL [0] - I don't see how you would apply a reward function to latent space reasoning tokens.
There’s also a paper [0] from many well known researchers that serves as a kind of informal agreement not to make the CoT unmonitorable via RL or neuralese. I also don’t think Anthropic researchers would break this “contract”.
> If that's true, then we're following the timeline
Literally just a citation of Meta's Coconut paper[1].
Notice the AI 2027 folks' contribution to the prediction is that this will have been implemented by "thousands of Agent-2 automated researchers...making major algorithmic advances".
So, considering that discussion of latent-space reasoning dates back to 2022[2] - through CoT unfaithfulness, looped transformers, using diffusion to refine latent-space thoughts, etc. - all published before AI 2027, to be "following the timeline of ai-2027" we'd actually need to verify not only that this was happening, but that it was implemented via major algorithmic advances made by thousands of automated researchers; otherwise they don't seem to have made a contribution here.
> For example, perhaps models will be trained to think in artificial languages that are more efficient than natural language but difficult for humans to interpret.
The first 500 or so tokens are raw thinking output, then the summarizer kicks in for longer thinking traces. Sometimes longer thinking traces leak through, or the summarizer model (i.e. Claude Haiku) refuses to summarize them and instead includes a direct quote of the passage it won't summarize. The summarizer prompt can be viewed [here](https://xcancel.com/lilyofashwood/status/2027812323910353105...), among other places.
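One plausible reading of that behaviour, sketched below: the first ~500 thinking tokens pass through verbatim and anything beyond that is handed to a cheaper summarizer model. The function names, the exact cutoff handling, and the stand-in summarizer are all my guesses from the comment, not Anthropic's actual implementation.

```python
# Hedged sketch of the described display logic; cutoff and structure are assumed.
RAW_PREFIX_TOKENS = 500

def display_thinking(thinking_tokens: list[str], summarize) -> str:
    raw_part = thinking_tokens[:RAW_PREFIX_TOKENS]
    overflow = thinking_tokens[RAW_PREFIX_TOKENS:]
    if not overflow:
        return "".join(raw_part)                     # short traces shown as-is
    # Longer traces: keep the raw prefix, summarize the rest with a smaller model
    # (the comment suggests Claude Haiku plays this role).
    return "".join(raw_part) + "\n[summary] " + summarize("".join(overflow))

# Stand-in summarizer, for illustration only.
print(display_thinking(["tok "] * 650, summarize=lambda text: f"{len(text)} chars condensed"))
```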
Are you sure? It would be great to get official/semi-official validation that thinking is or is not resolved to a token embedding value in the context.
Although it's more likely they are protecting the secret sauce in this case, I'm wondering if there is an alternate explanation: that LLMs reason better when NOT trying to reason in natural-language output tokens, but instead implement reasoning further upstream in the transformer.
I would doubt it. They are mostly trained on natural language. They may be getting some visual reasoning capability from multi-modal training on video, but their reasoning doesn't seem to generalize much from one domain to another.
Some future AGI, not LLM based, that learns from its own experience based on sensory feedback (and has non-symbolic feedback paths) presumably would at least learn some non-symbolic reasoning, however effective that may be.
My argument for this is mostly that we don't use language for all forms of reasoning, and likely do some of it over internal representations or embeddings. Animals also demonstrate the ability to reason about situations without having a language at all.
I see language more as a protocol for inter-agent communication (including human-to-human communication), but it carries a lot of inefficiencies and historical baggage and is not necessarily the optimal representation of ideas within a brain.
'Hey Claude, these tokens are utter unrelated bollocks, but obviously we still want to charge the user for them regardless. Please construct a plausible explanation as to why we should still be able to do that.'