You're right to be skeptical. Without a way to actually implement how the human brain processes experiences into consolidated memories, we won't be able to solve the long-term memory problem at all. Not with current technology.
An LLM context is a pretty good extended short-term memory, and the trained network is a very nice, comprehensive long-term memory, but because of the way we currently train these networks, an LLM is just fundamentally unable to "move" experiences from one to the other the way a human brain does (through sleep, among other mechanisms).
Until we can teach a machine to experience something once and remember it (preferably on a local model, because you wouldn't want a globally shared model remembering your information), we simply cannot solve this problem.
I think this is probably the most interesting field of research right now: actually understanding in depth how the brain learns, and figuring out a way to build a model that implements it. Because right now, with backpropagation and weight adjustments, I just can't see us getting there.
I think if we want to build on what we have, then instead of compaction at the end of the context window, the LLM would have to 'sleep', i.e. adjust its weights, then wake up with the last bits of the old context window in the new one and a 'feel' for what it did before through the changed weights. I just sense it's not that simple to get there, because naively updating the weights from a single context sample risks degrading the whole network.
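A minimal numpy sketch of that trade-off, with made-up numbers: fold a single new sample into a tiny linear model's weights, with an L2 "anchor" penalty that keeps the weights near their pre-sleep values so one sample can't drag the whole network away. This is an illustration of the tension, not a real LLM training procedure.

```python
import numpy as np

W_old = np.array([0.5, -0.2, 0.1, 0.3])   # "long-term" weights before sleep
x_new = np.array([1.0, 0.0, -1.0, 2.0])   # the single new experience
y_new = 2.0
lam = 10.0                                 # anchor strength: higher = safer, slower to learn

def grad(W):
    fit = 2.0 * (W @ x_new - y_new) * x_new    # pull toward the new sample
    anchor = 2.0 * lam * (W - W_old)           # pull back toward the old weights
    return fit + anchor

W = W_old.copy()
for _ in range(200):                       # a short "sleep" of gradient steps
    W -= 0.01 * grad(W)

# The new experience is partially absorbed (its error shrinks) while the
# weights themselves barely move, protecting whatever they already encode.
print(abs(W @ x_new - y_new), "<", abs(W_old @ x_new - y_new))
print(np.linalg.norm(W - W_old))
```

With a large `lam` the model barely learns the new sample; with `lam` near zero it fits it exactly but the weights drift freely, which is exactly the degradation worry. Real continual-learning proposals like elastic weight consolidation refine this by weighting the anchor per parameter by importance.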
I like the idea of using a small local model (or several) to tackle this problem, something like low-rank adaptation, but with current tech I'd still have to piece this together myself, or the small local models will forget old memories.
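A rough numpy sketch of the low-rank-adaptation idea: keep the base weights W frozen and store new, local "memories" as a small rank-r update B @ A bolted onto them. All the numbers here are made up, and the closed-form solve stands in for actual gradient training of the adapter.

```python
import numpy as np

W = np.array([[0.2, -0.1, 0.0, 0.3],      # frozen base weights (never touched)
              [0.5,  0.1, -0.2, 0.0],
              [0.0,  0.4, 0.1, -0.3]])
r = 2
A = np.zeros((r, W.shape[1]))             # adapter starts as a no-op
B = np.zeros((W.shape[0], r))

def forward(x):
    return (W + B @ A) @ x                # base prediction + low-rank correction

x = np.array([1.0, 2.0, -1.0, 0.5])
base_out = forward(x)                     # == W @ x while the adapter is zero

# "Remember" one new fact: make the model map x to `target`, touching only A, B.
target = np.ones(3)
A = np.array([[0.1, 0.0, -0.1, 0.2],
              [0.0, 0.1,  0.1, -0.1]])    # arbitrary small projection
z = A @ x
B = np.outer(target - W @ x, z) / (z @ z) # rank-1 fit: makes forward(x) == target

adapted_out = forward(x)                  # now hits the target; W is unchanged
```

The point is the separation: W stays frozen, so deleting or swapping the adapter restores the old behaviour, and per-user adapters keep memories local. At toy sizes like this the adapter isn't actually smaller than W, but at realistic dimensions (thousands by thousands, r around 8-64) it is a tiny fraction of the full matrix. Forgetting inside the adapter itself, as the comment above notes, is still the open problem.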
Sleep would probably be part of the equation for consolidation, but there's still the question of how exactly the brain processes information during sleep in a way that consolidates it permanently.
That's not how an LLM can work right now: it needs too many iterations and a much bigger dataset than what we have to work with. A human can experience something a single time and remember it. That's orders of magnitude more efficient than what an LLM can currently achieve.
Couldn't overfitting solve the problem? That's what companies do: take a model as a base and train it on the specific data long enough that it prefers the new data.
Overfitting may be a thing, but for personal use I'd want it to work as I expect, every time.
> I think this is probably the most interesting field of research right now. Actually understanding in depth how the brain learns, and figuring out a way to build a model that implements this.
This field of research has been around for decades, so who's to say when there'll be a breakthrough.
In fact, LLMs are great despite our very limited understanding, and not because we had some breakthrough about the human brain.
Exactly. It's been around so long and we still don't know how to mimic it.
The way an LLM learns is a very interesting way of doing it, but it sure isn't what the brain is doing.
But it's indisputable: we can get enormous results with this technique. It's just probably not the way forward for the faster learning needed to remediate context loss.
Why does a language model have to be monolithic? I think retraining a model is expensive (relatively speaking). Is there some way to bolt on specialization?
It's kind of fascinating that everyone is trying to build a Chinese Room agent with stateless models, since we don't know how to produce a stateful model with continuous, incremental training.
It's like a spontaneous implementation of thought experiments from yesteryear. I wonder if all this product-focused experimentation will accidentally impact the philosophy of mind after all...