I suspect it's a matter of context: one or both agents forget that they're suppo...

swax · on May 20, 2024

Opus seems to be much better at that. Probably why it’s so much more expensive. AI companies have to balance costs. I wonder if the public has even seen the most powerful, full fidelity models, or if they are too expensive to run.

nerdponx · on May 21, 2024

Right, but this is also a core limitation in the transformer architecture. You only have very short-term memory (context) and very long-term memory (fixed parameters). Real minds have a lot more flexibility in how they store and connect pieces of information. I suspect that further progress towards something AGI-like might require more "layers" of knowledge than just those two.

When I read a book, for example, I do not keep all of it in my short-term working memory, but I also don't entirely forget what I read at the beginning by the time I get to the end: it's something in between. More layered forms of memory would probably allow us to return to smaller context windows.

swax · on May 22, 2024

I mean we have contexts now so large it dwarfs human short term memory right?

And then in terms of reading a book, a model's training could be updated with the book, right?