We've already got models with 1M and 800k context windows, and they still start "forgetting" things around the 200k-300k mark.
What use is a 10M context if degradation starts at 200k-300k?