> It ‘just’ needs a state for every possible random color between.

You can use skipgrams - prefixes with holes in them.
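To make "prefixes with holes" concrete, here is a minimal Python sketch of how skipgram features could be enumerated from a byte context. The function name `skipgram_features`, the `WILDCARD` marker, and the `max_len`/`max_holes` parameters are all made up for illustration; this is not taken from any particular implementation.

```python
from itertools import combinations

WILDCARD = None  # hypothetical marker for a skipped ("hole") position


def skipgram_features(context, max_len=3, max_holes=1):
    """Enumerate suffixes of `context` with up to `max_holes` positions masked.

    The oldest position of each suffix is never masked, so a feature of
    length n really spans n positions of history.
    """
    features = set()
    for n in range(1, min(max_len, len(context)) + 1):
        suffix = tuple(context[-n:])
        for holes in range(max_holes + 1):
            # Choose which of the positions 1..n-1 to replace with a hole.
            for masked in combinations(range(1, n), holes):
                features.add(
                    tuple(WILDCARD if i in masked else b
                          for i, b in enumerate(suffix))
                )
    return features


feats = skipgram_features(b"abc")
# Contains plain n-grams such as (98, 99) and skipgrams such as (97, None, 99).
```

A feature like `(97, None, 99)` matches any three-byte history that starts with `a` and ends with `c`, whatever sits in the middle, which is the kind of context the quoted "random color between" case would need.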

The Sparse Non-negative Matrix Language Model (SNMLM) [1] uses them with great success.

Pure n-gram language models would have a hard time computing escape weights for such contexts, but the mixture of probabilities used in SNMLM does not need them.
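A rough Python sketch of why a mixture sidesteps escape weights: every active feature that has been seen contributes its own relative-frequency estimate, and the estimates are simply averaged with non-negative weights. This is a simplified stand-in, not the exact SNMLM parameterization (which learns per-feature adjustments); the class name `MixtureModel` and the uniform default weights are assumptions made for illustration.

```python
from collections import defaultdict


class MixtureModel:
    """Toy mixture over feature-conditional frequencies (illustrative only)."""

    def __init__(self, alphabet_size=256):
        self.alphabet_size = alphabet_size
        self.counts = defaultdict(lambda: defaultdict(float))  # feature -> symbol -> count
        self.totals = defaultdict(float)                       # feature -> total count

    def update(self, features, symbol):
        for f in features:
            self.counts[f][symbol] += 1.0
            self.totals[f] += 1.0

    def predict(self, features, symbol, weights=None):
        """P(symbol | features) as a weighted average over seen features."""
        num = 0.0
        den = 0.0
        for f in features:
            total = self.totals.get(f, 0.0)
            if total == 0.0:
                continue  # unseen feature: it just drops out, no escape weight
            w = 1.0 if weights is None else weights.get(f, 1.0)
            num += w * self.counts[f].get(symbol, 0.0) / total
            den += w
        if den == 0.0:
            return 1.0 / self.alphabet_size  # nothing seen yet: uniform fallback
        return num / den


m = MixtureModel()
m.update(["a", "b"], 1)
m.update(["a"], 2)
p = m.predict(["a", "b"], 1)
```

Because features are only ever summed, an n-gram and a skipgram can both be active for the same position without any ordering or backoff chain between them.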

If I may, I've implemented an online per-byte version of SNMLM [2] that supports skipgrams. They make performance worse, but they can be used. The predictive performance of my implementation is within a few percent of an LSTM's on enwik8.

[2] https://github.com/thesz/snmlm-per-byte