Speaking of compression, either my present memory or my mind at the time mushed together two different sorts of 1/f scaling laws under the heading of Zipf’s Law, but the overarching principle here is simply “things that vary inversely with frequency”. Generally speaking, keeping track of usage frequencies is part and parcel of building efficient codes.
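To make the link between usage frequencies and efficient codes concrete, here is a minimal sketch in Python. It uses Huffman coding as a stand-in for "an efficient code"; nothing above says the Learner used Huffman codes, and the function name `huffman_code_lengths` and the sample sentence are assumptions made purely for illustration.

```python
import heapq
from collections import Counter


def huffman_code_lengths(freqs):
    """Return {symbol: code length in bits} for a Huffman code built from freqs.

    Illustrative helper only; not taken from the Learner program.
    """
    if len(freqs) == 1:
        # A lone symbol still needs one bit to be written down.
        return {sym: 1 for sym in freqs}
    # Heap entries: (total count, unique tie-breaker, {symbol: depth so far}).
    heap = [(count, i, {sym: 0}) for i, (sym, count) in enumerate(freqs.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        # Merge the two least frequent subtrees; every symbol in them
        # moves one level deeper, i.e. gains one bit.
        c1, _, d1 = heapq.heappop(heap)
        c2, _, d2 = heapq.heappop(heap)
        merged = {sym: depth + 1 for sym, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (c1 + c2, tie, merged))
        tie += 1
    return heap[0][2]


# Count how often each word is used, then derive code lengths from the counts.
usage = Counter("the cat and the dog and the bird".split())
lengths = huffman_code_lengths(usage)
total_bits = sum(usage[word] * lengths[word] for word in usage)
```

The frequently used words end up with codes no longer than the rare ones, which is the sense in which code length varies inversely with frequency of use.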
In its first application, then, the environment the Learner had to learn was the usage behavior of its user, as given by finite sequences of characters from a finite alphabet, which we might as well call words, and by finite sequences of those words, which we might as well call phrases or sentences. In other words, the Learner had the job of constructing a user model.
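As a rough picture of what such a user model might keep track of, the Python sketch below counts words and the words that follow them in observed phrases. The class `UserModel`, its methods, and the sample phrases are all hypothetical; this is one simple way to record usage, not a reconstruction of how the Learner actually did it.

```python
from collections import Counter, defaultdict


class UserModel:
    """Hypothetical usage model: word counts plus next-word counts."""

    def __init__(self):
        self.word_counts = Counter()
        self.next_counts = defaultdict(Counter)

    def observe(self, phrase):
        """Record one phrase, given as whitespace-separated words."""
        words = phrase.split()
        self.word_counts.update(words)
        # Count each adjacent pair (previous word, next word).
        for prev, nxt in zip(words, words[1:]):
            self.next_counts[prev][nxt] += 1

    def suggest(self, prev, k=3):
        """Return up to k words most often seen right after prev."""
        followers = self.next_counts.get(prev, Counter())
        return [word for word, _ in followers.most_common(k)]


model = UserModel()
for phrase in ["list files", "list files", "list users"]:
    model.observe(phrase)

suggestions = model.suggest("list")
```

After the three phrases above, `suggestions` ranks "files" ahead of "users", since the user typed it more often after "list".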
In that frame of mind, we are not seeking anything so grand as a Universal Induction Algorithm but simply looking for any approach that gives us a leg up, complexity-wise, in Interactive Real Time.
To be continued …