A Markov chain is, as Google puts it, “a stochastic model describing a sequence of possible events in which the probability of each event depends only on the state attained in the previous event.”
One way I explain it: I’ve fed loads of fairy tales into a system, then I say “Once upon a” and ask you to predict the next word. You’ll probably say “time”. You’re guessing the next word based on the previous ones and your experience with fairy tales. We then move on to “upon a time”; the next word is probably “there”, at which point the sentence may continue with either “was” or “were”, as in “Once upon a time there was a…” or “Once upon a time there were…”. At this point we’ve got a split, and from here on out it’s nothing but forks and probability.
Each time you give it three words, it looks up all the next words it’s seen following them and randomly picks one, weighted by how often each shows up. That gives you a new set of three words (the last two of the previous three plus the new one), and you repeat forever.
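The lookup-and-pick loop above can be sketched in a few lines of Python. This is a minimal illustration, not production code: `build_chain` and `generate` are hypothetical names, and the tiny two-sentence corpus stands in for the “loads of fairy tales”. Storing each follower once per occurrence means `random.choice` automatically picks proportionally to how often each word showed up.

```python
import random
from collections import defaultdict

def build_chain(text, order=3):
    """Map each `order`-word prefix to every word seen following it.
    Repeats are kept, so random.choice picks by observed frequency."""
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - order):
        prefix = tuple(words[i:i + order])
        chain[prefix].append(words[i + order])
    return chain

def generate(chain, start, length=10):
    """Walk the chain from `start`, sliding the three-word window
    forward one word at a time: drop the oldest, append the pick."""
    prefix = tuple(start.split())
    out = list(prefix)
    for _ in range(length):
        followers = chain.get(prefix)
        if not followers:  # a prefix the chain has never seen: dead end
            break
        nxt = random.choice(followers)
        out.append(nxt)
        prefix = prefix[1:] + (nxt,)
    return " ".join(out)

# Stand-in corpus with the "was"/"were" fork from the fairy-tale example.
corpus = ("once upon a time there was a princess "
          "once upon a time there were three bears")
chain = build_chain(corpus)
print(generate(chain, "once upon a"))
```

Every run starts “once upon a time there”, then takes the “was” or “were” fork at random, exactly the split described above.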
Where Markov chains break down is that they can only give you next words they’ve already seen. Recent neural networks and AI learning systems, meanwhile, have jumped to “understanding” the connections between words and string them together in a much more coherent fashion.