JACOB VILLEGAS

N-grams

One of the fundamental building blocks of our language model is the n-gram: a contiguous sequence of n words or tokens taken from a given sample of text or speech. Here is a small example that builds a dictionary of n-grams (unigrams and bigrams) and uses it to classify sample text in three different languages.
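The idea can be sketched roughly as follows. This is a minimal illustration, not the author's implementation: it assumes lowercase, whitespace-split tokens, and the function names (`extract_ngrams`, `build_ngram_counts`, `classify`), the toy training sentences, and the overlap-based scoring rule are all hypothetical choices made for the example.

```python
from collections import Counter


def extract_ngrams(tokens, n):
    """Return every contiguous run of n tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def build_ngram_counts(text, max_n=2):
    """Count all 1-grams up to max_n-grams in the text.

    Assumption: tokens are lowercase and split on whitespace.
    """
    tokens = text.lower().split()
    counts = Counter()
    for n in range(1, max_n + 1):
        counts.update(extract_ngrams(tokens, n))
    return counts


def classify(text, profiles, max_n=2):
    """Pick the language whose n-gram dictionary overlaps most with the text.

    Assumption: the score is the number of shared n-grams, capped by the
    count seen in each language profile. Real systems usually use
    smoothed probabilities instead.
    """
    sample = build_ngram_counts(text, max_n)
    scores = {
        lang: sum(min(count, profile[gram]) for gram, count in sample.items())
        for lang, profile in profiles.items()
    }
    return max(scores, key=scores.get)


# Hypothetical toy training text, one sentence per language.
profiles = {
    "english": build_ngram_counts("the cat sat on the mat and the dog ran"),
    "spanish": build_ngram_counts("el gato se sentó en la alfombra y el perro corrió"),
    "french": build_ngram_counts("le chat est assis sur le tapis et le chien court"),
}

print(extract_ngrams(["the", "cat", "sat"], 2))
print(classify("the dog sat", profiles))
print(classify("el perro", profiles))
print(classify("le chien", profiles))
```

Storing n-grams as tuples keeps unigrams and bigrams in a single dictionary without ambiguity, and a `Counter` returns 0 for unseen n-grams, which keeps the scoring loop short.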
