Representing unstructured documents as vectors can be done in many ways. One very common approach is to use the well-known word2vec algorithm, and generalize it to documents level, which is also known as doc2vec.
A great python library to train such doc2vec models, is Gensim. And this is what this tutorial will show.