Yaron Vazana

NLP, Algorithms, Machine Learning, Data Science, tutorials, tips and more



How to Create a Simple WhatsApp Chatbot in Python using Doc2vec

December 25, 2018 by Yaron

Almost all of us use WhatsApp on a daily basis. Those conversations are essentially unstructured text that we can learn from and experiment with. In this tutorial I will show how to create a very simple chatbot that you can chat with, simply by training a doc2vec model on all the messages you already have on your phone.

[Image: WhatsApp chatbot with Python]

Disclaimer: This post and its implementation are based on the following great post, which appeared in Towards Data Science

If you’re just interested in the full Python notebook, it’s right here (I changed the original names)

At a high level, the steps are:

  • Loading your WhatsApp conversation into a Python DataFrame
  • Preparing a training set of (text, response) tuples, so that the chatbot can respond to your input
  • Training a Doc2Vec model
  • Implementing the chatbot conversation in Python
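
As a rough sketch of the first two steps (this is not the code from the notebook), the snippet below parses exported chat lines and pairs each incoming message with my next reply. The line format `date, time - sender: text`, the sample messages, and the helper names `parse_chat` and `build_pairs` are all assumptions for illustration; real WhatsApp exports differ by platform and locale, so the pattern will likely need adjusting.

```python
import re

# Assumed export line format: "12/25/18, 10:15 - Alice: Hello there"
# Real exports differ by platform and locale, so adjust the pattern.
LINE_RE = re.compile(r"^(\d{1,2}/\d{1,2}/\d{2,4}), (\d{1,2}:\d{2}) - ([^:]+): (.*)$")


def parse_chat(lines):
    """Return a list of {"sender", "text"} records, one per message."""
    messages = []
    for line in lines:
        line = line.rstrip("\n")
        match = LINE_RE.match(line)
        if match:
            _date, _time, sender, text = match.groups()
            messages.append({"sender": sender, "text": text})
        elif messages and line.strip():
            # A line without a header continues the previous message.
            messages[-1]["text"] += " " + line.strip()
    return messages


def build_pairs(messages, me):
    """Pair each message from someone else with my next reply."""
    pairs = []
    for prev, curr in zip(messages, messages[1:]):
        if prev["sender"] != me and curr["sender"] == me:
            pairs.append((prev["text"], curr["text"]))
    return pairs


# Hypothetical sample conversation
sample = [
    "12/25/18, 10:15 - Alice: Are you coming tonight?",
    "12/25/18, 10:16 - Yaron: Yes, see you at eight",
    "12/25/18, 10:17 - Alice: Great",
    "bring snacks",
    "12/25/18, 10:18 - Yaron: Will do",
]
messages = parse_chat(sample)
pairs = build_pairs(messages, me="Yaron")
print(pairs)
```

If you prefer to work with a DataFrame, the list of records can be passed directly to `pandas.DataFrame(messages)`.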

Let’s start…

Filed Under: Algorithms, Data Science, Python Tagged With: Data Science, doc2vec, python, word2vec

Visualizing Vectors using TensorBoard

August 11, 2018 by Yaron

Most machine learning algorithms require your data to be represented as numeric vectors, which are usually high-dimensional.

Visualizing those vectors before you run them through a machine learning process can often give you insights: it can tell you whether you’re heading toward the right solution, or at least warn you if you’re not.

This Python notebook contains a small script that can take a set of any n-dimensional vectors and “project” them onto a 2D/3D space using TensorBoard.

After visualizing your vectors, you can explore and cluster them using PCA or t-SNE.
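
One simple way to get vectors into TensorBoard's Embedding Projector, which is not necessarily what the notebook does, is to write them out as tab-separated files: one file with a vector per row and one metadata file with a label per row. The sketch below assumes that route; the function name `write_projector_files`, the file names, and the sample data are made up for illustration.

```python
import csv
import os
import tempfile


def write_projector_files(vectors, labels, out_dir):
    """Write vectors and labels as TSV files for the Embedding Projector."""
    if len(vectors) != len(labels):
        raise ValueError("each vector needs exactly one label")
    vec_path = os.path.join(out_dir, "vectors.tsv")
    meta_path = os.path.join(out_dir, "metadata.tsv")
    with open(vec_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f, delimiter="\t", lineterminator="\n")
        writer.writerows(vectors)  # one vector per line, tab-separated
    with open(meta_path, "w", newline="", encoding="utf-8") as f:
        # A single-column metadata file has no header row.
        for label in labels:
            f.write(f"{label}\n")
    return vec_path, meta_path


# Hypothetical 3-dimensional vectors with their labels
vectors = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6], [0.7, 0.8, 0.9]]
labels = ["doc_a", "doc_b", "doc_c"]
out_dir = tempfile.mkdtemp()
vec_path, meta_path = write_projector_files(vectors, labels, out_dir)
print(vec_path, meta_path)
```

Both files can then be loaded through the projector's "Load" option.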
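
To give a feel for what the PCA projection does under the hood, here is a minimal sketch using NumPy's SVD. It is not TensorBoard's implementation; the function name `pca_project` and the random input vectors are assumptions for illustration only.

```python
import numpy as np


def pca_project(vectors, n_components=2):
    """Project vectors onto their first n_components principal directions."""
    X = np.asarray(vectors, dtype=float)
    X_centered = X - X.mean(axis=0)  # PCA works on mean-centered data
    # Rows of vt are the principal directions, sorted by decreasing variance.
    _u, _s, vt = np.linalg.svd(X_centered, full_matrices=False)
    return X_centered @ vt[:n_components].T


# Hypothetical data: 50 random 10-dimensional vectors
rng = np.random.default_rng(0)
vectors = rng.normal(size=(50, 10))
points_2d = pca_project(vectors, n_components=2)
print(points_2d.shape)
```

The resulting 2D points can be fed to any scatter plot. For t-SNE, scikit-learn's `sklearn.manifold.TSNE` is a common choice outside TensorBoard.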

[Image: clustering using TensorBoard]

Filed Under: Data Science, Python Tagged With: Data Science, doc2vec, word2vec

Training a Doc2Vec Model with Gensim

January 20, 2018 by Yaron

Unstructured documents can be represented as vectors in many ways. One very common approach is to take the well-known word2vec algorithm and generalize it to the document level, an approach known as doc2vec.

Gensim is a great Python library for training such doc2vec models, and this tutorial shows how to use it.

[Image: Training a Doc2Vec Model with Gensim]

Filed Under: Algorithms, Data Science, Python Tagged With: Data Science, doc2vec, python, word2vec


