Yaron Vazana

NLP, Algorithms, Machine Learning, Data Science, tutorials, tips and more

  • About
  • Blog
  • Projects
  • Medium

Contact Me

yaronv99 [at] gmail.com

Powered by Genesis

You are here: Home / Archives for python

Training an AutoEncoder to Generate Text Embeddings

September 28, 2019 by Yaron Leave a Comment

Calculating sentences / paragraphs vectors can be done in many ways. For example a simple method  is to average all the words vectors and retrieve a single vector for the entire piece of text, of course this forces you to have pre-calculated word embeddings (I already wrote about it here).

In this post I will show a different approach that uses an AutoEncoder

The aim of an AutoEncoder is to learn a representation (encoding) for a set of data, typically for dimensionality reduction, by training the network to ignore signal “noise”.

autoencoder model diagram
[Read more…]

Filed Under: Data Science, Python Tagged With: autoencoder, Data Science, machine learning, python

Using Dockers for your Data Science Dev Environment

May 15, 2019 by Yaron Leave a Comment

This post is a bit different than all other posts I usually publish. In this post I decided to talk about my data science development workflow, and how I utilize dockers in my daily work. I truly believe in great tools which make us more productive and help us focus on our main problem at hand.

I will start with why would we even want to use dockers? and what’s wrong with the usual way data scientists work with notebooks?

[Read more…]

Filed Under: Data Science, Python Tagged With: Data Science, docker, python

Identifying Real Estate Opportunities using Machine Learning

January 26, 2019 by Yaron Leave a Comment

Real Estate investments have always been something I was really interested in. The geographical factors, together with humans’ behavior patterns, have the power to determine whether one place is more “wanted” than another.

My quest began when I decided to utilize Machine Learning techniques in the Real Estate domain, in order to help me find my best “next investment”.

In addition, I was also very curious about regression analysis, since most of my time I’m dealing with classification tasks. So I thought what could be better than doing an EDA on the topic.

real estate using machine learning
real estate together with machine learning
[Read more…]

Filed Under: Algorithms, Data Science, Python Tagged With: Data Science, python, real estate, regression

How to Create a Simple WhatsApp Chatbot in Python using Doc2vec

December 25, 2018 by Yaron Leave a Comment

Almost all of us use whatsapp on a daily basis. Those conversations are basically unstructured text that we can use in order to learn and experiment. In this tutorial I will show how to create a very simple chatbot, that you can chat with, simply by training a doc2vec model using all the messages you already have on you phone.

whatsapp chatbot with python

Disclaimer: This post and implementation is based on the following great post which appeared in toward-data-science

If you’re just interested in the full python notebook, it’s right here (I changed the original names)

At a high level, the steps would include:

  • Loading your whatsapp conversation into a python DataFrame
  • Preparing a training set of (text, response) tuples – so the chatbot will be able to respond to your input
  • Training a Doc2Vec model
  • Implementing the chatbot conversation in python

Let’s start…

[Read more…]

Filed Under: Algorithms, Data Science, Python Tagged With: Data Science, doc2vec, python, word2vec

Training a Doc2Vec Model with Gensim

January 20, 2018 by Yaron 2 Comments

Representing unstructured documents as vectors can be done in many ways. One very common approach is to use the well-known word2vec algorithm, and generalize it to documents level, which is also known as doc2vec.

A great python library to train such doc2vec models, is Gensim. And this is what this tutorial will show.

Training Doc2Vec Model with Gensim

[Read more…]

Filed Under: Algorithms, Data Science, Python Tagged With: Data Science, doc2vec, python, word2vec

SUBSCRIBE TO BLOG

Subscribe to Blog

Subscribe to get the latest posts to your inbox

Recent Posts

  • Training an AutoEncoder to Generate Text Embeddings
  • Using Dockers for your Data Science Dev Environment
  • Identifying Real Estate Opportunities using Machine Learning
  • How to Create a Simple WhatsApp Chatbot in Python using Doc2vec
  • Average Word Vectors – Generate Document / Paragraph / Sentence Embeddings
  • Visualizing Vectors using TensorBoard
  • Training a Doc2Vec Model with Gensim
 

Loading Comments...