Yaron Vazana

NLP, Algorithms, Machine Learning, Data Science, tutorials, tips and more




Identifying Real Estate Opportunities using Machine Learning

January 26, 2019 by Yaron

Real estate investment has always interested me. Geographical factors, together with human behavior patterns, can determine whether one place is more “wanted” than another.

My quest began when I decided to apply Machine Learning techniques to the real estate domain, to help me find my best “next investment”.

I was also curious about regression analysis, since I spend most of my time on classification tasks. So I thought, what could be better than doing an EDA on the topic?


Filed Under: Algorithms, Data Science, Python Tagged With: Data Science, python, real estate, regression

How to Create a Simple WhatsApp Chatbot in Python using Doc2vec

December 25, 2018 by Yaron

Almost all of us use WhatsApp on a daily basis. Those conversations are essentially unstructured text that we can learn from and experiment with. In this tutorial I will show how to create a very simple chatbot that you can chat with, simply by training a doc2vec model on all the messages you already have on your phone.


Disclaimer: This post and its implementation are based on the following great post, which appeared in Towards Data Science

If you’re only interested in the full Python notebook, it’s right here (I changed the original names)

At a high level, the steps would include:

  • Loading your WhatsApp conversation into a Python DataFrame
  • Preparing a training set of (text, response) tuples, so the chatbot will be able to respond to your input
  • Training a Doc2Vec model
  • Implementing the chatbot conversation in Python
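
The first two steps can be sketched roughly as follows. This is a minimal illustration, not the notebook's code: the export line format ("date, time - sender: text"), the regular expression, and the helper names `parse_chat` and `build_pairs` are all assumptions, and real WhatsApp exports differ between phones and locales.

```python
import re

# ASSUMPTION: export lines look like "12/25/18, 10:15 - Alice: Hello there".
# Real exports vary by phone and locale, so adjust the pattern as needed.
LINE_RE = re.compile(r"^(\d{1,2}/\d{1,2}/\d{2,4}), (\d{1,2}:\d{2}) - ([^:]+): (.*)$")


def parse_chat(lines):
    """Return a list of (sender, text) messages.

    Lines that do not match the pattern are treated as continuations
    of the previous message (multi-line messages).
    """
    messages = []
    for line in lines:
        line = line.rstrip("\n")
        match = LINE_RE.match(line)
        if match:
            _date, _time, sender, text = match.groups()
            messages.append((sender, text))
        elif messages and line.strip():
            sender, text = messages[-1]
            messages[-1] = (sender, text + " " + line.strip())
    return messages


def build_pairs(messages):
    """Pair each message with the next one when the sender changes."""
    pairs = []
    for (sender_a, text_a), (sender_b, text_b) in zip(messages, messages[1:]):
        if sender_a != sender_b:
            pairs.append((text_a, text_b))
    return pairs


# Made-up sample conversation, for illustration only.
sample = [
    "12/25/18, 10:15 - Alice: Hi, how are you?",
    "12/25/18, 10:16 - Bob: Good, thanks!",
    "and you?",
    "12/25/18, 10:17 - Alice: Great",
]
messages = parse_chat(sample)
pairs = build_pairs(messages)
```

If pandas is available, `pandas.DataFrame(messages, columns=["sender", "text"])` turns the parsed messages into a DataFrame. The (text, response) pairs are what a Doc2Vec model would later be trained on: the model finds the stored text closest to the user's input, and the bot answers with its paired response.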

Let’s start…


Filed Under: Algorithms, Data Science, Python Tagged With: Data Science, doc2vec, python, word2vec

Average Word Vectors – Generate Document / Paragraph / Sentence Embeddings

September 20, 2018 by Yaron

Applying word vectors to larger units of text, such as documents, paragraphs or sentences, is a very common technique in many NLP use cases.

Let’s look at the basic scenario where you have multiple sentences (or paragraphs) and you want to compare them with each other. In that case, representing each sentence as a fixed-length vector lets you measure the similarity between them, even though the sentences themselves can be of different lengths.

In this post, I will show a very common technique for generating embeddings for sentences / paragraphs / documents from existing pre-trained word embeddings: averaging the word vectors to create a single fixed-size embedding vector.
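
As a rough sketch of the averaging idea: the tiny 3-dimensional vectors below are made up for illustration (real pre-trained embeddings usually have hundreds of dimensions), and the helper names `average_vector` and `cosine_similarity` are hypothetical, not taken from the post.

```python
import math

# ASSUMPTION: toy 3-dimensional "pre-trained" vectors, invented for this example.
word_vectors = {
    "cat": [1.0, 0.0, 0.0],
    "dog": [0.8, 0.2, 0.0],
    "sat": [0.0, 1.0, 0.0],
    "car": [0.0, 0.0, 1.0],
}


def average_vector(tokens, vectors, dim=3):
    """Average the vectors of the known tokens; return zeros if none are known."""
    known = [vectors[token] for token in tokens if token in vectors]
    if not known:
        return [0.0] * dim
    # zip(*known) iterates over the vectors one dimension at a time.
    return [sum(column) / len(known) for column in zip(*known)]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Sentences of different lengths map to vectors of the same size.
emb_a = average_vector("the cat sat".split(), word_vectors)  # "the" is unknown and skipped
emb_b = average_vector("dog sat".split(), word_vectors)
emb_c = average_vector(["car"], word_vectors)

sim_ab = cosine_similarity(emb_a, emb_b)
sim_ac = cosine_similarity(emb_a, emb_c)
```

Plain Python lists keep the sketch dependency-free. With NumPy, the average is simply `np.mean(known, axis=0)`. Skipping out-of-vocabulary words, as done here, is one common choice; it means a sentence made up entirely of unknown words gets a zero vector.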


Filed Under: Algorithms, Data Science, Python

Training a Doc2Vec Model with Gensim

January 20, 2018 by Yaron

Unstructured documents can be represented as vectors in many ways. One very common approach is to take the well-known word2vec algorithm and generalize it to the document level, an approach known as doc2vec.

Gensim is a great Python library for training such doc2vec models, and this tutorial shows how to use it.


Filed Under: Algorithms, Data Science, Python Tagged With: Data Science, doc2vec, python, word2vec

Scala Website Crawler

April 13, 2017 by Yaron

Most Machine Learning models require a large amount of data, both for training and for testing.

Getting the data, even before dealing with the ML itself, can be hard and tedious, depending on the source you have and how accessible the data is.

Crawling a website or a blog is a convenient way to get the data you need. With relatively little effort, you can generate a CSV file containing all the data and analyze it more easily using state-of-the-art tools.


Filed Under: Algorithms, Scala Tagged With: Crawler, Data Science, Scala
