Yaron Vazana

NLP, Algorithms, Machine Learning, Data Science, tutorials, tips and more

Identifying Real Estate Opportunities using Machine Learning

January 26, 2019 by Yaron

Real estate investing has always interested me. Geographical factors, together with human behavior patterns, have the power to determine whether one place is more “wanted” than another.

My quest began when I decided to utilize Machine Learning techniques in the Real Estate domain, in order to help me find my best “next investment”.

In addition, I was curious about regression analysis, since I spend most of my time on classification tasks. So I thought: what could be better than doing an EDA on the topic?

Figure: real estate together with machine learning

TL;DR
As always, if you’re just interested in the Python notebook, here’s the link.

Before you continue reading, note that all the cool and interesting stuff is in the notebook itself.

The Intuition

To identify real estate opportunities, I used the algorithm described in this great arXiv paper.

In general, the idea is to train a model that can predict house prices (in a certain district). Then, predict the prices of all the listings that are “for sale”, and identify those with a low price tag but a high predicted market value.

Figure: finding the best opportunity

Getting The Data

In this analysis I used data taken from Madlan (the Israeli equivalent of the well-known Zillow real estate website).

The data consists of 288 “sold” records, which are used as the training set, and another 286 “for sale” listings, which form the set where we will look for opportunities.

The Workflow

Since this blog post is just a companion summary to the Python notebook, I’m only going to outline the main steps I covered in the code:

  • Data Cleaning & Feature Engineering
    • Remove unneeded columns
    • Handle missing values
    • Handle date features
    • Generate more useful features
    • Feature scaling and normalization
    • Univariate analysis (feature histograms)
    • Bivariate analysis (scatter plots of features against the target variable)
    • One-hot encode categorical features
  • Modeling
    • Linear Regression
    • Ridge Regression
    • Lasso Regression
    • Random Forest
    • SVM Regressor (SVR)
    • Deep Neural Network
  • Predicting the best real estate opportunities
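
As a rough illustration of how the preprocessing and modeling steps above can fit together, here is a sketch using scikit-learn on synthetic data. It is an assumption-laden example, not the notebook's code: the column names (`area_sqm`, `rooms`, `neighborhood`, `price_k`), the generated data, and the hyperparameters are all hypothetical, and the deep neural network is left out to keep it short.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Lasso, LinearRegression, Ridge
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.svm import SVR

# Synthetic stand-in for the "sold" records; every column name is hypothetical.
rng = np.random.default_rng(0)
n = 200
sold = pd.DataFrame({
    "area_sqm": rng.uniform(40, 160, n),
    "rooms": rng.integers(1, 7, n),
    "neighborhood": rng.choice(["north", "center", "south"], n),
})
premium = sold["neighborhood"].map({"north": 0.0, "center": 300.0, "south": -100.0})
sold["price_k"] = (
    12 * sold["area_sqm"] + 40 * sold["rooms"] + premium + rng.normal(0, 50, n)
)

X = sold.drop(columns="price_k")
y = sold["price_k"]


def make_preprocessor():
    """Scale the numeric features and one-hot encode the categorical one."""
    return ColumnTransformer([
        ("num", StandardScaler(), ["area_sqm", "rooms"]),
        ("cat", OneHotEncoder(handle_unknown="ignore"), ["neighborhood"]),
    ])


# The regressors from the list above, with illustrative hyperparameters.
models = {
    "linear": LinearRegression(),
    "ridge": Ridge(alpha=1.0),
    "lasso": Lasso(alpha=1.0),
    "random_forest": RandomForestRegressor(n_estimators=100, random_state=0),
    "svr": SVR(kernel="linear", C=100.0),
}

# 5-fold cross-validated R^2 for each model, preprocessing refit inside each fold.
scores = {}
for name, model in models.items():
    pipeline = make_pipeline(make_preprocessor(), model)
    scores[name] = cross_val_score(pipeline, X, y, cv=5, scoring="r2").mean()

for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:>14}: mean R^2 = {score:.3f}")
```

Wrapping the preprocessing and the model in one pipeline keeps the scaler and encoder from seeing the held-out fold, which matters with a training set as small as a few hundred records.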

Link to the notebook on GitHub

Cheers


Filed Under: Algorithms, Data Science, Python Tagged With: Data Science, python, real estate, regression

I am a data science team lead at Darrow and an NLP enthusiast. My interests range from machine learning modeling to solving challenging data-related problems. I believe sharing ideas is how we all become better at what we do. If you’d like to get in touch, feel free to say hello through any of the social platforms.
