Scala Archives - Yaron Vazana

Scala Website Crawler

April 13, 2017 by Yaron

Most machine learning models require large amounts of data, both for training and for testing.

Getting the data, even before dealing with the ML stuff, can be hard and tedious, depending on the source you have and the accessibility of the data itself.

Crawling a website or a blog is a convenient way to get the data you need. With relatively little effort, you can generate a CSV file containing all the data and analyze it more easily with state-of-the-art tools.
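To make the idea concrete, here is a minimal sketch of the "page to CSV" step in Scala, using only the standard library. It is not the crawler from the full post: the object name `CrawlToCsv`, the regex, the sample HTML, and the output file `links.csv` are all assumptions made for illustration. A real crawler would fetch pages over HTTP and would most likely use a proper HTML parser rather than a regex.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Paths}
import scala.util.matching.Regex

object CrawlToCsv {
  // Hypothetical, deliberately simple pattern for <a ... href="...">text</a>.
  // Assumption: double-quoted href attributes only; a real crawler should use an HTML parser.
  private val LinkPattern: Regex =
    """(?is)<a\s[^>]*?href\s*=\s*"([^"]*)"[^>]*>(.*?)</a>""".r

  /** Extracts (url, anchor text) pairs from a page's HTML, dropping nested tags from the text. */
  def extractLinks(html: String): List[(String, String)] =
    LinkPattern.findAllMatchIn(html).map { m =>
      val text = m.group(2).replaceAll("<[^>]+>", "").trim
      (m.group(1), text)
    }.toList

  /** Quotes a field when it contains a comma, a double quote, or a line break. */
  def csvField(value: String): String =
    if (value.exists(c => c == ',' || c == '"' || c == '\n' || c == '\r'))
      "\"" + value.replace("\"", "\"\"") + "\""
    else value

  /** Builds the CSV text: a header row followed by one row per link. */
  def toCsv(rows: List[(String, String)]): String =
    ("url,text" :: rows.map { case (u, t) => s"${csvField(u)},${csvField(t)}" }).mkString("\n")

  def main(args: Array[String]): Unit = {
    // Sample HTML standing in for a downloaded page (assumption for this sketch).
    val html =
      """<p><a href="/posts/1">First, post</a> and <a class="x" href="/posts/2"><b>Second</b></a></p>"""
    val csv = toCsv(extractLinks(html))
    Files.write(Paths.get("links.csv"), csv.getBytes(StandardCharsets.UTF_8))
    println(csv)
  }
}
```

Keeping extraction (`extractLinks`) separate from serialization (`toCsv`) means the same CSV step can be reused if the regex is later swapped for a real parser.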
