Visualizing Vectors using TensorBoard

Most machine learning algorithms require your data to be represented as vectors, usually high-dimensional ones.

Visualizing those vectors before you run them through a machine learning process can give you useful insights: it can tell you whether you're heading toward the right solution, or at least warn you when you aren't.

This Python notebook contains a small script that takes a set of n-dimensional vectors and "projects" them onto a 2D or 3D plane using TensorBoard.

After visualizing your vectors, you can explore and cluster them using PCA or t-SNE.
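To get a feel for what a PCA projection does before opening TensorBoard, here is a minimal sketch using NumPy. It is not the notebook's code and not TensorBoard's implementation; the function name pca_project and the random sample data are assumptions made for illustration.

```python
import numpy as np


def pca_project(vectors, n_components=2):
    """Project n-dimensional vectors onto their top principal components.

    Hypothetical helper for illustration; TensorBoard runs its own PCA
    inside the projector UI.
    """
    X = np.asarray(vectors, dtype=float)
    # PCA works on centered data, so subtract the per-dimension mean.
    X_centered = X - X.mean(axis=0)
    # The rows of vt are the principal directions, ordered by
    # decreasing variance.
    _, _, vt = np.linalg.svd(X_centered, full_matrices=False)
    return X_centered @ vt[:n_components].T


# Made-up data: 100 random vectors of dimension 10.
rng = np.random.default_rng(0)
vectors = rng.normal(size=(100, 10))
points_2d = pca_project(vectors, n_components=2)
```

Each row of points_2d is the 2D position of the matching input vector. Passing n_components=3 gives the 3D equivalent.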

Explaining the Parameters

The TF_visualizer object takes four parameters:

dimension (int): the dimension of the vectors
vecs_file (str): the path to the file that contains all the vectors; each line is a single vector, and the vector components are separated by commas "," (sample-vecs-file)
metadata_file (str): the path to a metadata file holding information associated with the vectors, for example an id or class_label for each vector (sample-metadata-file)
output_path (str): the location where all the TensorBoard output will be created; later you will start TensorBoard with this path as a parameter
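The file layout described above can be sketched with plain Python. This is only an illustration: the helper names and file names are hypothetical, and the metadata layout (one label per line, in the same order as the vectors) is an assumption, so check the sample files for the exact format the notebook expects.

```python
import os
import tempfile


def write_vectors(path, vectors):
    """Write one vector per line, components separated by commas."""
    with open(path, "w") as f:
        for vec in vectors:
            f.write(",".join(str(x) for x in vec) + "\n")


def write_metadata(path, labels):
    """Write one label per line, in vector order (assumed layout)."""
    with open(path, "w") as f:
        for label in labels:
            f.write(f"{label}\n")


def read_vectors(path, dimension):
    """Read vectors back, checking each line has `dimension` values."""
    vectors = []
    with open(path) as f:
        for line_no, line in enumerate(f, start=1):
            parts = line.strip().split(",")
            if len(parts) != dimension:
                raise ValueError(
                    f"line {line_no}: expected {dimension} values, "
                    f"got {len(parts)}"
                )
            vectors.append([float(x) for x in parts])
    return vectors


# Hypothetical file names inside a temporary directory.
out_dir = tempfile.mkdtemp()
vecs_file = os.path.join(out_dir, "vecs.csv")
metadata_file = os.path.join(out_dir, "metadata.tsv")

vectors = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
write_vectors(vecs_file, vectors)
write_metadata(metadata_file, ["cat", "dog"])

# dimension must match the number of components on every line.
loaded = read_vectors(vecs_file, dimension=3)
```

Checking the dimension up front catches malformed lines before the vectors ever reach TensorBoard.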

You can download the sample files to see what vecs_file and metadata_file look like.

The notebook is also available on my GitHub.

Starting TensorBoard

To start the TensorBoard service, enter the command below in your command line, replacing the ${output-path} placeholder with your own output path:

tensorboard --logdir=${output-path}

Cheers
