My Project - Song Recommendation Project

The goal of the project was to build recommendation software that works from a data set and a few user inputs about what the user is interested in.

The first project objective was to store data in a data structure. The project came after several units on advanced data structures, such as linked lists, heaps, hash tables, trees and graphs. Also, the example project took some basic lists and turned them into linked lists (which, let’s be honest, did not seem to help the program at all), so it seemed implied that an advanced data structure was expected. I wound up upgrading a dictionary into a tree. It was interesting to think about which data structures could be useful for the program tasks.
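To give a rough idea of what moving from a dictionary to a tree can look like, here is a minimal sketch. It is not the code from my repo: the `TreeNode` fields, the `dict_to_tree` function and the sample data are all made up for illustration, and it assumes the dictionary maps each genre to a list of songs.

```python
class TreeNode:
    """One node in a simple general tree (any number of children)."""

    def __init__(self, value):
        self.value = value
        self.children = []

    def add_child(self, node):
        self.children.append(node)


def dict_to_tree(root_label, songs_by_genre):
    """Build a three-level tree: root -> genre nodes -> song nodes.

    Assumes songs_by_genre maps a genre name to a list of song names.
    """
    root = TreeNode(root_label)
    for genre, songs in songs_by_genre.items():
        genre_node = TreeNode(genre)
        for song in songs:
            genre_node.add_child(TreeNode(song))
        root.add_child(genre_node)
    return root


# Placeholder data for illustration only, not taken from the csv file.
sample = {
    "genre a": ["song 1", "song 2"],
    "genre b": ["song 3"],
}
tree = dict_to_tree("songs", sample)
genre_names = [child.value for child in tree.children]
```

Once the data is in this shape, picking a genre is just a matter of stepping from the root to one of its children.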

The song data came from a csv file of Rolling Stone’s top 500 songs list. I thought that if the data was roughly evenly distributed over the genres it covered, this would be pretty manageable. Unfortunately for my implementation, this was not the case (believe it or not, Rolling Stone is incredibly biased towards classic rock). Nevertheless, I thought I made it work. If I had it all to do again, I might go back and cut some classic rock favorites to make room for another genre.

The next objective was to use an algorithm to sort or search for data within a data structure. My implementation didn’t really require a search (you don’t need to look for things if you already know where you put them, if that makes sense). But, for presentation purposes, it was useful to have a quicksort, so I went with that.
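For anyone who hasn't written one, a quicksort can be sketched in a few lines. This is a simple version that returns a new list rather than sorting in place, and it is not necessarily how the one in my repo is written. The `key` parameter and the sample titles are assumptions added for the example.

```python
def quicksort(items, key=lambda item: item):
    """Return a new sorted list; the input list is left unchanged.

    Not an in-place quicksort: each call partitions the items into
    three new lists around a pivot taken from the middle of the input.
    """
    if len(items) <= 1:
        return list(items)
    pivot = key(items[len(items) // 2])
    smaller = [x for x in items if key(x) < pivot]
    equal = [x for x in items if key(x) == pivot]
    larger = [x for x in items if key(x) > pivot]
    return quicksort(smaller, key) + equal + quicksort(larger, key)


# Placeholder titles for illustration only.
titles = ["delta", "alpha", "charlie", "bravo"]
sorted_titles = quicksort(titles)
by_length = quicksort(titles, key=len)
```

Building new lists costs some extra memory compared with an in-place partition, but for a few hundred songs that trade-off seemed fine for a sketch.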

The other project goals were the standard portfolio requirements: use Git version control, use the command line and file navigation, and write a technical blog post on the project.

Git repo for the project is here:

Let me know if you have any comments or feedback. Cheers!


So, after going through some other project posts here and re-reading the project description, I realized that user input was supposed to work by typing in a few letters, like a search. So, I went back and re-wrote the code to do that.
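Here is a rough sketch of that kind of input handling, assuming the letters are matched against the start of each option and that case is ignored. The `match_prefix` name and the genre list are made up for the example and are not from my repo.

```python
def match_prefix(prefix, options):
    """Return the options that start with the typed letters.

    Matching ignores case and surrounding whitespace. An empty
    input matches nothing, so the user has to type at least one letter.
    """
    cleaned = prefix.strip().lower()
    if not cleaned:
        return []
    return [option for option in options if option.lower().startswith(cleaned)]


# Illustrative genre names only.
genres = ["Rock", "Reggae", "Soul", "Hip-Hop"]
r_matches = match_prefix("r", genres)
so_matches = match_prefix(" SO ", genres)
no_matches = match_prefix("x", genres)
```

If more than one option comes back, the program can show the list and ask for more letters. If exactly one comes back, it can go straight to the recommendations.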

And then, while I was playing around with the data, I realized it was really hard to search into a genre at random, because the data set I had originally used didn’t have very many genres. So, I decided to switch to a completely different data set, a compilation of EDM music by Reddit user RayzaZaydan, described in this post (if you don’t know, EDM has a subgenre for almost every letter of the alphabet):
https://www.reddit.com/r/EDM/comments/1c9gba8/i_made_a_spreadsheet_compiling_every_edm_song_by/

And then, while I was in there, I thought I could probably just add a couple of helper functions to the TreeNode class, and then I wouldn’t even need an intermediate step with a dictionary. Also, when I was looking at outputs from the new data set, it seemed really unnecessary to split artists and songs into separate steps, so I completely re-wrote that section of the code.
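As a sketch of the idea (not the actual code on GitHub), helpers like the ones below let each row go straight into the tree with no dictionary in between. The method names `get_or_add_child` and `add_path`, and the sample rows, are hypothetical. Each leaf holds artist and song together, so there is no separate artist level.

```python
class TreeNode:
    """Tree node with helpers for inserting rows directly."""

    def __init__(self, value):
        self.value = value
        self.children = []

    def get_or_add_child(self, value):
        """Return the child with this value, creating it if missing."""
        for child in self.children:
            if child.value == value:
                return child
        child = TreeNode(value)
        self.children.append(child)
        return child

    def add_path(self, *values):
        """Walk down from this node, creating nodes as needed."""
        node = self
        for value in values:
            node = node.get_or_add_child(value)
        return node


# Placeholder rows for illustration: (genre, "artist - song").
rows = [
    ("genre a", "artist 1 - song 1"),
    ("genre a", "artist 2 - song 2"),
    ("genre b", "artist 3 - song 3"),
]
root = TreeNode("songs")
for genre, entry in rows:
    root.add_path(genre, entry)
```

The linear scan in `get_or_add_child` is fine for a small number of genres. With a lot of children per node, a dictionary of children inside each node would be a faster lookup.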

And that is the story of why the latest version of the code on GitHub looks nothing like what I described in my original post here.

As always, any feedback (on the latest code) is appreciated. Cheers!
