This project was very tricky for me. I learned a lot from it: tools for preprocessing data, using pipelines, and tuning hyperparameters. Still, my model is not very successful. I divided the project into two parts:
Part 1 - Predict zodiac signs
My notebook for part 1 - predict zodiac signs
Part 2 - Match couples
I created a function to find the top 5 matches. I took one record from the data as a sample and computed the distance between this sample and every other data point. The drawback is that the function cannot find matches if any of the features it uses are null. Here is my repo:
My notebook part 2 - find top 5 matches
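One possible way around the null-value drawback is to compute the distance only over the features that both records actually have, then rescale by the share of features used. The sketch below is not the code from my notebook; it assumes each record is a dict of already-scaled numeric features with `None` for missing values, and the names `partial_distance`, `top_matches`, and the feature names `age`, `height`, `income` are made up for illustration.

```python
import math

# Hypothetical feature names; values are assumed to be scaled to [0, 1].
FEATURES = ["age", "height", "income"]


def partial_distance(a, b, features):
    """Euclidean distance over the features present in both records.

    The squared sum is multiplied by len(features) / n_shared so that a
    pair sharing only a few features is not unfairly favoured.
    Returns None when the two records share no non-null feature.
    """
    shared = [f for f in features
              if a.get(f) is not None and b.get(f) is not None]
    if not shared:
        return None
    squared = sum((a[f] - b[f]) ** 2 for f in shared)
    return math.sqrt(squared * len(features) / len(shared))


def top_matches(sample, records, features, k=5):
    """Return the indices of the k records closest to `sample`."""
    scored = []
    for idx, rec in enumerate(records):
        if rec is sample:          # do not match the sample with itself
            continue
        d = partial_distance(sample, rec, features)
        if d is not None:          # skip records with nothing to compare
            scored.append((d, idx))
    scored.sort()
    return [idx for _, idx in scored[:k]]


# Made-up example data, only to show the behaviour with nulls.
records = [
    {"age": 0.2, "height": 0.5, "income": None},   # the sample
    {"age": 0.2, "height": 0.5, "income": 0.9},
    {"age": 0.9, "height": None, "income": 0.1},
    {"age": None, "height": None, "income": 0.3},  # nothing shared with sample
    {"age": 0.3, "height": 0.4, "income": 0.5},
]

print(top_matches(records[0], records, FEATURES))  # [1, 4, 2]
```

With this approach a record with some nulls can still be matched, and a record is only skipped when it has no feature in common with the sample. Categorical features would need to be encoded first, which this sketch does not cover.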
Any suggestions are welcome!