In the context of this exercise, when do we apply supervised or unsupervised learning?
Supervised learning would apply when the given data has already been labeled and categorized, and we are trying to determine and predict new data points based on this data. Supervised learning usually applies to problems involving predictions based on trends or patterns, which are usually linear.
For instance, supervised learning would apply when trying to predict housing prices based on square footage. We may visualize the data on a 2D graph, where the price increases linearly in relation to the square footage. Following the line of best fit to this graph, we can predict unknown prices.
Unsupervised learning would be applied when the data provided is unlabeled, and we do not know exactly what they are. Unsupervised learning attempts to figure out the inherent structure of the data, by grouping the similar data together, and then categorizing these groups or “clusters”.
For an example, say that we had images of animals, but are not given the labels for what animals are in the images. You might imagine that these are of completely new species. Since we do not know what they are pictures of, we would group together ones that share similar physical characters, then label these groups.