On Codecademy I’ve seen examples of training a machine learning model on data formatted as a pandas DataFrame, and also examples of training on data formatted as a SciPy CSR matrix.
What if I have a pandas DataFrame where some of the features are unstructured data, such as the actual text of a review, while the other features are typical categorical and numerical features? If I vectorize the unstructured data (for example with a scikit-learn text vectorizer) so that it can be processed by a machine learning model, it becomes a SciPy CSR matrix. Can I make my machine learning model look at both the original features present in the dataset and the expanded features of the CSR matrix? Or can I only train two separate models, one on the CSR matrix features and the other on the DataFrame features?
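
To make the question concrete, here is a minimal sketch of the kind of combination I have in mind. The toy data, the column names (`review`, `price`, `category`, `liked`), and the choice of `CountVectorizer` and `LogisticRegression` are all my own assumptions for illustration, not something taken from a course. It stacks the vectorized text next to the other columns with `scipy.sparse.hstack` and trains one model on the result.

```python
import pandas as pd
from scipy.sparse import csr_matrix, hstack
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

# Hypothetical toy data: one free-text column plus ordinary
# numerical and categorical columns, and a binary target.
df = pd.DataFrame({
    "review": ["great product", "terrible quality",
               "great value", "terrible service"],
    "price": [10.0, 25.0, 12.5, 30.0],
    "category": ["toys", "tools", "toys", "tools"],
    "liked": [1, 0, 1, 0],
})

# Vectorize the text column; the result is a SciPy sparse matrix
# with one column per vocabulary word.
vectorizer = CountVectorizer()
text_features = vectorizer.fit_transform(df["review"])

# One-hot encode the categorical column, keep the numerical one,
# and convert the resulting table to a sparse matrix as well.
tabular = pd.get_dummies(df[["price", "category"]], columns=["category"])
tabular_features = csr_matrix(tabular.astype(float).to_numpy())

# Stack the two blocks side by side: rows stay aligned with the
# DataFrame rows, columns are text features followed by tabular ones.
X = hstack([text_features, tabular_features], format="csr")
y = df["liked"]

# A single model trained on both kinds of features at once.
model = LogisticRegression(max_iter=1000)
model.fit(X, y)
predictions = model.predict(X)
```

As far as I can tell, scikit-learn's `ColumnTransformer` can do the same per-column preprocessing and stacking inside a pipeline, but I am not sure which approach is preferred.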