I am working my way through the Startup Transformation project and I got to step 11. In the hint they just manually combine the small categories into the ‘other’ category, but in the previous article, Discretizing Numerical Data and Collapsing Categories, they manage to do it using a mask variable. I’ve tried doing a similar thing but can’t seem to get it to work. Has anyone been able to group the smaller categories based on a condition?
Do you have any code for that question? What did you write and if you get an error message, what is it exactly?
Here is a comparison between the two.
I think it has something to do with combining the rows in a data frame, but I’m not sure if there is a way round it.
I think you might need to change this line, specifically the
mask = expense_overview.isin(proportion[proportion < 0.05].index)
Because, if we add up the other proportions…what’s left?
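To see what that line evaluates to, it can help to split it into its pieces. This is only a sketch with made-up numbers: it assumes expense_overview is a DataFrame with Expense and Proportion columns and a default integer index, and that proportion is its Proportion column. The names is_small and small_labels are hypothetical helpers, not part of the project.

```python
import pandas as pd

# Hypothetical data; the real project file will have different rows and values.
expense_overview = pd.DataFrame({
    "Expense": ["Salaries", "Advertising", "Office Rent", "Equipment", "Utilities"],
    "Proportion": [0.62, 0.15, 0.15, 0.04, 0.04],
})
proportion = expense_overview["Proportion"]

# Piece 1: a boolean Series, True where the proportion is below the threshold.
is_small = proportion < 0.05

# Piece 2: the index labels of those rows. With a default index these are
# row numbers, not expense names.
small_labels = proportion[is_small].index
print(list(small_labels))  # [3, 4]

# Piece 3: DataFrame.isin tests every cell against those labels, so it asks
# "is this cell equal to 3 or 4?" rather than "is this a small expense?".
mask = expense_overview.isin(small_labels)
print(mask)
```

Under these assumptions no cell matches, so the mask is False everywhere. In the article the proportions presumably come from value_counts(normalize=True), whose index holds the category names, which is why .index works there. Here the boolean Series from piece 1 already does the job of a row mask.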
I must have a misconception about what the code is doing.
Does that line of code take all the Expenses with a proportion < 0.05?
I’ve changed the code a bit and ended up with the same data frame, but the rows have only been renamed ‘other’ rather than combined into one row.
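Renaming on its own leaves one row per original expense, so a second step is needed to add the matching rows together. One possible way, sketched here with made-up numbers and assumed column names (Expense and Proportion), is to rename first and then group by the label and sum. This is not necessarily the approach the project intends.

```python
import pandas as pd

# Hypothetical data; column names and values are assumptions for illustration.
expenses = pd.DataFrame({
    "Expense": ["Salaries", "Advertising", "Office Rent", "Equipment", "Utilities"],
    "Proportion": [0.62, 0.15, 0.15, 0.04, 0.04],
})

threshold = 0.05
small = expenses["Proportion"] < threshold  # boolean row mask

# Step 1: relabel the small categories. The rows still exist separately.
collapsed = expenses.copy()
collapsed.loc[small, "Expense"] = "Other"

# Step 2: merge rows that now share a label by summing their proportions.
# sort=False keeps the categories in order of first appearance.
collapsed = collapsed.groupby("Expense", sort=False, as_index=False)["Proportion"].sum()

print(collapsed)
```

With this sample data the result has four rows (Salaries, Advertising, Office Rent, Other), and the proportions still add up to 1.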