This community-built FAQ covers the “Decision Tree Limitations” exercise from the lesson “Decision Trees”.
Paths and Courses
This exercise can be found in the following Codecademy content:
Data Science
Machine Learning
FAQs on the exercise Decision Tree Limitations
There are currently no frequently asked questions associated with this exercise – that’s where you come in! You can contribute to this section by offering your own questions, answers, or clarifications on this exercise. Ask or answer a question by clicking reply below.
If you’ve had an “aha” moment about the concepts, formatting, syntax, or anything else with this exercise, consider sharing those insights! Teaching others and answering their questions is one of the best ways to learn and stay sharp.
Join the Discussion. Help a fellow learner on their journey.
Ask or answer a question about this exercise by clicking reply below!
Agree with a comment or answer? Like the post to up-vote the contribution!
Need broader help or resources? Head here.
Looking for motivation to keep learning? Join our wider discussions.
Learn more about how to use this guide.
Found a bug? Report it!
Have a question about your account or billing? Reach out to our customer support team!
None of the above? Find out where to ask other questions here!
I’ve read that there are many types of Decision Trees (CHAID, CART, ID3). Which type of tree did we create in this lesson?
What do we actually do when pruning the tree?
I think we prune the tree (or limit max_depth) so that we don’t overfit the data: by allowing some outliers to remain in the leaves, the leaves stay more general.
According to this documentation, it seems that scikit-learn uses an optimised version of the CART algorithm.
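If it helps to see this concretely, here’s a minimal sketch of how capping max_depth acts as pre-pruning on scikit-learn’s CART-based DecisionTreeClassifier. The synthetic dataset and the max_depth value are just illustrative, not taken from the lesson:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# An unconstrained tree keeps splitting until its leaves are pure,
# which tends to memorize outliers in the training data.
deep_tree = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

# Capping max_depth stops the tree early, leaving more general leaves.
pruned_tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_train, y_train)

print("unpruned:   train", deep_tree.score(X_train, y_train), "test", deep_tree.score(X_test, y_test))
print("max_depth=3: train", pruned_tree.score(X_train, y_train), "test", pruned_tree.score(X_test, y_test))
```

The unpruned tree typically scores near-perfectly on the training set but worse on the test set, which is the overfitting that pruning (or limiting depth) is meant to avoid.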
In the exercise we limit the max depth to 11. However, doesn’t the maximum depth of the tree equal the number of features? The tree in the example only has a few features.
Having a node split on the same feature as one of its ancestor nodes doesn’t make sense to me.
Or do I have a wrong understanding of “depth”?
Why do we use a random_state value when creating DecisionTreeClassifier? Doesn’t the model choose the feature that gives the highest gain? That isn’t a random process.
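Not an official answer, but from the scikit-learn docs: even with the default “best” splitter, the features are randomly permuted at each split, so when two candidate splits improve the criterion by exactly the same amount, the one chosen can differ between runs. Fixing random_state just makes the fitted tree reproducible. A tiny sketch (made-up data) of what fixing it buys you:

```python
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = make_classification(n_samples=200, n_features=5, random_state=0)

# Same data, same random_state -> the tie-breaking permutation is the same,
# so both fits produce structurally identical trees.
tree_a = DecisionTreeClassifier(random_state=1).fit(X, y)
tree_b = DecisionTreeClassifier(random_state=1).fit(X, y)

print(export_text(tree_a) == export_text(tree_b))  # True
```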
I have the same question as you and I think I have found the right answer:
The theoretical maximum depth a decision tree can achieve is one less than the number of training samples, but no algorithm will let you reach this point for obvious reasons, one big reason being overfitting. Note here that it is the number of training samples and not the number of features because the data can be split on the same feature multiple times.
Reference: “How to tune a Decision Tree? Hyperparameter tuning” by Mukesh Mithrakumar, Towards Data Science
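A quick way to see that depth is bounded by the number of samples rather than the number of features is to fit an unconstrained tree on data with a single feature and check its depth. The data below is synthetic and only for illustration:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = rng.random((200, 1))                      # one feature, 200 samples
y = (np.sin(20 * X[:, 0]) > 0).astype(int)    # labels that need many thresholds

tree = DecisionTreeClassifier().fit(X, y)     # no max_depth cap

print("number of features:", X.shape[1])      # 1
print("tree depth:", tree.get_depth())        # greater than the feature count
```

Because the single feature gets split at many different thresholds along one path, the tree ends up deeper than the number of features, which is exactly the point made above.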