FAQ: Logistic Regression II - Class Imbalance

This community-built FAQ covers the “Class Imbalance” exercise from the lesson “Logistic Regression II”.

Paths and Courses
This exercise can be found in the following Codecademy content:

Data Scientist: Machine Learning Specialist

Machine Learning: Logistic Regression

FAQs on the exercise Class Imbalance

There are currently no frequently asked questions associated with this exercise – that’s where you come in! You can contribute to this section by offering your own questions, answers, or clarifications on this exercise. Ask or answer a question by clicking reply below.

If you’ve had an “aha” moment about the concepts, formatting, syntax, or anything else with this exercise, consider sharing those insights! Teaching others and answering their questions is one of the best ways to learn and stay sharp.

Rather than just complain on a volunteer forum, you can make course suggestions here:

or, here:
https://codecademyready.typeform.com/to/KLM07uCE?typeform-source=discuss.codecademy.com

Be as descriptive as possible when reporting/suggesting.

Apologies, I’ve deleted my original comment; I wasn’t aware this was a volunteer-run forum. But after a long day, coming onto Codecademy and trying to navigate this lesson was incredibly frustrating and made me want to vent and quit the whole thing. Thank you for providing the feedback links, which I have gladly used to provide a detailed response.

I just wanted to let any other students who have tried this lesson and visited the forums know that they are not alone in their frustrations, and that they’re not just stupid, which is exactly how I felt after attempting this lesson.

Yep, I get your frustrations but 4 posts saying essentially the same thing–while probably justified–is a tad bit excessive.

Hopefully CC will take your suggestion(s) into consideration when/if they revise the lesson.

I get the message that my value of accuracy_str is wrong, but I don’t have a clue as to why. Maybe someone can help me out here. I checked against the solution, and the only thing I did differently from CC is that I created a new instance of the LogisticRegression() class.
My code is as follows:

# 3. Model predictions after stratified sampling

# Fit a fresh, unregularized model on the stratified training split
log_reg_str = LogisticRegression(penalty='none', max_iter=1000, fit_intercept=True, tol=0.000001)
log_reg_str.fit(x_train_str, y_train_str)
y_pred_str = log_reg_str.predict(x_test_str)

# Score the predictions against the stratified test split
recall_str = recall_score(y_test_str, y_pred_str)
accuracy_str = accuracy_score(y_test_str, y_pred_str)
print('Stratified Sampling: Recall and Accuracy scores')
print(recall_str, accuracy_str)

The rest is as in the solution. Also, my stratified recall score is lower than the unstratified one. According to the hint that should not happen, and it would make the whole idea of stratification rather useless in this scenario.

Thank you in advance
and happy coding
mike

:smile: I appreciate your posts as I’m going through this. They make me feel more motivated. This has been a challenging section for me to grasp. Glad to see I’m not alone.

That being said.

About undersampling and oversampling: wouldn’t using the same sample more than once have a negative impact? I feel like I just read a lesson saying you wouldn’t want data from the same patient twice.
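
One way to look at the repeated-sample worry: duplicates mostly cause trouble when a copy of the same row ends up in both the training and the test set. A minimal sketch of the usual precaution, assuming scikit-learn is available, is to split first and oversample only the training portion. The toy data, the row_ids array, and all variable names below are hypothetical and not taken from the exercise.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.utils import resample

# Hypothetical toy data: 100 rows, 10 of them in the positive (minority) class.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = np.array([1] * 10 + [0] * 90)
row_ids = np.arange(100)  # stand-in for something like a patient ID

# Split first, so the test set is never touched by resampling.
X_train, X_test, y_train, y_test, ids_train, ids_test = train_test_split(
    X, y, row_ids, test_size=0.2, stratify=y, random_state=0
)

# Oversample the minority class in the training set only (with replacement)
# until it matches the size of the majority class.
minority = y_train == 1
n_majority = int((~minority).sum())
X_min_up, ids_min_up = resample(
    X_train[minority], ids_train[minority],
    replace=True, n_samples=n_majority, random_state=0
)

# Rebuild a balanced training set: original majority rows + resampled minority rows.
X_train_bal = np.vstack([X_train[~minority], X_min_up])
y_train_bal = np.concatenate(
    [np.zeros(n_majority, dtype=int), np.ones(n_majority, dtype=int)]
)
ids_train_bal = np.concatenate([ids_train[~minority], ids_min_up])

# Class counts after balancing, and any row that appears in both train and test.
train_class_counts = np.bincount(y_train_bal)
leaked_rows = set(ids_train_bal.tolist()) & set(ids_test.tolist())
print(train_class_counts, leaked_rows)
```

The repeated minority rows still carry no new information, so the model can overfit to them; the split-then-resample order only keeps the test scores from being inflated by copies of training rows.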

Same question here. I was expecting stratification to improve the recall score but instead it decreased.

I would appreciate it if someone could shed some light on this :slight_smile:
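
A possible reason for the surprise, offered as a sketch rather than a diagnosis of the exercise: stratifying a split only keeps the class proportions the same in the training and test sets. It does not rebalance the classes, so by itself it gives no guarantee of a higher recall. The example below assumes scikit-learn and uses hypothetical toy labels, not the lesson's data.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical imbalanced labels: 10% positives.
y = np.array([1] * 10 + [0] * 90)
X = np.arange(100).reshape(-1, 1)

# Plain random split: the positive rate in each part depends on the shuffle.
_, _, y_train_plain, y_test_plain = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Stratified split: each part keeps (as closely as possible) the overall rate.
_, _, y_train_str, y_test_str = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

overall_rate = y.mean()
train_rate_str = y_train_str.mean()
test_rate_str = y_test_str.mean()

print("overall positive rate:   ", overall_rate)
print("plain split (train/test):", y_train_plain.mean(), y_test_plain.mean())
print("stratified (train/test): ", train_rate_str, test_rate_str)
```

Because the two splits put different rows in the test set, recall measured on them can move in either direction. If the goal is to raise recall on the minority class, oversampling, undersampling, or the class_weight='balanced' option of LogisticRegression target that more directly than stratification does.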

I completed the lesson, but I am having difficulty understanding what is really going on. I am glad to know that I am not the only one. I guess that with more reading and practice I will get better at this, but I would like some external resources to shed more light on it. Thank you