First off, amazing work! I'm new to this course, but your work has really inspired me. You have a great understanding of the code. I don't see anything functionally wrong with it. My only suggestions concern small typos and light recommendations for improved readability, clarity and output consistency. Sorry if it seems overwhelming, I just like to clarify and give examples. Take it all with a grain of salt!
Having worked in the insurance field, I find the term "subject" a bit general. Even the term "insuree" (the people being covered by insurance) includes children and spouses. The term I would recommend for this study is "policyholder", the sole person who pays for the insurance, makes changes, and has dependents on the policy (children/spouses).
Cell 89 - Small typo in the #comments explaining the code. You use "minimal" to describe the code for both the minimum and the maximum number of children. Also, the output of this cell would be consistent with other cells if it included a string of text describing it.
Ex…
print("The minimum number of kids a policyholder reported having is " + …)
print("The maximum number of kids a policyholder reported having is " + …)
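Here is a minimal runnable sketch of that labelled output, assuming the children counts live in a plain list. The name `num_children` and its values are made up for illustration, so swap in whatever variable the notebook actually uses.

```python
# Hypothetical sample data; the notebook's real variable name and values will differ.
num_children = [0, 1, 3, 0, 2, 5, 1]

# Build each sentence first so the label and the value always travel together.
min_line = "The minimum number of kids a policyholder reported having is " + str(min(num_children))
max_line = "The maximum number of kids a policyholder reported having is " + str(max(num_children))

print(min_line)
print(max_line)
```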
Cell 118 - To make the paragraph more readable, perhaps use commas in the dollar values and a space between the values and the "USD".
"smokers is 32050USD, nonsmokers is 8434USD."
Change to…
"smokers is 32,050 USD, nonsmokers is 8,434 USD."
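If those figures come from variables rather than being typed by hand, Python's `,` format specifier can add the thousands separators for you. This is only a sketch: `smoker_avg` and `nonsmoker_avg` are placeholder names I made up, holding the two values quoted above.

```python
# Placeholder names; the values are the two figures quoted from cell 118.
smoker_avg = 32050
nonsmoker_avg = 8434

# The "," format specifier inserts thousands separators into the number.
sentence = f"smokers is {smoker_avg:,} USD, nonsmokers is {nonsmoker_avg:,} USD."
print(sentence)
```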
Cell 126 - In the sentence describing the data you write "From our 1338 subjects we 274 subjects are smokers, which is 20.5% of the population." I believe the word "we" was added in error.
To make the output consistent with the other cells, perhaps move the sentence that follows the output data into a string inside the print function, and round to one decimal place in your round function if you want to show 20.5% instead of 20.48%. I would also recommend changing "subjects" to "policyholders" here.
Ex.
print("From our 1,338 policyholders, " + str(smoker_counter) + " are smokers, which is " + str(round(smoker_counter / len(smoker_status) * 100, 1)) + "% of the population.")
Output- "From our 1,338 policyholders, 274 are smokers, which is 20.5% of the population."
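An f-string version might read a little more easily, and it avoids hard-coding the 1,338 total. This is a sketch under assumptions: I am guessing that `smoker_status` is a list of "yes"/"no" strings, and the stand-in data below is built only to match the counts quoted above.

```python
# Stand-in data matching the quoted counts: 274 smokers out of 1,338 records.
# The "yes"/"no" encoding is an assumption about how the notebook stores it.
smoker_status = ["yes"] * 274 + ["no"] * 1064
smoker_counter = smoker_status.count("yes")

total = len(smoker_status)
share = smoker_counter / total * 100

# ":," adds the thousands separator, ":.1f" rounds to one decimal place.
message = (
    f"From our {total:,} policyholders, {smoker_counter} are smokers, "
    f"which is {share:.1f}% of the population."
)
print(message)
```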
Cell 123 - "bmi_upto_25" may leave some readers thinking a BMI value of 25 is included in that group, and "bmi_from_25" is not as intuitive as it could be. I think more intuitive names would be "bmi_below_25" and "bmi_25_andabove", for example. Printing the output data as a string would make this cell consistent with the others.
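To show how the renamed groups line up with their boundaries, here is a small sketch. It assumes the two lists simply hold BMI values, which may not match what the notebook stores in them, and `bmis` with its sample values is made up for illustration.

```python
# Hypothetical BMI values; note the 25.0 sitting exactly on the boundary.
bmis = [18.5, 24.9, 25.0, 31.2, 27.8]

# Strictly below 25 goes in the first group, 25 and above in the second.
bmi_below_25 = [b for b in bmis if b < 25]
bmi_25_andabove = [b for b in bmis if b >= 25]

print("Policyholders with a BMI below 25: " + str(len(bmi_below_25)))
print("Policyholders with a BMI of 25 and above: " + str(len(bmi_25_andabove)))
```

With names like these, a reader can tell where a value of exactly 25 ends up without opening the comparison.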
Cell 125 - In the very last sentence you write "the second group (BMI over 25)". This group includes the BMI value of 25, so it would be better to write "the second group (BMI of 25 and above)", while the first group description "(BMI below 25)" already matches my earlier suggestion (Cell 123). Also, the values in the paragraph describing the data, "13.940USD and 10.280USD", use decimal points instead of commas, which may confuse some people. I would also suggest adding a space between the dollar amount and the "USD" to make it more readable, and clarifying that these are averages would send a clearer message.
Ex. "We can see that the second group (BMI of 25 and above) is paying more for their medical insurance than policyholders from the first group (BMI below 25), averaging around 13,940 USD and 10,280 USD, respectively."