Hi lisalisaj, thank you for your thoughts; I think these are all good points.
I will try to explain the reasoning behind each choice and mistake.
When you did your two-tailed t-test for GDP & LEABY, what made you only select data for the year 2015?
For the t-test I started by using the entire time series, before realizing that this is exactly what those data points are: a time series.
By using entire national trends in the test, we are not looking at several samples of the same "population", but at sequential snapshots describing how each country evolved over time, so the observations are not independent of one another.
Sixteen years are enough for laws to take effect and to change, at least partially, some of a nation's social systems.
My idea was to compare the two groups (high/low GDP) by looking at the same snapshot: the same test could give a different result for another year, so why mix them…
Do you think that this approach is reasonable?
Maybe I can add some text to explain this choice better.
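To make the single-snapshot idea concrete, here is a minimal sketch of what I mean. The records, the country labels A to F and the helper names `snapshot` and `welch_t` are all made up for illustration and are not the notebook's data or code. It keeps only the 2015 values, so each country contributes exactly one observation per group, and then computes Welch's t statistic by hand. In the notebook itself, `scipy.stats.ttest_ind` would be the more usual way to get the statistic together with the p-value.

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical records: (country, year, GDP group, LEABY).
# The numbers are placeholders for illustration, not values from the dataset.
records = [
    ("A", 2014, "high", 79.0), ("A", 2015, "high", 80.0),
    ("B", 2014, "high", 80.5), ("B", 2015, "high", 81.0),
    ("C", 2014, "high", 78.5), ("C", 2015, "high", 79.0),
    ("D", 2014, "low", 75.0), ("D", 2015, "low", 76.0),
    ("E", 2014, "low", 74.0), ("E", 2015, "low", 75.0),
    ("F", 2014, "low", 59.0), ("F", 2015, "low", 61.0),
]


def snapshot(rows, year, group):
    """Return the LEABY values of one group for a single year."""
    return [leaby for _, y, g, leaby in rows if y == year and g == group]


def welch_t(a, b):
    """Welch's t statistic and its approximate degrees of freedom."""
    va = variance(a) / len(a)  # squared standard error of group a
    vb = variance(b) / len(b)  # squared standard error of group b
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    dof = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, dof


# One value per country: no country is counted twice within a group.
high_2015 = snapshot(records, 2015, "high")
low_2015 = snapshot(records, 2015, "low")

t_stat, dof = welch_t(high_2015, low_2015)
print(f"t = {t_stat:.3f}, degrees of freedom = {dof:.2f}")
```

Repeating the same call with another year is then a separate test on a separate snapshot, rather than one test on mixed years.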
In a facet grid plot of all countries, didn't Zimbabwe have the greatest change (positive correlation) in life-expectancy?
Yes, I think Zimbabwe's curve is different from all the others because of its history. The initial downward trend is unique in this dataset, as is the drastic improvement starting from 2004-2005.
Regarding linearity, it is something I forgot to mention explicitly, even though it is evident from the pair plots: it may be worth adding.
One should avoid using archaic phrases like "third world country"
I totally agree, using "third world countries" is a mistake.
My initial draft of the notebook used this term: I had already noticed that it might suggest the idea of a hierarchy, and that's the reason why I then opted for low_GDP and high_GDP as variable names.
They seem more objective, as high/low GDP thresholds can be defined just by looking at the numbers, without even knowing the name of a country.
Unfortunately, I missed these last statements: thank you for spotting them!
I will replace "third world countries" with something referring to a strong/weak economy, since this is the concept underlying the high/low GDP naming rule.
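As a small illustration of how the grouping can be done from the numbers alone, here is a sketch that splits countries at the median GDP. The GDP figures and the country labels A to F are hypothetical placeholders, and the median is just one possible threshold, not necessarily the rule used in the notebook.

```python
from statistics import median

# Hypothetical GDP figures for a single year (trillions of USD).
# Placeholders for illustration only, not values from the dataset.
gdp_by_country = {"A": 18.0, "B": 11.0, "C": 3.4, "D": 1.2, "E": 0.24, "F": 0.016}

# Assumed rule: the median GDP is the threshold between the two groups.
threshold = median(gdp_by_country.values())

high_GDP = sorted(c for c, gdp in gdp_by_country.items() if gdp > threshold)
low_GDP = sorted(c for c, gdp in gdp_by_country.items() if gdp <= threshold)

print("threshold:", threshold)
print("high_GDP:", high_GDP)
print("low_GDP:", low_GDP)
```

The point is only that the labels follow from a numeric cut-off, without any reference to which country is which.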
I think one needs to be aware that when saying things like, 'Chile probably records a better life … an improved infrastructure', it can be interpreted as an assumption
The idea here was to attempt a guess and suggest a connection after just a quick investigation: I was surprised to discover that Chile has a better life expectancy than the USA.
I dug a bit and found that Chilean healthcare is one of the best in South America, but I was impressed all the same.
I think you are right about objectivity: I can leave it open as a possible further development (suggesting cultural differences, eating habits and lifestyle as aspects to be investigated), rather than presenting those points as probable root causes.