Hi Chang,

Glad you found a description of what has been done in the README file!

Regarding the autocorrelation plot, the lag on the x-axis refers to the number of observations, not to the years! I hope this gives you more insight into the plot.

OK… I’m pretty sure an autocorrelation plot is about checking randomness in a time series by computing autocorrelations for data values at varying time lags. So it probably doesn’t make sense if you are not using time lags.

Hi, I think the plots are nice and neat. One suggestion I have is to add transparency (alpha) to the plots that are drawn on top of each other, for example life expectancy vs count. I think that will let viewers see the whole distribution. Try it and see if it makes much difference.
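A minimal sketch of the transparency suggestion, assuming overlapping matplotlib histograms. The two groups and their values are randomly generated for illustration only (they are not the project's data), and the variable names are hypothetical; the `alpha` argument is the only part that matters.

```python
import random

import matplotlib.pyplot as plt

# Made-up life expectancy values for two hypothetical groups (illustration only).
random.seed(0)
group_a = [random.gauss(72, 4) for _ in range(200)]
group_b = [random.gauss(78, 3) for _ in range(200)]

fig, ax = plt.subplots(figsize=(8, 5))

# alpha < 1 makes each histogram semi-transparent, so bars drawn
# behind another group stay visible where the two overlap.
ax.hist(group_a, bins=20, alpha=0.5, label="Group A")
ax.hist(group_b, bins=20, alpha=0.5, label="Group B")

ax.set_xlabel("Life expectancy at birth (in years)")
ax.set_ylabel("Count")
ax.legend()
```

In a notebook the figure renders at the end of the cell; in a script, call `plt.show()`. Transparency only helps when the bars are drawn over one another. If the bars are stacked, nothing is hidden to begin with.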

Dear Chang,

I agree with you that a time series autocorrelation is used to measure the current values of a variable against the historical data of that variable. However, the lag on the x-axis is related to the number of observations in the dataset. In my plot, the default lag=1 was used.

In my view, the link below could give you some insight into autocorrelation in time series.

https://www.influxdata.com/blog/autocorrelation-in-time-series-data/

I hope this helps.

Thanks for the feedback.

The graphs you mention are stacked on top of each other, so you can already see the whole distribution. If I make them transparent, it won’t change much, because they are stacked, not one behind another. I hope this makes sense. The graph before it also shows this, which is why I used these two.

Hi,

I don’t think you’ve understood what I meant. Your plot is not really an autocorrelation plot; it’s actually a partial autocorrelation plot. A partial autocorrelation plot shows the correlation between the current value and the past value. So if 2015 is current (lag0), then lag1 = 2014, lag2 = 2013, and so on. Since our time series only has data from 2000 to 2015, the maximum lag you can have is lag15 = 2000. In your plot, the lags go up to more than 80. An autocorrelation plot, on the other hand, is a scatter plot between current values and lagged values.

I understand that you combined all the values for every country into a single series and ended up with n = 96. But that is not a time series anymore, it’s just a collection of data. You have x = ob 1, ob 2, ob 3, … and y = val 1, val 2, val 3, …, but those observations don’t represent time. In a time series, x has to be time, in our case the year. So it makes more sense to have six partial autocorrelation plots, one per country, rather than combining them all.
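To make the per-country idea concrete, here is a rough sketch that computes one correlation curve per country, so that each lag is a shift in years. It uses the plain sample autocorrelation (not the partial autocorrelation), written out by hand to keep it dependency-free. The country names and values are made up for illustration, and `autocorrelation` is a hypothetical helper, not a library function.

```python
def autocorrelation(values, lag):
    """Sample autocorrelation of a sequence at a given lag (0 <= lag < len(values))."""
    n = len(values)
    if not 0 <= lag < n:
        raise ValueError("lag must be between 0 and len(values) - 1")
    mean = sum(values) / n
    deviations = [v - mean for v in values]
    variance_sum = sum(d * d for d in deviations)
    lagged_sum = sum(deviations[t] * deviations[t + lag] for t in range(n - lag))
    return lagged_sum / variance_sum


# Made-up yearly values for two hypothetical countries, 2000 to 2015 (16 points each).
years = list(range(2000, 2016))
series_by_country = {
    "Country A": [70.0 + 0.3 * i for i in range(len(years))],
    "Country B": [50.0 + (2.0 if i % 2 else -2.0) for i in range(len(years))],
}

max_lag = len(years) - 1  # a 16-point series cannot be shifted by more than 15 steps

# One autocorrelation curve per country; each lag is a shift in years.
acf_by_country = {
    country: [autocorrelation(values, lag) for lag in range(max_lag + 1)]
    for country, values in series_by_country.items()
}

for country, acf in acf_by_country.items():
    print(country, [round(r, 2) for r in acf[:4]])
```

Keeping each country's series separate is what caps the lag at 15. Concatenating six countries into one 96-point series would allow lags well beyond that, but those shifts would no longer correspond to years.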

This tutorial is helpful for understanding what I mean:

Thanks for the further explanation and the link showing what you mean. It is quite interesting. Most importantly, I agree with you that it would be more valuable to show an autocorrelation plot for each country.

Hi everyone, I’ve just finished my first portfolio project and I’d like to share it with you. This portfolio project was entertaining and I was able to apply a lot of the knowledge acquired during the data analytics path. I am open to any constructive criticism!

Hi all, kindly find my version of this project at this link:

https://github.com/pilgrim-65/Codecademy_projects/blob/main/life_expectancy_gdp_JML.ipynb

Hi, nice work overall, well done!

Just a small correction: I’ve noticed that you used R2 (coefficient of determination) to show the correlation. Technically, you want to use R instead of R2 to show the correlation between the data. R2 ranges from 0 to 1 and measures how well a regression model fits the data (0 = no fit, 1 = perfect fit). R, the Pearson correlation coefficient, ranges from -1 to 1, with -1 = perfect negative correlation, 1 = perfect positive correlation, and 0 = no linear correlation. For a simple linear fit, R2 is R * R, so it loses the sign of R.
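A small sketch of the difference, assuming a simple linear relationship between two variables. The data is made up so that y falls as x rises, and `pearson_r` is a hypothetical helper written out by hand rather than a library call.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Made-up data where y decreases as x increases.
x = [1, 2, 3, 4]
y = [8, 6, 4, 2]

r = pearson_r(x, y)
r_squared = r ** 2  # for a simple linear fit, R2 equals R squared

print("R  =", r)          # the sign shows the direction of the relationship
print("R2 =", r_squared)  # always non-negative, so the direction is lost
```

Reporting R keeps the direction of the relationship. R2 alone cannot distinguish a strong positive relationship from a strong negative one.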

Thank you for your feedback. I intended to calculate R2 as the **absolute goodness** of a linear fit between the two variables considered: GDP and life expectancy. It turned out to be pretty close to 1, showing the strong correlation between the two series of data.

Hello everyone,

This was a nice little project that I did during the holidays (around 12 hours). I would appreciate it if someone would check my solution and give me some feedback.

Take care folks!

Hello,

Please see below for my completed project on Life Expectancy and GDP. This was a good experience in navigating through many different plots. The project took me 1.5 days to complete.

Great Project! This helped me learn about tidying up graphs (I used a log scale for several plots) and improving readability.

Would love some feedback.

Awesome stuff. In[29] has an incorrect x-label:

#Visualize GDP

plt.figure(figsize = (10, 8))

sns.barplot(x = 'Country', y = 'GDP', data = data_mean)

**plt.xlabel('Life expectancy at birth (in years)')**

#this should be "Country" not Life expectancy

plt.title('Average GDP of Six Countries', fontsize = 16)

plt.show()

plt.clf()

Hi again Shiraen, another error I see is in In[31]:

```
#Create separate line plot for each countries
avg_life_expectancy = sns.FacetGrid(data, col="Country", col_wrap=3,
hue = "Country", sharey = False)
avg_life_expectancy = (avg_life_expectancy.map(sns.lineplot,"Year","GDP").add_legend().set_axis_labels("Year","GDP in Trillions (USD)"))
avg_life_expectancy.fig.suptitle('GDP from Year 2000-2015', fontsize = 16)
avg_life_expectancy.fig.subplots_adjust(top = 0.9)
plt.show()
plt.clf()
```

Do you see the error? You’re talking about avg_life_expectancy but you’re using GDP! You used this one correctly for GDP, but you need to change the GDP input to the life expectancy input! Otherwise you have two identical data visualizations.

Hi wilddavidwild,

I indeed totally overlooked them. Thank you for the feedback and corrections! It is very much appreciated.

Cheers,

Shiraen

Hey everyone, I found this one challenging and found myself using the solution a lot for inspiration, but overall it has taught me how to take such a project step by step and work through it systematically.

Hi team, here is my project, feel free to comment and to criticize.

https://github.com/paasxx/Life-Expectancy-GDP/blob/main/life_expectancy_gdp.ipynb

Hey there, this was a really fun project. I still struggle with understanding how to do this off the Codecademy platform and with properly setting up my GitHub repositories, but here it is!