How does the list comprehension in the example code work?

Question

In the context of this exercise, how does this list comprehension in the example work?

[t*element + w*n for element in range(d)]

Answer

The list comprehension provided in the example code returns a list of x values for the bar locations in the graph.

There are 4 variables which will let us do this:
n is the number of the dataset currently being plotted (1 for the first, 2 for the second).
t determines the total number of datasets to graph side by side.
d tells us how many bars there are per dataset.
w tells us the width of each individual bar.

If we take the provided values for the first dataset, China Data, we get
[2*element + 0.8*1 for element in range(7)]

This essentially means, for each element in range(7), construct a list where each element is
2*element + 0.8

This would give us this list of values,
[0.8, 2.8, 4.8, 6.8, 8.8, 10.8, 12.8]

If we change n to 2 for the second dataset, US Data, it gives us the list
[1.6, 3.6, 5.6, 7.6, 9.6, 11.6, 13.6]

These x values will position each pair of bars for each set of data next to each other in a clear way.
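As a rough sketch of the arithmetic above, the comprehension can be wrapped in a small helper and evaluated for both datasets. The name bar_positions is made up for this illustration and is not part of the lesson; the results are rounded because 0.8 cannot be stored exactly as a float.

```python
def bar_positions(n, t, d, w):
    """Return the x position of each bar for dataset number n.

    n: dataset number, t: number of datasets,
    d: bars per dataset, w: width of each bar.
    bar_positions is a hypothetical helper name, not part of the lesson.
    """
    return [t * element + w * n for element in range(d)]


# Values from the answer above: 2 datasets, 7 bars each, width 0.8.
china_x = [round(x, 2) for x in bar_positions(n=1, t=2, d=7, w=0.8)]
us_x = [round(x, 2) for x in bar_positions(n=2, t=2, d=7, w=0.8)]

print(china_x)
print(us_x)
```

Each US value sits exactly one bar width (0.8) to the right of the matching China value, which is why the pairs end up side by side.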


Hello, I've been struggling to understand the logic behind this concept (hope I'm not the only one here), as I like to understand how the code works rather than just remember or copy-paste the formula.

I tried to simplify the code hoping it could help me understand how it works and I think I've got it now. I will post it here hoping someone can give me feedback to check if I'm doing something wrong (and, if I'm right, to help those that struggle to understand like I did):

store1_x = [X*2 for X in range(6)]

plt.bar(store1_x, sales1)

# Here's the basic formula for the first set of bars (blue ones). 
# As I understand, you want a list from 0 to 5 ( range(6) ) because you need to plot 6 blue bars. 
# X is every element of that list, and it's being multiplied by 2 to separate the blue bars between each other to make room for the orange bars.


store2_x = [X*2 + 1 for X in range(6)]

plt.bar(store2_x, sales2)

# This is the code for the orange bars, it's the same as above, but adding 1. 
# This places the orange bar one space to the right of the preceding blue bar. 
# In the exercise this was originally 0.8, which places the blue and orange bars next to each other.
# A width of 0.8 is actually better to understand the data, but I changed it as it helped me to visualize how the code works.

So basically you're creating two lists of positions:

[0, 2, 4, 6, 8, 10]
for the blue bars

and

[0+1, 2+1, 4+1, 6+1, 8+1, 10+1]
or
[1, 3, 5, 7, 9, 11]
for the orange bars

NOTE: I know this code positions the ticks on the X axis differently than the initial code, but I think this doesn't matter when you're working with string type labels (such as 'months', or in this case 'drinks') (?)

I hope someone can tell me if I'm understanding this correctly and help others that like to understand how their code works.

Cheers! :beer:


That's it!
The code is just a "simple" way to set the x values for each bar set.


This is so much simpler, thank you for pointing it out. Good to learn from you.


I have a question related to this exercise: I tried to use plt.xlabels, plt.ylabels and plt.title but it doesn't work. Can anyone here help with this?

A little more information would be helpful but if it's not directly related to this FAQ you could check the FAQ for the main exercise linked in the original post or consider a new query in #get-help.
Prior to this it would be worthwhile reading this FAQ which would help whoever responded and likely get you a better answer.

Is the bar centered around the x-value, or does it just start at the x-coordinate?

The alignment is controlled by the align parameter of the .bar() method, which defaults to 'center' (the bar is centred on the x-value) but can be set to 'edge' instead if that is preferable. See the docs for the wrapper function:
https://matplotlib.org/3.1.1/api/_as_gen/matplotlib.pyplot.bar.html#matplotlib.pyplot.bar
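To make the difference concrete, here is a small pure-Python sketch of the horizontal span each alignment gives a bar. bar_span is a hypothetical helper written for this thread, not a Matplotlib function; it only mirrors the documented behaviour of align='center' and align='edge' with the default width of 0.8.

```python
def bar_span(x, width=0.8, align="center"):
    """Return the (left, right) edges of a bar placed at x.

    Hypothetical helper for illustration; not a Matplotlib function.
    """
    if align == "center":
        left = x - width / 2  # bar straddles x
    elif align == "edge":
        left = x              # left edge of the bar sits on x
    else:
        raise ValueError("align must be 'center' or 'edge'")
    # Round to hide floating-point noise in the printed values.
    return (round(left, 2), round(left + width, 2))


centered = bar_span(0.8, align="center")
edged = bar_span(0.8, align="edge")
print(centered, edged)
```

So with the default alignment, a bar at x = 0.8 covers roughly 0.4 to 1.2 rather than starting at 0.8.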

Is it just me, or does the generated list of values seem to be incorrect?
[Screenshot: Graphing in Python: Matplotlib, Codecademy]
The blue bars should have started at 0, 2, 4… but instead start at 0.8, 2.8…; the same goes for the orange bars.
Am I missing something? Can anybody please explain?

I apologise if I'm mistaken (at the minute I can't get the link to this lesson to work) but the original post in this thread suggests the values are offset by multiples of 0.8 (the w*n term). If you combine this with bars that are centred or edge-aligned on these values, the plot may be shifted from what you originally expected.

What values have you used for the x-values and did you alter the width of the bars or their alignment? Please see the following link in the docs that note the defaults and consider how that might affect the plot (i.e. align='center' and width = 0.8)-


Thanks for the insight. Here's the code. I also tried changing the alignment. The logic of the comprehension provided by Codecademy seems to be incorrect. Isn't it simpler to use a different comprehension, as mentioned in one of the comments?

import codecademylib
from matplotlib import pyplot as plt

drinks = ["cappuccino", "latte", "chai", "americano", "mocha", "espresso"]
sales1 =  [91, 76, 56, 66, 52, 27]
sales2 = [65, 82, 36, 68, 38, 40]

#Paste the x_values code here
n = 1  # This is our first dataset (out of 2)
t = 2 # Number of datasets
d = 6 # Number of sets of bars
w = 0.8 # Width of each bar
x_values = [t*element + w*n for element
             in range(d)]
store1_x=x_values

n = 2  # This is our second dataset (out of 2)
t = 2 # Number of datasets
d = 6 # Number of sets of bars
w = 0.8 # Width of each bar
x_values = [t*element + w*n for element
             in range(d)]
store2_x=x_values
plt.bar(store1_x,sales1,align="edge")
plt.bar(store2_x,sales2,align="edge")
plt.show()



I'm not sure it makes much difference at the end of the day since the numeric values don't have any specific meaning to the actual dataset. If you have a neat way to get the values that makes sense to you and would also be understood by anyone else who read your code then by all means do that. I think part of the reason for the extra variables is to make the code extensible, e.g. using range(6) is more hassle to change in the future than the length of drinks.
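A minimal sketch of that extensibility point, assuming the drinks list from the code above: deriving d from len(drinks) means the positions follow the data if a drink is added or removed, with no hard-coded 6 to update.

```python
drinks = ["cappuccino", "latte", "chai", "americano", "mocha", "espresso"]

t = 2            # number of datasets
w = 0.8          # width of each bar
d = len(drinks)  # bars per dataset, derived from the data instead of hard-coded

# One list of x positions per dataset (n = 1 and n = 2).
store1_x = [t * element + w * 1 for element in range(d)]
store2_x = [t * element + w * 2 for element in range(d)]

print(len(store1_x), len(store2_x))
```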

Most of these details should be covered in the course but I'd deal with the code in the following way, namely swapping the integer values for text labels and introducing axis labels, a title and a useful legend:

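fig, ax1 = plt.subplots(1, 1)  # create the figure and axes used below
ax1_bar1 = ax1.bar(store1_x, sales1, label='Store1')
ax1_bar2 = ax1.bar(store2_x, sales2, label='Store2')
# Centre each tick between the two bars of a pair
ax1.set_xticks([1.2 + x * 2 for x in range(len(drinks))])
ax1.set(xticklabels=drinks, xlabel='Beverage sold',
        ylabel='Sales')
ax1.legend()  # Axes.set() has no 'legend' property, so call legend() directly
ax1.set_title('Comparing daily sales at MatplotSip locations')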

Yes, you are correct.
The numeric values don't have any meaning, so we can probably go with whatever is convenient. In one of the exercises, we used a middleticks list to place the xticks in the middle of each set of bars, which corrected the look of the graph.
Thanks a lot for your guidance.


Here's a quick summary for those that got a little lost with this exercise.

You're not drawing one graph with two x-values displayed. You're actually drawing two graphs, each with one x-value, that are layered on top of each other in such a way that the data of the bottom graph shows through the one on top of it.

To achieve this overlay effect, the list comprehension provided generates x-values that are display-offsets that prevent each graph-layer from blocking the one below it, and also give the illusion that the bars are grouped together side-by-side.

I think this lesson is just wrong. I think the n values should be 0 and 1 to create the described lists, starting with 0 and 0.8, and increasing in increments of 2. Am I missing something?
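A quick check of that claim, assuming the exercise values t = 2, d = 6 and w = 0.8 from the code posted earlier in the thread. The function name x_values is only illustrative; the results are rounded to hide floating-point noise.

```python
t, d, w = 2, 6, 0.8  # values used in the exercise code earlier in the thread


def x_values(n):
    # Same comprehension as the lesson, with n passed in as an argument.
    return [round(t * element + w * n, 2) for element in range(d)]


print(x_values(0))  # n = 0: starts at 0
print(x_values(1))  # n = 1: starts at 0.8
print(x_values(2))  # n = 2: starts at 1.6
```

With n = 0 and n = 1 the lists start at 0 and 0.8; with n = 1 and n = 2, as in the exercise code, they start at 0.8 and 1.6. Both pairs increase in steps of 2.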


If you have an up to date link to this lesson please add it, the one at the top doesn't function so I'm having to guess how this works.

From what I can gather from previous queries the actual x values don't really have anything to do with the data (for example a value of 8 has no specific meaning). All you're doing by changing values is moving the x-axis around so I don't think it matters but I can't test because I can't find the lesson :laughing:.

Building off an answer I wrote previously, it looks like I did the following for some reason… not 100% sure why :stuck_out_tongue: but it should add sensible xticks instead of meaningless numbers.

fig, ax1 = plt.subplots(1, 1)
ax1_bar1 = ax1.bar(store1_x, sales1, label='Store1')
ax1_bar2 = ax1.bar(store2_x, sales2, label='Store2')
ax1.set_xticks([1.2 + x * 2 for x in range(len(drinks))])
ax1.set(
    xticklabels=drinks,
    xlabel='Beverage sold',
    ylabel='Sales'
)
ax1.legend()
ax1.set_title('Comparing daily sales at MatplotSip locations')
fig.show() 
# WARNING: on the cc lesson this might have to be plt.show()

Yeah, it's not a big deal. Just moving x values around. The exercise basically says "this formula will give you lists at 0, 2, etc. and .8, 2.8…" The formula doesn't; it gives lists increasing from .8 and 1.6. I was just curious if I had lost my ability to do basic algebra. Your xtick formula works perfectly with the formula they give, since 1.2 is in the middle of .8 and 1.6. Here's the link:
https://www.codecademy.com/paths/data-science/tracks/dscp-data-visualization/modules/dscp-data-visualization-with-matplotlib/lessons/matplotlib-ii/exercises/side-by-side-bars
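The midpoint relationship mentioned above can be checked with a short sketch. It assumes the exercise values t = 2, d = 6, w = 0.8 and the default align='center'; the names middle_x and tick_x are only illustrative.

```python
t, d, w = 2, 6, 0.8

store1_x = [t * element + w * 1 for element in range(d)]
store2_x = [t * element + w * 2 for element in range(d)]

# Midpoint of each blue/orange pair, rounded to hide float noise.
middle_x = [round((a + b) / 2, 2) for a, b in zip(store1_x, store2_x)]

# The tick formula from the earlier reply.
tick_x = [round(1.2 + x * 2, 2) for x in range(d)]

print(middle_x == tick_x)
```

Both lists come out the same, so ticks placed with 1.2 + x * 2 land between the two bars of each pair.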