Python Data Visualisation Modules - Seaborn & Matplotlib

Hi team,

I have just finished learning Python Matplotlib and Python Seaborn in the Data Science Path. To be honest, there was so much (maybe even too much) information to absorb, and I am now taking a step back to reflect and gather my thoughts.
In the process, I have some general questions (not linked to a specific exercise) that I would like a Python expert to help me demystify:

  1. Throughout the learning, I noticed that we rarely assign the result of commands like plt.plot() to a variable. Even in the Matplotlib and Seaborn cheatsheets, I don’t see a variable created before plotting a graph. Why is that?
    How is it different from plt.subplot(), which we can assign to an ax object like this: ax = plt.subplot()?
    Even if we can assign the result of these commands to a variable, how does it work? Can you please provide a simple example?

  2. In a real-work situation, I am sure we use a combination of the Matplotlib and Seaborn modules to create data visualisations. Is there an order of code that we have to follow when we start plotting a graph using these two modules?
    For example, can we write plt.title() before plt.plot() or sns.barplot() etc.? Or does plt.figure() always go before plt.subplot()?
    Throughout the learning and exercises, I did try to experiment with this but didn’t find many errors.

  3. In a real-work situation, does it make you a bad Python programmer if you keep referring to your notes and the Python documentation for syntax you have forgotten? No matter how much I understand and practice, I don’t think Python code will stick in my memory.

I think those three questions are my concerns at the moment.
I would appreciate the inputs that I can get.

Please avoid putting any unnecessary comments.
Thanks and happy learning,

Jimmy

I don’t think anyone is expected to have absolutely everything committed to memory. So, no, no one is a ‘bad programmer’ if they have to look stuff up. :slight_smile: I would also add that part of being a good programmer or data scientist, etc. is knowing where and how to find answers to questions and errors. As long as you have an overall understanding of what you’re doing & trying to accomplish I think that’s what matters.
And if you’re ever doing live coding or an assessment, talk through what you’re doing. Talking the process through matters.

When you experimented, or did an off-platform project, did you get errors in your vizzes?
You can always refer to documentation: http://seaborn.pydata.org/generated/seaborn.barplot.html

Selecting the best type of viz for your data is also key. There are some articles about this in the DS path. (I’ve bookmarked them myself just in case I ever get stuck).
You can also experiment with visualizations and your data. I’ve discarded choices of viz b/c, once the data was plotted, I realized the chart didn’t fit it.

You might find this useful too:
http://seaborn.pydata.org/tutorial/categorical.html#categorical-tutorial

I hope this helps.

Hey Lisa,

Thanks very much for the inputs…
I experimented on small or dummy datasets for the sake of understanding the order of the code from both Seaborn and Matplotlib. The more I experiment, the more confused I get. I honestly haven’t fully grasped the order of the code so far. Hence, I am stepping back a bit and coming here to ask someone to shed some light haha.

Thanks for sharing those links… I will read them as soon as my headache is over haha

Regards,
Jimmy

  1. If you’ve used classes and Python objects to any extent then matplotlib starts to make a great deal more sense. I’d encourage you to get used to using the objects and their various methods as opposed to the wrapper functions (nothing wrong with using them but you should still aim to understand what’s actually happening).
    Things are often plotted without holding a reference to them because it’s not uncommon to simply plot something once and leave it at that. A full reference is mostly useful if more data needs to be added (e.g. multiple figures or animations, live or otherwise), so a lot of the time people just don’t bother. Holding a reference to the figure and its axes at the very least is good practice though, since there’s a very good chance you’ll use them.

An example of what you’re creating and how each piece is added:

import matplotlib.pyplot as plt

fig1 = plt.figure()  # create a figure instance and keep a reference to it
ax1 = fig1.add_subplot()  # add an axes (subplot) to the existing figure instance
plt1 = ax1.plot([1, 2, 3], [1, 4, 9])  # sample data; does a lot at once, main output is a list of 2D lines ([Line2D]) on the existing axes
# also note that it adapts the ticks, axis scales etc. to the data

If you want to test that sort of thing, perhaps use an interactive Python interpreter, disable interactive plotting if it’s on, and see if you can build your graphs using the references, with the dir() function to check their attributes. It takes time but it will make more sense in the long run.
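To make the idea of holding references a bit more concrete, here is a minimal sketch. It assumes the non-interactive Agg backend and uses made-up sample data; the variable names are only illustrative and not taken from any course material.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no window pops up
import matplotlib.pyplot as plt

x = [1, 2, 3, 4]          # made-up sample data
y = [10, 20, 15, 30]

fig, ax = plt.subplots()  # one Figure and one Axes, both kept as references

# plot() returns a list of Line2D objects, one per line drawn
lines = ax.plot(x, y, label="sales")
line = lines[0]

# Because the references were kept, things can be changed after plotting
line.set_linestyle("--")
ax.set_title("Held references")

# The pyplot wrapper returns the same kind of object; it simply draws on
# whatever the "current" axes happens to be
more_lines = plt.plot(x, [v / 2 for v in y])
same_axes = more_lines[0].axes is ax

# dir() lists the attributes and methods available on any of these objects
setters = [name for name in dir(line) if name.startswith("set_")]
print(type(fig).__name__, type(line).__name__, same_axes, len(setters) > 0)
```

So plt.plot() can be assigned to a variable just like plt.subplot() can; most short scripts simply never need the returned lines again, which is why cheatsheets tend to skip it.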

  2. I don’t know if there’s a specific order, but things should be written so that your code can easily be read. You could create almost an entire graph by using the plt.plot() command and specifying data/labels etc., since it builds a figure and adds an axes for you if it can’t find an existing one.
    Perhaps that’s enough. If, however, you already have requirements for your figure (size, resolution etc.), then making and sizing the figure first is generally a better choice than creating and then resizing. More complex requirements, e.g. writing several figures with multiple axes, are probably best set up ahead of time.

  3. I’d have thought the exact opposite. Being capable of working without constant checks is good but it takes practice, so keep at it. You’re not doing your employer or even yourself any favours by guessing what a piece of code will do.
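On the ordering question, the following is a small sketch rather than a definitive rule set. It assumes the Agg backend and made-up data, and sticks to matplotlib only. It illustrates that pyplot calls act on the "current" figure and axes, so plt.title() before plt.plot() still works, and that a figure sized up front can then have axes added to it. Seaborn's axes-level functions such as sns.barplot() accept an ax= argument, so they can draw onto an Axes created this way; that part is left out here to keep the example free of extra dependencies.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no window pops up
import matplotlib.pyplot as plt

x = [1, 2, 3]  # made-up sample data
y = [2, 4, 8]

plt.close("all")  # start from a clean slate

# Case 1: title before plot. plt.title() creates a figure and axes if none
# exist, and plt.plot() then draws on those same "current" axes.
plt.title("Title first")
plt.plot(x, y)
ax_a = plt.gca()
title_kept = ax_a.get_title() == "Title first"

# Case 2: size the figure first, then add axes and plot onto them
fig = plt.figure(figsize=(8, 3), dpi=100)
ax_b = fig.add_subplot(1, 2, 1)
ax_b.plot(x, y)
ax_b.set_title("Left panel")
ax_c = fig.add_subplot(1, 2, 2)
ax_c.bar(x, y)
ax_c.set_title("Right panel")

# Case 3: order does matter here. plt.figure() starts a new, empty current
# figure, so titles and plots made before it stay on the earlier figures.
fig_new = plt.figure()
n_axes_new = len(fig_new.axes)

print(title_kept, len(fig.axes), n_axes_new)
```

Working through the fig and ax references, as in the second case, tends to make the order less of a guessing game, because each call names the object it changes.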

I meant that as long as you have an overall understanding of the code, methods, etc. and what you’re trying to do, a forgotten paren or a syntax issue is easily looked up.

I think your best bet is to continue to practice on datasets. One learns best through repetition (and reading documentation) & applying what you’ve learned.
(This is what I do.)
Are you going to get errors? Probably. But, that’s how you will learn.

Aye sorry I wasn’t arguing with any point you made. I was responding to the original one about being a bad programmer if you look up the docs whilst working. That sounds sensible to me; even basic syntax sometimes goes walkabouts especially if you keep swapping languages. You’re right that there has to be a working knowledge of the language before then but I’d assume :crossed_fingers: that was sorted between you and your employer prior to active work.

Colab has an option in its settings to trigger code completions. I have that switched off, mostly b/c I find it annoying; plus, I like to test my brain and make sure I get the right thing written out.

Hey Tiger and Lisa,

Sorry for the late reply (was asleep). Thanks heaps for the answers. I will definitely keep practicing and hope it will make sense (even just a bit).

Also, the reason I asked question number 3 is that I am concerned about what a potential hiring manager will think if I have to keep referring to notes to do their future projects, especially if it’s a time-sensitive project.

Nevertheless, I will try my best…

Regards,
Jimmy

If you continue to practice enough, things will become ingrained in your brain and you will need notes less and less. :slight_smile:

Good luck and happy coding!