Orion Constellation Project - Python Data Visualization

Project Github Link
https://github.com/grantblackbean/orion-data-viz-project

Stuff I did

  • I learned how to create a stylesheet for my matplotlib charts.
  • I learned to colour the points in both 2D and 3D scatter charts based on a set of values (the z values in this case).
  • I learned to change the background colour of the subplot to look more like a night sky
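
For anyone following along, here is a minimal sketch of the styling ideas in the list above. It is not the code from my notebook: the coordinates are made up, and the `night_sky` dictionary is a hypothetical stand-in for a stylesheet file.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs without a display
import matplotlib.pyplot as plt

# Made-up coordinates, purely for illustration (not the project data).
x = [-0.41, 0.57, 0.07, 0.00, -0.29]
y = [4.12, 7.71, 2.36, 9.10, 13.51]
z = [2.06, 0.84, 1.56, 2.07, 2.36]

# Hypothetical "stylesheet" expressed as a dict of rcParams; a .mplstyle
# file passed to plt.style.use() would hold the same kind of settings.
night_sky = {
    "text.color": "white",
    "axes.labelcolor": "white",
    "xtick.color": "white",
    "ytick.color": "white",
    "figure.facecolor": "#0b0b2a",
}

with plt.style.context(night_sky):
    fig, ax = plt.subplots()
    ax.set_facecolor("black")  # background of this one subplot
    # Colour each point by its z value.
    points = ax.scatter(x, y, c=z, cmap="plasma")
```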

Stuff I struggled with

  • I really wanted to increase the size of the points on the 3D chart. I tried using the ‘s=’ argument to set a size, but got the error “scatter() got multiple values for argument 's'”. I tried to fix this error in a variety of ways but got stumped.
  • It took me FOREVER to get my local Python install to work. I still don’t understand what the initial issue was (I installed Conda per instructions) and had to manually clear my Python 3 install, remove everything, and manually install again with homebrew (Mac OS X). Feels bad, man.

Please let me know any thoughts you have, particularly if you can point me in the right direction re: point sizes on the 3d chart!


Hi @granthendricks137692,

Congratulations on finishing up, and kudos for adding a little more information with a colour scheme for the effective depth; I really like that idea. As an ex-astro student I should point out that z isn’t depth per se, just distance along a Cartesian z-axis relative to Earth. I think the r value (distance from Earth in light years) from the paper linked in section 2 of the notebook file would be the best option for this plot (unless you felt like calculating it yourself). That would tell you something about the constellation that an observer on Earth could appreciate, since it appears 2D when you’re looking at the sky anyway.

As for your main query about struggling with the marker size in the 3D plot please see this previous answer which I think covers your query entirely (including that specific error)-


If in doubt it’s not a bad idea to start with the docs (should be linked in that answer).
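
As a rough sketch of the usual cause of that error (an assumption on my part, since I haven’t seen your exact call): on a regular 2D axes the third positional argument of scatter() is the marker size s, so passing z there and s= as well gives “multiple values”. Creating the axes with projection="3d" makes the third argument z and frees up s=. The coordinates below are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs without a display
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D  # noqa: F401 (registers "3d" on older matplotlib)

# Made-up coordinates, purely for illustration (not the project data).
x = [-0.41, 0.57, 0.07, 0.00, -0.29]
y = [4.12, 7.71, 2.36, 9.10, 13.51]
z = [2.06, 0.84, 1.56, 2.07, 2.36]

fig = plt.figure()

# 2D axes: the third positional argument is the marker size `s`,
# so giving z positionally *and* s=... raises a TypeError.
ax2d = fig.add_subplot(1, 2, 1)
error_message = None
try:
    ax2d.scatter(x, y, z, s=60)
except TypeError as err:
    error_message = str(err)

# 3D axes: the third positional argument is z, so s= sets the marker size.
ax3d = fig.add_subplot(1, 2, 2, projection="3d")
points = ax3d.scatter(x, y, z, s=60, c=z, cmap="viridis")
```

If that matches what you were doing, the fix is in how the axes is created rather than in the scatter() call itself.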

As a more general comment, I’d encourage you to add labels to your plots (especially the axes in this case), as your goal is to give your viewer as much information as possible. Ideally they’d never even have to read the text to understand the graph (that’s not always possible, but aiming for it is a good shout). I’d also suggest turning your colour scheme into a scaled colourbar (link to the docs below), which is much easier for a viewer to interpret-

With that and the labels I think this would come out very nicely. If you’re more interested in the coding then by all means stick with that for now.
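
A minimal sketch of what I mean by labels plus a colourbar, assuming a 2D scatter coloured by z. The data and the label text are placeholders, so swap in whatever quantity and units you settle on.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs without a display
import matplotlib.pyplot as plt

# Made-up coordinates, purely for illustration (not the project data).
x = [-0.41, 0.57, 0.07, 0.00, -0.29]
y = [4.12, 7.71, 2.36, 9.10, 13.51]
z = [2.06, 0.84, 1.56, 2.07, 2.36]

fig, ax = plt.subplots()
points = ax.scatter(x, y, c=z, cmap="plasma")

# Axis labels (placeholder text).
ax.set_xlabel("x")
ax.set_ylabel("y")

# The colourbar is built from the scatter artist, so its scale
# matches the colours actually drawn on the points.
cbar = fig.colorbar(points, ax=ax)
cbar.set_label("z")
```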

I doubt there are many people out there who’ve had easy, flawless installs every time. I went through conda, homebrew, macports and several other routes and it was never easy; a good package manager is an absolute godsend. But it does get easier, and you learn an awful lot about your system just by hunting down errors, if you have the time and you’re tenacious enough. Keep at it.


Hey thanks a lot for the thorough reply!

I didn’t label the axes because, to be honest, I didn’t really understand what units I was working with. As you pointed out, I misunderstood ‘z’ as depth. I enjoyed working on styling the scatter charts and will look at the resources you presented so I can upgrade this a bit with proper labels and a proper 3D plot. I suspected I was calling the wrong thing, but I tried it several ways and couldn’t get where I wanted to be.

I work in marketing and am looking to better explain the data I deal with to my customers and coworkers, so getting better at building and styling charts is what I’m most interested in :)


Ah fair enough. There’s definitely a fine line between adding information to a figure and adding too much information to a figure. I’m definitely in the camp of choosing a single good image over a paragraph of text though, something the eye is drawn towards rather than drifts away from. I think that’s even more important in some cases e.g. when presenting.

The units are a bit odd since they shifted them into metres (where each cm technically corresponds to a light year so each metre would be 100 light years) for a physical model that could be put together e.g. for a school astronomy project.

You could probably get away with calling them x, y, z displacement in metres (for the physical model), or multiplying them by 100 and calling them light years (for the actual units).
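
The conversion is just a scale factor. Here is a small sketch with made-up model coordinates; `metres_to_light_years` is a hypothetical helper name, not something from the course notebook.

```python
# Scale described above: 1 cm of model = 1 light year, so 1 m = 100 light years.
LIGHT_YEARS_PER_METRE = 100


def metres_to_light_years(values_m):
    """Convert model coordinates in metres to light years."""
    return [value * LIGHT_YEARS_PER_METRE for value in values_m]


# Made-up model coordinates in metres, purely for illustration.
x_m = [-0.41, 0.57, 0.07]
x_ly = metres_to_light_years(x_m)
```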


Hey again,

I made some changes based on your advice and I’m happier with how things are looking. Increasing the point size in the 3D chart (and actually having a working z-index) looks much better!

I wondered if I could ask you a stylistic question. When your units across each axis are the same (in this case I multiplied all values by 100 and labeled them as ‘displacement in light years’) you surely shouldn’t label every axis, right? It looks very silly on my 3d plot and kind of silly on my 2d plot.

What in your opinion is best practice in this scenario? Should I label the X axis with units and let the other axes be inferred? Would someone know to infer that all units are the same, or is this adding unneeded ambiguity? I just don’t want to clutter my charts with labels!


I’m not sure if there is a strict best practice (or at least it can change based on what you’re doing with it). I think cutting it down to something like displacement (ly or l.y.) which would be the standard accepted abbreviation for light years might be a decent start. A legend with the units for each axis could also work but if every axis has the same unit, then perhaps you could strip it down further by adding it to the title. Since they all use the same units I can’t see why not. Might be worth a wee look into controlling font/text size too. Try a couple of those options and choose what you think is best.
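
One way that could look, as a sketch only: the unit goes in the title, the axes keep short names, and the font sizes are turned down a little. The data, title wording and sizes are all placeholders to play with.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs without a display
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D  # noqa: F401 (registers "3d" on older matplotlib)

# Made-up coordinates already scaled to light years (not the project data).
x = [-41.0, 57.0, 7.0, 0.0, -29.0]
y = [412.0, 771.0, 236.0, 910.0, 1351.0]
z = [206.0, 84.0, 156.0, 207.0, 236.0]

fig = plt.figure()
ax = fig.add_subplot(111, projection="3d")
ax.scatter(x, y, z, s=40)

# State the shared unit once, in the title, instead of on every axis.
ax.set_title("Orion: displacement from Earth (ly)", fontsize=12)

# Short axis names with a smaller font keep the plot uncluttered.
ax.set_xlabel("x", fontsize=9)
ax.set_ylabel("y", fontsize=9)
ax.set_zlabel("z", fontsize=9)
ax.tick_params(labelsize=8)
```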

If you wind up doing a lot of 3D plots you might have more joy in something other than matplotlib. It’s not really built with 3d representation in mind and so far as I’m aware it was just added on at a later date. I’d stick with it for the CC course anyway but if it comes up again in the future then it might be worth having a quick check for any alternatives.


Thanks again! I’m going to wrap up the Python data visualization course then explore my other options in the data science library, though I can’t imagine 3d graphing is something I’ll do much of.
