Roller Coaster pandas/matplotlib Coding Challenge: ValueError

[Project Link](https://www.codecademy.com/practice/projects/roller-coaster)

While writing the function for the histogram for this code challenge, I got the following error:

File "/usr/local/lib/python3.6/dist-packages/numpy/lib/histograms.py", line 795, in histogram
    bin_edges, uniform_bins = _get_bin_edges(a, bins, range, weights)
  File "/usr/local/lib/python3.6/dist-packages/numpy/lib/histograms.py", line 429, in _get_bin_edges
    first_edge, last_edge = _get_outer_edges(a, range)
  File "/usr/local/lib/python3.6/dist-packages/numpy/lib/histograms.py", line 316, in _get_outer_edges
    'max must be larger than min in range parameter.')
ValueError: max must be larger than min in range parameter.

The function code and call are as follows:

def plot_histogram(column, dataframe):
    dataframe.dropna()            
    if dataframe[column].dtype != 'float64':
        print("We can't plot a histogram on that column. Try a numerical column.")
    else:
        plt.hist(dataframe[column])
        plt.legend([column.strip('').replace('_', ' ').title()])

plot_histogram('speed', stats)

When I run this in a Jupyter notebook, it renders without error. Any idea why this is not porting over to the Codecademy code editor correctly?

What code is on those lines 795, 429 & 316?
I think it might be due to there still being NaNs in your data.

I found this on SO:
https://stackoverflow.com/questions/42014687/histogram-plotting-attributeerror-max-must-be-larger-than-min-in-range-paramet#42016178

Those line numbers are referencing the numpy library, so I’m not exactly sure.

I figured it was the NaN values causing this, but I am calling dropna() on the dataframe within the function. dropna() will by default drop NaNs, correct? I don’t have to replace them with something else, or define NaNs as na_values in my pd.read_csv() call, do I?
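
A note on the dropna() question above: dropna() does drop NaNs by default, but it returns a new DataFrame rather than changing the one it is called on, so a bare `dataframe.dropna()` line has no lasting effect unless the result is assigned. The sketch below illustrates this with a small made-up DataFrame (the `stats` contents here are hypothetical, not the project's data). As a guess about why the notebook behaves differently, newer matplotlib versions seem to ignore NaNs when working out the histogram range, which would hide the problem there; that depends on the versions installed and is an assumption.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the project's roller coaster data.
stats = pd.DataFrame({
    "name": ["Coaster A", "Coaster B", "Coaster C"],
    "speed": [95.0, np.nan, 120.0],
})

# dropna() returns a new object; this line discards it, so stats is unchanged.
stats.dropna()
print(stats["speed"].isna().sum())  # the NaN is still there

# Keeping the returned Series is what actually removes the NaN.
speeds = stats["speed"].dropna()
print(speeds.isna().sum())
```

Assigning the result (or dropping NaNs only from the column being plotted, as here) avoids losing rows that merely have gaps in unrelated columns.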

In my Jupyter notebook (where I find it easier to do pandas coding), it runs just fine.

Hm. Good question. Maybe the CC LE requires parameters for .dropna()?
https://medium.com/analytics-vidhya/data-cleaning-and-preparing-functions-in-python-47950bd82f44

I think you might have to use the dropna() function on a specific column and not the entire df.

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.dropna.html
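
Following the suggestion to call dropna() on the specific column, here is a minimal sketch of how the function could look. It is one possible rewrite, not the project's reference solution: the `is_numeric_dtype` check, the returned Axes, the `Agg` backend line, and the sample `stats` data are all choices made for this illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from pandas.api.types import is_numeric_dtype


def plot_histogram(column, dataframe):
    """Plot a histogram of one numeric column, ignoring missing values."""
    if not is_numeric_dtype(dataframe[column]):
        print("We can't plot a histogram on that column. Try a numerical column.")
        return None
    # Drop NaNs from this column only and keep the result.
    values = dataframe[column].dropna().to_numpy()
    fig, ax = plt.subplots()
    ax.hist(values, label=column.replace("_", " ").title())
    ax.legend()
    return ax


# Hypothetical sample data standing in for the project's CSV.
stats = pd.DataFrame({
    "name": ["Coaster A", "Coaster B", "Coaster C"],
    "speed": [95.0, np.nan, 120.0],
})
ax = plot_histogram("speed", stats)
```

Passing the label to `hist` and then calling `legend()` keeps the legend tied to the plotted data, and the original `strip('')` call is left out because stripping an empty string does nothing.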

Did you happen to check out other people’s code for this project to see what they wrote?


[quote="lisalisaj, post:4, topic:521625"]
Did you happen to check out other people’s code for this project to see what they wrote?[/quote]

No, I haven’t yet, but that’s a good idea. I will look tomorrow morning when I start my session. I have been having other issues with the code editor on this project, namely that the graph I made for the top n coasters will not clear, even after commenting out everything but the pd.read_csv() calls.

Everything looks okay in jupyter though, so I’ll likely just stick with that for this project.

Sometimes it helps to see how others have interpreted the code. It might lead you to something that you’ve not considered and then will investigate. (at least that’s what I do sometimes. It makes me realize there’s so much more to know! hah)

I say stick with jupyter. (b/c doesn’t this project say you can use either?)

I myself have noticed that commenting out my code in the CC LE doesn’t always work.

Did you try plt.clf()?
https://stackoverflow.com/questions/8213522/when-to-use-cla-clf-or-close-for-clearing-a-plot-in-matplotlib

https://stackoverflow.com/questions/28757348/how-to-clear-memory-completely-of-all-matplotlib-plots
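
For reference, the clearing calls discussed in those links can be sketched roughly as follows. pyplot has no `plt.clear()`; the usual options are `plt.cla()` (clear the current axes), `plt.clf()` (clear the current figure), and `plt.close()` (close the figure entirely). The bar data and the `Agg` backend line are assumptions made so the snippet runs on its own, and this says nothing about how the Codecademy editor caches its output.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt

# Draw something first (made-up data standing in for the top n coasters graph).
fig, ax = plt.subplots()
ax.bar([1, 2, 3], [3, 2, 1])

# clf() empties the current figure but leaves it open.
plt.clf()
print(len(fig.axes))

# close("all") closes every open figure and releases it from pyplot's registry.
plt.close("all")
print(plt.get_fignums())
```

Calling `plt.clf()` or `plt.close("all")` before each new plot is a common way to keep an earlier graph from leaking into the next one within a single script.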

Yes, I commented out everything but the plt.clf() calls and it’s still not clearing the ‘top n coaster’ graph. I assume it’s just a glitch with the code editor. It’s all good though, I’m not going to break my brain over it. It’s the first time it has happened and seems like a one-time thing.

Thank you for looking into that though. I appreciate your assistance.