How to plot the mode value in a histogram? Got ValueError while using stats.mode when plotting with matplotlib

# Import packages

import codecademylib
import numpy as np
import pandas as pd
from scipy import stats

# Import matplotlib pyplot

from matplotlib import pyplot as plt

# Read in transactions data

greatest_books = pd.read_csv("top-hundred-books.csv")

# Save transaction times to a separate numpy array

author_ages = greatest_books['Ages']

# Calculate the average and median value of the author_ages array

average_age = np.average(author_ages)
median_age = np.median(author_ages)
mode_age = 38  # If I use stat.mode(author_ages) here instead of the hard-coded value, I get an error!

# Plot the figure

plt.hist(author_ages, range=(10, 80), bins=14, edgecolor='black')
plt.title("Author Ages at Publication")
plt.xlabel("Publication Age")
plt.axvline(average_age, color='r', linestyle='solid', linewidth=3, label="Mean")
plt.axvline(median_age, color='y', linestyle='dotted', linewidth=3, label="Median")
plt.axvline(mode_age, color='orange', linestyle='dashed', linewidth=3, label="Mode")

# ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

The explanation given on Stack Overflow is a bit confusing for me.

As a newbie from a biological background, I find the concept a little hard to understand. Please shed some light on the issue. Thanks in advance.

Be careful with the names you use. stats.mode references a valid function, whereas stat is not defined, but I’m guessing that’s just a typo from copying to the forums. I’d assume your main error occurs because you’re using something along the lines of mode_age = stats.mode(author_ages).

Note that stats.mode returns a tuple (I think it’s technically a namedtuple, or at least it’s implemented in a very similar way) of the modal value (as a numpy array) and a count of how many times that value appears (also an array). If you just want the age you need to make sure you reference just the age.

You can access this value in either of the following ways (assuming there is only one modal value):

mode_tuple = stats.mode(author_ages)
mode_age = mode_tuple[0][0]  # the first tuple index and the first array index or...
mode_age = mode_tuple.mode[0]  # the mode attribute and the first index array
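
One caveat, as far as I know: newer SciPy releases (1.11 and later, I believe) return plain scalars rather than length-1 arrays for 1-D input, so the [0] index on the mode would fail there. Below is a minimal sketch of a helper that tolerates both behaviours. The name first_mode and the sample ages are made up for illustration; they are not part of SciPy or the book data.

```python
import numpy as np
from scipy import stats


def first_mode(values):
    """Return one modal value of a 1-D sequence as a plain Python number.

    Hypothetical helper, not a SciPy function.
    """
    result = stats.mode(np.asarray(values))
    # np.atleast_1d turns a scalar into a length-1 array and leaves arrays
    # unchanged, so [0] works whichever form SciPy returned.
    return np.atleast_1d(result.mode)[0].item()


ages = [38, 41, 38, 52, 45, 38, 41]  # made-up sample: 38 occurs most often
mode_age = first_mode(ages)
print(mode_age)
```

The value returned by first_mode is a single number, so it can be passed straight to plt.axvline.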

Edit: Just checked the source and it is indeed a namedtuple.

Thanks for your reply. I tried the code below and got the following error:

import numpy as np
import pandas as pd
from scipy import stats
from matplotlib import pyplot as plt

df = pd.DataFrame([
    [1, 'David', 50],
    [2, 'Carl', 70],
    [3, 'Anjel', 90],
    [4, 'Alex', 115]
], columns=['ID', 'Age', 'Marks'])

average_mark = np.average(df.Marks)
median_mark = np.median(df.Marks)
mode_mark = stats.mode(df.Marks)

plt.hist(df.Marks, range=(40, 130), bins=4, edgecolor='black')
plt.title("Students Marks")
plt.axvline(average_mark, color='r', linestyle='solid', linewidth=3, label="Mean")
plt.axvline(median_mark, color='y', linestyle='dotted', linewidth=3, label="Median")
plt.axvline(mode_mark, color='orange', linestyle='dashed', linewidth=3, label="Mode")

File "C:\Users\Gnanakkumaar\AppData\Local\Programs\Python\Python38-32\lib\site-packages\matplotlib\", line 2456, in axvline
  return gca().axvline(x=x, ymin=ymin, ymax=ymax, **kwargs)
File "C:\Users\Gnanakkumaar\AppData\Local\Programs\Python\Python38-32\lib\site-packages\matplotlib\axes\", line 913, in axvline
  scalex = (xx < xmin) or (xx > xmax)
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
[Finished in 2.2s]

If I leave the mode out of the plot, I do not get the error.

Apologies for perhaps adding more info than you required, but the issue remains the same: the return value of stats.mode is not a single value. You’d need to extract the relevant result if you want to use it the way you do. See my previous reply for how to extract just the mode (when there is only one mode, i.e. no array dimensions resulting in multiple modes).
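
To make that concrete, here is a rough sketch using the marks from your post. The helper is_ambiguous is hypothetical: it only imitates the comparison shown in your traceback, scalex = (xx < xmin) or (xx > xmax), to show why the whole stats.mode result trips it while the extracted value does not. The np.atleast_1d call is my own addition so the extraction works whether SciPy hands back an array or a scalar.

```python
import numpy as np
import pandas as pd
from matplotlib import pyplot as plt
from scipy import stats

marks = pd.Series([50, 70, 90, 115], name="Marks")  # marks from the post above


def is_ambiguous(x, xmin=40, xmax=130):
    """Hypothetical helper imitating the check from the traceback."""
    xx = np.asarray(x)
    try:
        # `or` needs a single True/False; an array with several
        # elements cannot provide one, so NumPy raises ValueError.
        bool((xx < xmin) or (xx > xmax))
    except ValueError:
        return True
    return False


mode_result = stats.mode(marks.to_numpy())      # (mode, count) pair
mode_mark = np.atleast_1d(mode_result.mode)[0]  # just the modal value

print(is_ambiguous(mode_result))  # whole result: two entries, ambiguous
print(is_ambiguous(mode_mark))    # single number: fine

plt.hist(marks, range=(40, 130), bins=4, edgecolor="black")
plt.title("Students Marks")
plt.axvline(np.average(marks), color="r", linestyle="solid", linewidth=3, label="Mean")
plt.axvline(np.median(marks), color="y", linestyle="dotted", linewidth=3, label="Median")
mode_line = plt.axvline(mode_mark, color="orange", linestyle="dashed", linewidth=3, label="Mode")
plt.legend()
```

Add plt.show() at the end to display the figure. Note that every mark in this sample occurs exactly once, so there is no meaningful mode; SciPy still reports one of the values (the smallest, as far as I know).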


Thank you so much! :smiley:
