Hi - thanks for checking out my post regarding a project called Central Tendency in the Master Statistics with Python Skill Path.

I completed the tasks of the project; however, I am curious about the prewritten code towards the end. Specifically, in the try/except statements for MODE, why does the value I created for manhattan_mode appear to be a list within a list? The value is being accessed as manhattan_mode[0][0] and manhattan_mode[1][0].

When I print manhattan_mode I see that it is an array. However, I cannot use head() to check it out; I receive an error: AttributeError: 'ModeResult' object has no attribute 'head'.

I believe it is something with the stats module that I'm not fully understanding yet.
Is there another function I can use to see what manhattan_mode (and the other mode variables) contain?

Thank you kindly

# Import packages
import numpy as np
import pandas as pd
from scipy import stats
# Read in housing data
brooklyn_one_bed = pd.read_csv('brooklyn-one-bed.csv')
brooklyn_price = brooklyn_one_bed['rent']
manhattan_one_bed = pd.read_csv('manhattan-one-bed.csv')
manhattan_price = manhattan_one_bed['rent']
queens_one_bed = pd.read_csv('queens-one-bed.csv')
queens_price = queens_one_bed['rent']
# Add mean calculations below
brooklyn_mean = np.average(brooklyn_price)
manhattan_mean = np.average(manhattan_price)
queens_mean = np.average(queens_price)
# Add median calculations below
brooklyn_median = np.median(brooklyn_price)
manhattan_median = np.median(manhattan_price)
queens_median = np.median(queens_price)
# Add mode calculations below
brooklyn_mode = stats.mode(brooklyn_price)
manhattan_mode = stats.mode(manhattan_price)
queens_mode = stats.mode(queens_price)
##############################################
##############################################
##############################################
# Mean
try:
    print("The mean price in Brooklyn is " + str(round(brooklyn_mean, 2)))
except NameError:
    print("The mean price in Brooklyn is not yet defined.")
try:
    print("The mean price in Manhattan is " + str(round(manhattan_mean, 2)))
except NameError:
    print("The mean in Manhattan is not yet defined.")
try:
    print("The mean price in Queens is " + str(round(queens_mean, 2)))
except NameError:
    print("The mean price in Queens is not yet defined.")
# Median
try:
    print("The median price in Brooklyn is " + str(brooklyn_median))
except NameError:
    print("The median price in Brooklyn is not yet defined.")
try:
    print("The median price in Manhattan is " + str(manhattan_median))
except NameError:
    print("The median price in Manhattan is not yet defined.")
try:
    print("The median price in Queens is " + str(queens_median))
except NameError:
    print("The median price in Queens is not yet defined.")
# Mode
try:
    print("The mode price in Brooklyn is " + str(brooklyn_mode[0][0]) + " and it appears " + str(brooklyn_mode[1][0]) + " times out of " + str(len(brooklyn_price)))
except NameError:
    print("The mode price in Brooklyn is not yet defined.")
try:
    print("The mode price in Manhattan is " + str(manhattan_mode[0][0]) + " and it appears " + str(manhattan_mode[1][0]) + " times out of " + str(len(manhattan_price)))
except NameError:
    print("The mode price in Manhattan is not yet defined.")
try:
    print("The mode price in Queens is " + str(queens_mode[0][0]) + " and it appears " + str(queens_mode[1][0]) + " times out of " + str(len(queens_price)))
except NameError:
    print("The mode price in Queens is not yet defined.")

Link to the lesson, please? CodeBytes doesn't work with this because you cannot import Python libraries; it throws errors.

The mode is the most frequent value in an array, i.e., it's one value. It's not a DataFrame or an array object, so you cannot use the pandas method .head() on it. You're better off printing manhattan_price, as it's an array, or column of data.

Also, when I print manhattan_mode, it doesn't appear to be a single value because this is printed: ModeResult(mode=array([3500]), count=array([56])). I guess I'm not sure exactly what all that means?

In the beginning of the lesson it states:
" In this project, we only care about the price of apartments, so we saved the price of apartments in each borough to:

brooklyn_price

manhattan_price

queens_price

If you want to see what these arrays look like, you can use print statements to see them in the output terminal."

Which means that the arrays are just that single rent column from the data frame. You're calculating mean, median, and mode on a single column of data only.

*If you do a print(manhattan_price) you'll get this:

0 4500
1 4795
2 4650
3 2950
4 4875
#not all 1476 rows are printed.
1471 3420
1472 2095
1473 4210
1474 3475
1475 4500
Name: rent, Length: 1476, dtype: int64 #column name, length/number of rows, data type of the col.
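To make the contrast with the mode result concrete, here is a minimal sketch. The five rents are a hypothetical stand-in for the lesson's CSV, not the real data. The column is a pandas Series, so .head() works on it; the object returned by stats.mode is a different type and has no .head() method, which is where the AttributeError comes from.

```python
import pandas as pd
from scipy import stats

# Hypothetical stand-in for manhattan_one_bed['rent'] (not the real CSV).
manhattan_price = pd.Series([4500, 4795, 3500, 2950, 3500], name="rent")

# A Series supports .head(): here, the first two rows.
first_rows = manhattan_price.head(2)
print(first_rows)

# The mode result is not a Series or DataFrame, so it has no .head().
manhattan_mode = stats.mode(manhattan_price.to_numpy())
print(type(manhattan_mode).__name__)
print(hasattr(manhattan_mode, "head"))
```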

For manhattan_mode = stats.mode(manhattan_price)

this is returned: "The mode price in Manhattan is 3500 and it appears 56 times out of 1476"

The mode, $3500, is the most frequent value in that array; it appears 56 times. 1476 is the number of rows.

The mode is a single value. The other column, 0-1475, holds the index numbers for each row in the rent column (think of it like a single column in a spreadsheet or table). 3500 is the most frequent price, and 56 is the number of times it appears in the array.

Why does print(manhattan_mode) return this: ModeResult(mode=array([3500]), count=array([56])) instead of its single value?

I also still don't understand why a single value is being accessed like this: manhattan_mode[0][0] or manhattan_mode[1][0]

Also, the scipy.stats.mode link says that stats.mode will "Return an array of the modal (most common) value in the passed array.", not a single value. Is there a way to view this array it returned?
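One way to look inside the result, as a minimal sketch: the returned object behaves like a named tuple with two fields, mode and count, so position 0 is the mode field and position 1 is the count field. The prices below are hypothetical. Note the assumption about versions: the SciPy in the lesson wraps each field in a one-element array (hence the second [0]), while newer SciPy releases may return plain scalars for a 1-D input, so np.atleast_1d is used here to cover both cases.

```python
import numpy as np
from scipy import stats

# Hypothetical stand-in for manhattan_price (the real CSV is not used here).
prices = np.array([3500, 4500, 3500, 2950, 3500, 4500])

result = stats.mode(prices)

# The two fields can be read by name...
print(result.mode, result.count)

# ...or by position / unpacking: index 0 is mode, index 1 is count.
mode_field, count_field = result

# Older SciPy: each field is a one-element array, so [0][0] pulls out the number.
# Newer SciPy: each field may already be a scalar. np.atleast_1d handles both.
mode_value = int(np.atleast_1d(result[0])[0])
mode_count = int(np.atleast_1d(result[1])[0])
print(mode_value, mode_count)
```

So manhattan_mode[0][0] reads as "first field (mode), first element", and manhattan_mode[1][0] as "second field (count), first element".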

$3500 is a single value. It is the one that is represented most frequently in the rental column for Manhattan. 3500 appears 56 times in that column of data. You could pull out each row where 3500 is listed, and that would be [3500, 3500, 3500, etc.].
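Pulling out those rows can be sketched with a boolean mask. The Series below is a hypothetical stand-in for manhattan_price, and the mode value is hard-coded from the thread rather than computed.

```python
import pandas as pd

# Hypothetical stand-in for manhattan_price (not the lesson's data).
manhattan_price = pd.Series([4500, 3500, 4650, 3500, 2950, 3500], name="rent")

mode_value = 3500  # value reported in the thread, hard-coded for illustration

# Keep only the rows whose rent equals the mode.
matches = manhattan_price[manhattan_price == mode_value]
print(matches.tolist())
print(len(matches), "of", len(manhattan_price), "rows")
```

The length of matches is the same number that the count field of the mode result reports.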

*Also, when I try print(manhattan_mode), there isn't a result in my learning environment.

You're not really supposed to be concerned with the try and except statements below your code. But the result has two positions: the mode array is at index 0 and the count array is at index 1, so manhattan_mode[0][0] is the mode and manhattan_mode[1][0] is its count.

The lesson is on central tendency (mean, median, mode) and the distribution of this data. This particular column has one mode, or is unimodal. Let's not confuse things.

But, sure, a data set can have more than one value that is most frequently represented. It can have no mode, or be unimodal (one mode), bimodal (two modes), trimodal (three modes), or multimodal (four or more modes).

Valid question, as realistic data tends to be more complex. When there are multiple modes, scipy.stats.mode handles it by returning only the lowest of them.

There appear to be solutions on StackOverflow for having all the modes returned.
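For what it's worth, one way to get every tied value is the pandas Series.mode() method, which returns all modes rather than just one. A minimal sketch with made-up rents (hypothetical values, not from the lesson's CSV):

```python
import pandas as pd

# Hypothetical rents with a two-way tie: 3000 and 3500 each appear twice.
rents = pd.Series([3500, 3000, 3500, 3000, 4200])

# Series.mode() returns every value tied for most frequent, sorted ascending.
all_modes = rents.mode().tolist()
print(all_modes)

# value_counts() shows how often each value occurs.
counts = rents.value_counts()
print(int(counts.loc[3000]), int(counts.loc[3500]))
```

Here the data is bimodal, so all_modes holds two values instead of one.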

I understand that mode is supposed to be a single value, I understand the concepts of mean, median, and mode.

What I have been asking is why Python returns an array after using stats.mode, which is evident by the need to access manhattan_mode not just as a variable but as a list within a list (manhattan_mode[0][0]).

The more code I understand, the better programmer I will be, no?

So, it looks like the answer is that stats.mode returns the single mode value wrapped in a one-element array. Thanks!

I need some help. Can anyone paste the whole code so that I can compare it with mine? I went through the whole process, but the Next option is still inactive.