What is cross tabulation?



Question

In the context of this exercise, what is cross tabulation?

Answer

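In general, cross tabulation is a way to summarize the relationship between two or more categorical variables by counting how often each combination of their values occurs in the data.

This exercise shows an example: each row of the table below is one of the assigned labels, each column is an actual species, and each cell counts the points that have that label and belong to that species.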

labels    setosa    versicolor    virginica
0              0             2           36
1             50             0            0
2              0            48           14

Based on this cross tabulation, we can see how accurate the results were: label 1 contains all 50 setosa points and nothing else, while label 2 mixes 48 versicolor points with 14 virginica points. This is much easier and clearer than going through every row of the dataset to determine the accuracy.
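As a rough sketch of how a table like this is produced, the snippet below builds a frequency table from a small made-up DataFrame. The data and the column names `labels` and `species` are assumptions for illustration, not the exercise's actual dataset.

```python
import pandas as pd

# Hypothetical toy data: an assigned label and the known species for 7 points.
df = pd.DataFrame({
    "labels": [1, 1, 0, 2, 2, 0, 2],
    "species": ["setosa", "setosa", "virginica", "versicolor",
                "versicolor", "virginica", "virginica"],
})

# Rows come from the first argument, columns from the second.
# With no other arguments, each cell is a count of matching rows.
ct = pd.crosstab(df["labels"], df["species"])
print(ct)
```

Combinations that never occur in the data show up as 0 in the resulting table.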

Using the pd.crosstab() function in Pandas, we can perform cross tabulation on our data. By default, it produces a frequency table. However, we can also apply an aggregate function to analyze information other than the frequencies. To do so, we pass the column to aggregate to the values parameter and the function to the aggfunc parameter; the two must be given together. For example, we can get the average of the values by providing np.average, like so,

pd.crosstab(..., values=..., aggfunc=np.average)
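To make the aggregate form concrete, here is a minimal sketch on made-up data. The `petal_length` column and its numbers are assumptions for illustration. Note that the parameter is named `aggfunc` and has to be passed together with `values`; the string `"mean"` is used here in place of a NumPy function.

```python
import pandas as pd

# Hypothetical toy data: label, species, and one numeric measurement per point.
df = pd.DataFrame({
    "labels": [0, 0, 1, 1, 2],
    "species": ["virginica", "virginica", "setosa", "setosa", "versicolor"],
    "petal_length": [6.0, 5.0, 1.4, 1.6, 4.5],
})

# Each cell now holds the mean petal_length for that label/species pair
# instead of a count.
avg = pd.crosstab(
    df["labels"],
    df["species"],
    values=df["petal_length"],
    aggfunc="mean",
)
print(avg)
```

Label/species pairs with no rows come out as NaN rather than 0, since there is nothing to average.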