Missing at Random (MAR) practice question

In the first module of the data science path one learns about different types of missing data. In the practice session, this particular question came up.

Why is this MAR and not MCAR?

The cheatsheet for the module is here.

Maybe those participants just didn’t want to provide their weights? That is, the reason could be some random characteristic of the person or thing being studied.


As far as I understand it, the difference between the three types of missing data presented in the course is as follows:

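  • Missing Completely at Random (MCAR): There is no discernible pattern to the missing data. Whether a value is missing is unrelated to any other variable in the dataset, and we don’t know the underlying reason for it being missing.

  • Missing at Random (MAR): There is a pattern to the missing data. Whether a value is missing is related to other, observed variables in the dataset, but we still don’t know the underlying reason for it being missing (though we can make guesses based on the pattern).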

  • Structurally Missing Data: There is a pattern to the missing data, and through common knowledge about the world and/or logic, it is evident why it is missing (e.g. males not being capable of giving birth, or sports teams with no losses not having a ‘games since last loss’ stat).

Even with MCAR data you can make some guesses as to the underlying reason for the missing data, but since there is no discernible pattern, we’re essentially just guessing in the dark, and those guesses don’t stop the data from being MCAR.
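
To make the "pattern vs. no pattern" distinction concrete, here is a minimal sketch in plain Python. Everything in it is invented for illustration and does not come from the course or the practice question: the groups "A" and "B", the weight distribution, the missingness probabilities, and the helper names `simulate` and `missing_rate_by_group`. It simulates one dataset where weights go missing at the same rate for everyone (MCAR-like) and one where the rate depends on an observed group variable (MAR-like), then compares missing rates per group.

```python
import random


def simulate(n, missing_prob, seed=0):
    """Return n (group, weight) rows; weight is None when missing.

    missing_prob maps each group to the probability that its weight
    is missing. All values here are hypothetical.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        group = rng.choice(["A", "B"])
        weight = round(rng.gauss(70, 12), 1)
        # Missingness depends only on the observed group, never on the weight.
        if rng.random() < missing_prob[group]:
            weight = None
        rows.append((group, weight))
    return rows


def missing_rate_by_group(rows):
    """Fraction of missing weights within each group."""
    totals, missing = {}, {}
    for group, weight in rows:
        totals[group] = totals.get(group, 0) + 1
        missing[group] = missing.get(group, 0) + (weight is None)
    return {g: missing[g] / totals[g] for g in totals}


# MCAR-like: same chance of a missing weight in both groups.
mcar = simulate(10_000, {"A": 0.2, "B": 0.2})
# MAR-like: group B is far more likely to have a missing weight.
mar = simulate(10_000, {"A": 0.05, "B": 0.4})

print("MCAR:", missing_rate_by_group(mcar))
print("MAR: ", missing_rate_by_group(mar))
```

In the MCAR-like run the two groups should show roughly the same missing rate, so there is nothing in the observed data to explain the gaps. In the MAR-like run the rate for group B is much higher, which is the kind of pattern that points to MAR even though the data alone doesn't say why group B left weights blank.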