What are some reasons that you might return to a previous step in the data science process?



In the context of this exercise, which covers the data science process, what are some reasons you might return to a previous step as you are working through the process?


While working through the data science process, data scientists commonly have to return to a previous step or repeat some steps several times. Sometimes the entire process has to be redone from the beginning.

There can be many reasons for this.

Most often, an error or a new insight at one step requires going back to an earlier one. For instance, when determining the necessary data, the team might realize that the initial question is not feasible given the data it would require. They would then have to go back and reword or change the question.

Or, while cleaning and organizing the data, they might realize that the data is too sparse to give good insights. This would require them to go back and redo the data collection step.
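As a rough illustration of how a team might notice sparse data during cleaning, here is a minimal Python sketch. The function names (`missing_fraction`, `too_sparse`), the sample `survey` rows, and the 50% threshold are all made up for this example; what counts as "too sparse" depends on the project.

```python
def missing_fraction(rows, columns):
    """Return the fraction of missing (None) values for each column."""
    total = len(rows)
    if total == 0:
        # With no rows at all, treat every column as fully missing.
        return {col: 1.0 for col in columns}
    return {
        col: sum(1 for row in rows if row.get(col) is None) / total
        for col in columns
    }


def too_sparse(rows, columns, threshold=0.5):
    """List the columns whose missing fraction exceeds the (assumed) threshold."""
    fractions = missing_fraction(rows, columns)
    return [col for col in columns if fractions[col] > threshold]


# Hypothetical survey data: None marks a value that was never collected.
survey = [
    {"age": 34, "income": None},
    {"age": None, "income": None},
    {"age": 29, "income": 52000},
    {"age": 41, "income": None},
]

sparse_columns = too_sparse(survey, ["age", "income"])
print(sparse_columns)  # ['income']
```

If a column that the question depends on shows up in this list, that could be a signal to return to the data collection step before continuing.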

In addition, when working for a company, steps in the process might need to be repeated because of costs, time constraints, the company's vision, and other factors that may or may not be within the scientists' control.