Are there guidelines to splitting a table as provided in the example?



In the context of this exercise, are there any guidelines and reasons for splitting the table as shown in the example?


Yes, although there might not necessarily be a one-fits-all solution to every possible database structure, there are some guidelines that can help you when you need to split a table into multiple tables.

A concept that most programmers use to refer to restructuring a database in this manner is “Database Normalization”. A very brief explanation of this concept is that it aims to accomplish that databases tables are structured so that they avoid data redundancy and keep the data accurate and consistent. “Data redundancy” is when we have a lot of repeated, or redundant, data.

In the example given in this exercise, it explains how we might split a table into multiple tables, which is essentially doing normalization. Initially, the table has many columns, and as a result, some of the values will most likely be repeated many times, which introduces data redundancy.

Say for example, we added 1 million rows to this table. Some values like customer_address might end up being stored many thousands of times, when we really only need to store it once per customer.

To avoid this issue, we would split the table so that the information is stored more concisely. If we need the customer information, we can obtain it from the customers table, by their customer_id. And, if we need the information for a subscription, all of its information is stored in the subscriptions table. We only need the customer_id and subscription_id in the orders table, and we can obtain their information from their respective tables.