Question
When should I create a Pandas dataframe using a dictionary or a list?
Answer
You can create Pandas dataframes using either a dictionary or a list of lists, but depending on several factors, using one can be preferred over the other.
Using dictionaries can be much quicker to write, since you include the column names as the keys and the values of each column as a list under its key. However, a disadvantage in older setups (pandas before 0.23, or Python before 3.6) is that the columns will not preserve the order that you entered them, and will default to alphabetical ordering instead. On current versions the columns follow the order of the dictionary keys. This is worth keeping in mind if the column order is important and your code may run on an older version.
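As a rough illustration of the dictionary approach, here is a minimal sketch. The column names and values are made up for the example and are not from the original exercise.

```python
import pandas as pd

# Hypothetical sample data: each key is a column name,
# each value is the list of entries for that column.
data = {
    "name": ["Ana", "Ben", "Cy"],
    "age": [31, 25, 47],
    "city": ["Lima", "Oslo", "Pune"],
}

df_from_dict = pd.DataFrame(data)
print(df_from_dict)
```

Every list needs the same length, otherwise pandas raises a ValueError.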
Using a list of lists allows you to enter each row of data one at a time as a separate list, but it may take longer to write than using a dictionary, since the column names must be supplied as a separate list through the columns argument. However, a list of lists allows you to order the column names explicitly, which can be very important.
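The list-of-lists approach might look something like this. Again, the rows and column names are invented for illustration.

```python
import pandas as pd

# Hypothetical sample data: each inner list is one row.
rows = [
    ["Ana", 31, "Lima"],
    ["Ben", 25, "Oslo"],
    ["Cy", 47, "Pune"],
]

# Column names are passed separately, in the order you want them.
df_from_rows = pd.DataFrame(rows, columns=["name", "age", "city"])
print(df_from_rows)
```

If you leave out the columns argument, pandas labels the columns 0, 1, 2 and so on.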
Thanks! Is there a way to use dictionaries and order the columns afterwards?
Basically, create a reordered list that includes all the column names, and then use that list to index your df, which gives you a reordered copy.
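A small sketch of that idea, using a made-up DataFrame (the names df, new_order and df_reordered are just for this example):

```python
import pandas as pd

# Hypothetical DataFrame built from a dictionary.
df = pd.DataFrame({
    "age": [31, 25],
    "city": ["Lima", "Oslo"],
    "name": ["Ana", "Ben"],
})

# List every column name in the order you want.
new_order = ["name", "age", "city"]

# Indexing with a list of column names returns a new DataFrame
# with the columns in that order; the original df is unchanged.
df_reordered = df[new_order]
print(df_reordered)
```

Any column you leave out of the list is dropped from the result, so make sure the list includes all of them.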
also this: Can we select columns of a dataframe in any order?
I was just thinking that…it’s so easy and saves time as you people have already pasted the links at the bottom of the exercises. Codecademy is the best…there’s no doubt about that. You are amazing @jephos249.
Each question was answered based on a concept!
For example, I came here from a link that explained the concept of using Pandas.
So even when I understand the concept, I always click to check out the “Community question”.
I think that by now (since Python 3.7) the ordering advantage of lists has come to an end, because dictionary keys are ordered now too.
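A quick way to check this yourself, assuming Python 3.7+ and a reasonably recent pandas (0.23 or later); the keys below are deliberately not in alphabetical order and are just sample names:

```python
import pandas as pd

# Keys entered in a non-alphabetical order on purpose.
scores = {
    "zebra": [1, 2],
    "apple": [3, 4],
    "mango": [5, 6],
}

df_ordered = pd.DataFrame(scores)

# On the versions assumed above, the columns follow the
# insertion order of the dictionary keys.
print(list(df_ordered.columns))
```

On those versions the column list matches the order the keys were written in, not alphabetical order.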
For now, both methods are valid and can be chosen based on the specific requirements of the data and the operations to perform. Basically, I use a dictionary when I want to create a DataFrame with named columns directly and want the flexibility to modify the DataFrame structure easily. But if I am dealing with simple sequential data and just want better performance and memory efficiency, a list would be more convenient.