hi guys!
I have a question. I know that csv.DictReader() from the csv module turns a CSV file into dictionaries so it’s easier to work with in code, but I don’t get what this dictionary looks like exactly. At first I thought it used the titles in the first line as keys and then unzipped the rest of the file into those keys based on their index. But now I see it used like this, which doesn’t make sense to me:
import csv
with open("cool_csv.csv") as cool_csv_file:
    cool_csv_dict = csv.DictReader(cool_csv_file)
    for i in cool_csv_dict:
        print(i["Cool Fact"])
If it worked the way I described, wouldn’t I be iterating through the keys? But it iterates through more items than I expected, so what does the dictionary actually look like?
What’s the output of your code? Is “Cool Fact” the key/column name? DictReader creates a dictionary object for each row and maps each row value to a column name (key).
The “i” iterates through the rows. Each “i” is a dictionary holding the key:value pairs for one row, so i["Cool Fact"] gives that row’s value in the “Cool Fact” column.
Wait, like a nested dict? Idk if that term applies to dicts, but like one dict inside another, or a dict of lists (one per line/row) with another list inside? With keys such as cool fact, names, etc.? Wait, going that way it can only be a nested dict, since lists don’t use the key:value format.
is that what it is?
Sort of, but not a dict nested inside another dict. Follow @lisalisaj’s link to the documentation for more information, but csv.DictReader() returns a reader object (specifically, a csv.DictReader object). That object is iterable. Each of its elements is a dictionary containing the data from a single row of the csv where the field names are the keys. If you needed the reader object to be a list of dictionaries, you could convert it using the list() constructor.
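To make that concrete, here is a small sketch. It uses an in-memory io.StringIO in place of cool_csv.csv so it runs anywhere; the “Name” column and the two rows are made up for illustration, only “Cool Fact” comes from the code above.

```python
import csv
import io

# Hypothetical stand-in for cool_csv.csv (the "Name" column and rows are invented).
sample = io.StringIO("Name,Cool Fact\nOctopus,Has three hearts\nHoney,Never spoils\n")

reader = csv.DictReader(sample)
rows = list(reader)  # consume the reader into a plain list of dictionaries

print(type(reader).__name__)  # the reader is its own class, not a dict or list
print(reader.fieldnames)      # the keys, taken from the first line
for row in rows:
    print(dict(row))          # one dictionary per data row
```

So the picture is “a sequence of small dictionaries, one per row”, not one big dictionary.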
I suppose that is one way to visualize it, but remember that the csv.DictReader object is not a dictionary. It is not a list either. It is its own class with its own properties and methods. You could convert it to a list or dictionary if you chose to.
After a bit more research, it seems that it is an iterator rather than a container (every iterator is also iterable, but it can only be walked through once). It looks like it reads a line at a time and returns a dictionary. So, I guess the csv.DictReader object itself isn’t a data container with a collection of dictionaries in store, but rather a means to return each line of the csv as a dictionary when called upon. Thanks, @mtf for clarifying.
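A quick way to check the iterator behaviour for yourself, again with made-up sample data in an io.StringIO rather than a real file:

```python
import csv
import io

# Hypothetical sample data, invented for this demonstration.
sample = io.StringIO("Name,Cool Fact\nOctopus,Has three hearts\nHoney,Never spoils\n")
reader = csv.DictReader(sample)

is_own_iterator = iter(reader) is reader  # an iterator returns itself from iter()
first = next(reader)       # reads the header line, then the first data row
remaining = list(reader)   # only the rows that have not been read yet
leftover = list(reader)    # nothing is left once the reader is exhausted

print(is_own_iterator)
print(dict(first))
print(len(remaining), leftover)
```

Nothing is stored inside the reader: once a row has been handed out, the only way to see it again is to keep your own copy or reopen the file.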
Any time. The proviso is that the CSV has a header row derived from the column headings in the spreadsheet. The class reads the first row and caches it as the key names in each generated dictionary. When it reads in subsequent rows it pairs up the data with the cached keys.
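The “cache the header, then pair it with each row” idea can be imitated by hand. This is only a simplified sketch, not the actual implementation of csv.DictReader; the helper name dict_rows and the sample data are hypothetical.

```python
import csv
import io

def dict_rows(lines):
    """Simplified imitation of csv.DictReader, for illustration only."""
    plain_reader = csv.reader(lines)
    header = next(plain_reader, None)    # first row is cached as the key names
    if header is None:                   # empty input: nothing to yield
        return
    for values in plain_reader:          # every later row is a list of strings
        yield dict(zip(header, values))  # pair keys and values by position

# Hypothetical sample data, invented for this demonstration.
sample = io.StringIO("Name,Cool Fact\nOctopus,Has three hearts\nHoney,Never spoils\n")
manual = list(dict_rows(sample))

sample.seek(0)  # rewind so the real class reads the same text
builtin = [dict(row) for row in csv.DictReader(sample)]

print(manual == builtin)
```

The real class does more than this sketch, for example it skips blank rows and has restkey and restval parameters for rows that are longer or shorter than the header.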
When we consider a CSV file there is nothing to know except that the entire file is plain text, apart from a few special characters: the comma that separates fields, the newline that separates rows, and the double quote that wraps a field containing either of those. Everything else in a row is data. Kind of moot, but worth keeping in mind.
I’ll have to set up a test set, but since a CSV is textual data, every value comes in as a string, so we would need to convert before we can do any math operations.
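A small test along those lines might look like this. The column names item, quantity and unit_price and the numbers are assumptions made up for the example.

```python
import csv
import io

# Hypothetical sales data; column names and values are invented.
sample = io.StringIO("item,quantity,unit_price\nwidget,3,2.50\ngadget,2,10.00\n")

total = 0.0
value_types = set()
for row in csv.DictReader(sample):
    value_types.add(type(row["quantity"]).__name__)  # every value arrives as str
    quantity = int(row["quantity"])                   # convert before doing math
    unit_price = float(row["unit_price"])
    total += quantity * unit_price

print(value_types)
print(total)  # 27.5
```

Without the int() and float() calls, row["quantity"] * row["unit_price"] would raise a TypeError, because both operands are strings.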
Aside
One suspects that we could have two streams open in with statements, one for reading, one for writing, and do it all seamlessly without declaring any data structures, or minimal ones at least, that would be volatile/reusable. Iterators are designed for optimized code in transient scenarios, the way I see them.
As you will already know, files are volatile and there is strict protocol for even appending them, let alone changing anything in them. Lose a few files and this becomes burned in the memory. Maybe it is my feebleness and fear that has kept me from mastering external files, not just the fact I don’t work in that environment.
If one file is opened for reading, it should only be for reading. When the rows are read in and processed into new data rows, they get sent out to a new file. Is this what they would call a CRON job? The running sales report is read in each day, and then updated with the day’s sales information and written out to a new file that will be called up the next day to follow up on the process.
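Here is a rough sketch of that read-one-file, write-a-new-file pattern. The function name add_totals, the file names, and the columns are all hypothetical, and the files live in a temporary folder so nothing real gets touched.

```python
import csv
import os
import tempfile

def add_totals(src_path, dst_path):
    """Stream rows from src_path to dst_path, adding a 'total' column."""
    with open(src_path, newline="") as infile, \
         open(dst_path, "w", newline="") as outfile:
        reader = csv.DictReader(infile)
        writer = csv.DictWriter(outfile, fieldnames=reader.fieldnames + ["total"])
        writer.writeheader()
        for row in reader:  # only one row is held in memory at a time
            row["total"] = int(row["quantity"]) * float(row["unit_price"])
            writer.writerow(row)

with tempfile.TemporaryDirectory() as folder:
    src = os.path.join(folder, "sales.csv")               # hypothetical input
    dst = os.path.join(folder, "sales_with_totals.csv")   # hypothetical output
    with open(src, "w", newline="") as f:
        f.write("item,quantity,unit_price\nwidget,3,2.50\ngadget,2,10.00\n")
    add_totals(src, dst)
    with open(dst, newline="") as f:
        result = [dict(row) for row in csv.DictReader(f)]

print(result)
```

The source file is only ever opened for reading, and the rows are never collected into a list on the way through; the list at the end exists only to show what was written.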
OK, so let’s see if I got this right: this code gets the first line and creates a dict out of it with unset values. Then every time it has to work with the values, it gets a line, splits the items on the delimiter, and assigns them to the preset keys based on their index. And then it returns the value?
is that right? idk it looks somewhat unstable…
Above I was more speculating how the class (csv.DictReader) might parse a CSV generated by Excel or any spreadsheet, for that matter. The fieldnames are expected to be the first row of the file, as I recall (from twenty years ago).
The writer has fieldnames specified in the build process:
fieldnames = ['first_name', 'last_name']
The reader looks to have a fieldnames parameter but I’m not sure at this time how or what sequence we populate it with if we haven’t read the file. This is the confusing part for me. I’ve no problem connecting the dots between the fieldnames and the data rows:
Each row read from the csv file is returned as a list of strings.
Each string is mapped to the fieldnames in the order in which they are written.
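On the reader’s fieldnames parameter, a short experiment suggests how it behaves: if it is left out, the first row of the file supplies the keys; if it is passed in, the first row is treated as data. The names and rows below are made up, with fieldnames matching the writer example above.

```python
import csv
import io

# A file WITHOUT a header row: we supply the keys ourselves.
headerless = io.StringIO("Ada,Lovelace\nAlan,Turing\n")
reader = csv.DictReader(headerless, fieldnames=["first_name", "last_name"])
people = [dict(row) for row in reader]

# A file WITH a header row: fieldnames is omitted and read from the first line.
with_header = io.StringIO("first_name,last_name\nAda,Lovelace\n")
inferred = csv.DictReader(with_header)
inferred_names = inferred.fieldnames
inferred_rows = [dict(row) for row in inferred]

print(people)
print(inferred_names, inferred_rows)
```

So the parameter is mainly useful for files that have no header line, or when you want different key names than the ones in the file (in which case the header line would need to be skipped separately).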
If you have time to explore this more extensively, then take this opportunity to do so and show us what you’ve come up with. One could spend a week on this topic given its importance in the overall file picture. One expects this will be first hand knowledge if working with CSV on a regular basis.
(not exactly like this, i wrote that for visualization. like gets first line as keys, for each line makes a dictionary, set those dictionaries inside another dictionary, then does whatever we asked?)
Okay, that would indicate that this unit is outside of your current knowledge scope. I suggest pausing to go through the unit on Python iterators and how they work. It would be pointless for me to try to teach that to you here. Also, go through the unit on Python exceptions and exception handling. One really needs to have proficiency in both concepts to properly understand and implement the csv.DictReader class. Sorry, I have to step out…
If you are a Pro subscriber you can find the unit in the Learn Python 3 track. I recommend doing that track from start to finish, and put your learning path on hold (assuming you have a learning path, my bad). One should really have a complete refresher of Python before engaging in the CS or DS or any advanced path. It can only prove as time well spent. Expect it to take a couple of weeks, but I’m sure it will save you a lot more than that when you return to the career path offerings.